Pig >> mail # user >> Cross Product of Two Tuples?

Cross Product of Two Tuples?
Hi Folks,
I'm currently trying to do something I figured would be trivial, but
actually wound up being a bit of work for me, so I'm wondering if I'm
missing something. All I want to do is get a cross product of two
tuples. So for example, given an input of:

('hello', 'howdy', 'hi'), ('hola', 'bonjour')

I'd get:

('hello', 'hola')
('hello', 'bonjour')
('howdy', 'hola')
('howdy', 'bonjour')
('hi', 'hola')
('hi', 'bonjour')

At first, I figured I could FLATTEN(TOBAG(tuple1, tuple2)), but that's
no good because the tuples themselves are first put into new tuples. So
what I'm left with now is writing a dirty and slow Python UDF for this.
Is there really no better way to do this? I'd think it would be a pretty
standard task.
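
For reference, the kind of Python UDF I have in mind would look roughly like the sketch below. This is only a minimal version under some assumptions: the function name cross_product and the schema string are made up for illustration, and the outputSchema decorator is the one Pig supplies when a script is registered as a Jython UDF (the fallback stub is there only so the file also runs in plain Python).

```python
# Fallback so the script also runs outside Pig; when registered as a
# Jython UDF, Pig is expected to provide outputSchema itself.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            return func
        return wrap


# Hypothetical schema: a bag of two-field tuples of chararrays.
@outputSchema("pairs:bag{pair:tuple(left:chararray, right:chararray)}")
def cross_product(t1, t2):
    """Return every (a, b) pair with a taken from t1 and b from t2."""
    # Pig passes nulls as None; treat a missing tuple as producing no pairs.
    if t1 is None or t2 is None:
        return []
    # Nested comprehension instead of itertools.product, which older
    # Jython releases (2.5) do not have.
    return [(a, b) for a in t1 for b in t2]


if __name__ == "__main__":
    for pair in cross_product(("hello", "howdy", "hi"), ("hola", "bonjour")):
        print(pair)
```

On the Pig side this would presumably be registered with something like REGISTER 'cross.py' USING jython AS myudfs; (file and namespace names are placeholders) and the returned bag then passed through FLATTEN to get one pair per row.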