Pig, mail # user - Tuple Cross Products?

Eli Finkelshteyn 2013-03-04, 22:56
Say I have a relation of the form:
posts: {post_id: chararray,tags: (),user_id: chararray}
Each row has one post_id, one user_id, and a tuple of chararray tags. I want the cross product of the tags with the other fields in each row, so if I had a row such as:
user_id, (1,2,3), post_id
I'd get back:
user_id, 1, post_id
user_id, 2, post_id
user_id, 3, post_id.
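
To pin down the semantics, here is a minimal sketch in plain Python (not Pig) of the expansion I'm after. The function name `explode_tags` and the sample values are made up for illustration, and the field order follows the example row above rather than the schema.

```python
def explode_tags(row):
    """Yield one (user_id, tag, post_id) row per tag in the tags tuple."""
    user_id, tags, post_id = row
    for tag in tags:
        # Pair each tag with the row's other fields.
        yield (user_id, tag, post_id)


# Hypothetical sample row mirroring the example above.
rows = list(explode_tags(("user_id", (1, 2, 3), "post_id")))
for r in rows:
    print(r)
```

A row with an empty tags tuple produces no output rows in this sketch, which is also what flattening an empty bag would do.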

Basically, this is the behavior one would expect if tags were a bag of tuples instead of a single tuple. In earlier versions of Pig I could accomplish this with FLATTEN(TOBAG(FLATTEN(tags))), but when I do that now, I get:

[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve FLATTEN using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

Is there some new, preferred way to accomplish this?