Pig >> mail # user >> Tuple Cross Products?


Tuple Cross Products?
Say I have a relation of the form:
posts: {post_id: chararray,tags: (),user_id: chararray}
Each row has one post_id, one user_id, and a tuple of chararray tags. I want the cross product of the tags with the other fields in each row, so given a row such as:
user_id, (1,2,3), post_id
I'd get back:
user_id, 1, post_id
user_id, 2, post_id
user_id, 3, post_id
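
To make the desired behavior concrete, here is a minimal sketch in plain Python (not Pig Latin). It only illustrates the row expansion I'm after; the function name expand_tags and the sample row are made up for illustration, and it assumes each row is laid out as (user_id, tags, post_id) as in the example above.

```python
def expand_tags(rows):
    """Yield one (user_id, tag, post_id) row for every tag in each input row.

    Assumes each input row is (user_id, tags, post_id), where tags is a tuple.
    """
    for user_id, tags, post_id in rows:
        for tag in tags:
            yield (user_id, tag, post_id)


# Hypothetical sample row mirroring the example above.
rows = [("user_id", ("1", "2", "3"), "post_id")]

for row in expand_tags(rows):
    print(row)
```

In this sketch a row whose tags tuple is empty produces no output rows at all.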

Basically, this is the behavior one would expect if tags were a bag of tuples instead of a single tuple. In earlier versions of Pig I could accomplish this with FLATTEN(TOBAG(FLATTEN(tags))), but when I try that now, I get:

[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve FLATTEN using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

Is there some new, preferred way to accomplish this?

Eli