A question regarding schema
Hi All,
I have a UDF that returns a tuple. The number of elements in the tuple will differ for each user. For example:
(userid1, item1, item2, item100, item400)
(userid1, item1, item200)
(userid1, item1, item2, item100, item200, item250, item300, item400)
(userid1, item100, item200, item250, item300, item380, item400, item450, item480, item560, item800, item1000)
Pig script:
A = LOAD '/scratch/input.seq' USING $SEQFILE_LOADER ( '-c $TEXT_CONVERTER', '-c $TEXT_CONVERTER') AS (key: chararray, value: chararray);
UserItemAssoc = FOREACH A GENERATE myparser.myUDF(key, value) AS {(userid: chararray, itemtid: How to specify this???)};
If I want to specify the schema in the AS clause, how do I do it since the number of fields will differ in each row? Is it possible to somehow do this dynamically?
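One possible direction, offered only as a sketch: a schema in an AS clause is fixed when the script is written, so a different field count per row cannot be declared directly. A common workaround is to have the UDF return a fixed two-field tuple, the user id plus a bag holding the variable number of items. The lines below assume myparser.myUDF has been changed to return (userid, {(itemid), ...}); that change, and the names items, t, itemid, and UserItems, are hypothetical and not part of the script above.

```
-- Assumes myUDF now returns a tuple of (userid, bag of single-field item tuples).
-- FLATTEN on a tuple promotes its fields to top-level columns.
UserItemAssoc = FOREACH A GENERATE FLATTEN(myparser.myUDF(key, value))
                AS (userid: chararray, items: {t: (itemid: chararray)});

-- Optional: FLATTEN on the bag yields one (userid, itemid) row per item.
UserItems = FOREACH UserItemAssoc GENERATE userid, FLATTEN(items) AS itemid;
```

With this layout the declared schema always has two fields, and the number of items per user varies only inside the bag.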
centerqi hu 2013-12-31, 09:45