Pig >> mail # user >> pig 0.8.1 - Iterating contents of a Bag


Re: pig 0.8.1 - Iterating contents of a Bag
Thanks for the quick response.
However, I do not want to flatten because I plan to invoke a previously written UDF, which accepts a chararray, on each value in the Bag.

I am not sure whether this is possible at all with 0.8.1, but I thought I would seek the views of the experts on this mailing list.
Regards,
Amit

 From: Serega Sheypak <[EMAIL PROTECTED]>

To: [EMAIL PROTECTED]; Amit <[EMAIL PROTECTED]>
Sent: Tuesday, July 23, 2013 4:23 PM
Subject: Re: pig 0.8.1 - Iterating contents of a Bag
 
Hi, I'm new to Pig, but I will try to help you.
B = FOREACH A {
    GENERATE FLATTEN(keywords.keyword) as keyword;
};
OR
B = FOREACH A {
    GENERATE FLATTEN(keywords.keyword) as (keyword);
};
You need to flatten the bag.
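Flattening need not get in the way of a UDF that takes a chararray: once the bag is flattened, each keyword sits in its own row as a plain chararray, so a following FOREACH can call the UDF once per keyword. A minimal sketch, assuming a hypothetical jar `myudfs.jar` and a hypothetical UDF class `com.example.ProcessKeyword` standing in for the existing UDF:

```pig
-- Hypothetical jar and class names; substitute the existing chararray UDF.
REGISTER myudfs.jar;
DEFINE ProcessKeyword com.example.ProcessKeyword();

-- One output row per keyword; key is repeated on every row.
B = FOREACH A GENERATE key, FLATTEN(keywords.keyword) AS keyword;

-- keyword is now a plain chararray, so the UDF runs once per keyword.
C = FOREACH B GENERATE key, ProcessKeyword(keyword) AS result;

-- Optional: regroup by key if the results are needed as a bag again.
D = GROUP C BY key;
```

Carrying `key` through the FLATTEN is what makes the optional regrouping step possible afterwards.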
2013/7/24 Amit <[EMAIL PROTECTED]>

Hello there,
>I am loading data in the form of
>
>A1: {key: chararray,keywords: {keywords_tuple: (keyword: chararray)}}
>
>I believe the sample data would look like the following:
>
>(1, {('amit'),('yahoo'),('pig')})
>
>I am trying to write a FOREACH that loops through each keyword in the bag.
>
>I tried writing this, but it does not dump the output the way I want to see it:
>
>
>B = FOREACH A {
>    GENERATE keywords.keyword;
>};
>
>I would like to see
>
>('amit')
>('yahoo')
>('pig')
>
>Instead it prints the entire bag at once like the one below.
>
>{('amit'),('yahoo'),('pig')}
>
>
>
>Please note that I do not want to flatten the bag, because I want to process each keyword in the bag using a UDF later on.
>
>I would appreciate any input.
>
>Regards,
>Amit   
>