Pig, mail # user - pig 0.8.1 - Iterating contents of a Bag


Re: pig 0.8.1 - Iterating contents of a Bag
Shahab Yunus 2013-07-23, 21:58
Amit, have you looked into the TOBAG and TOTUPLE built-in UDFs? Are they not
helpful?

Regards,
Shahab
On Tue, Jul 23, 2013 at 5:46 PM, Amit <[EMAIL PROTECTED]> wrote:

> Hello,
> Based on your suggestion, I worked my way around this using FLATTEN.
>
> This is what I am doing now:
> B = FOREACH A GENERATE key,FLATTEN(keywords);
> C = FOREACH B GENERATE myUDF(keywords::keyword);
>
>
> Relation B gives me the following records to work with:
> 1,amit
> 1,yahoo
> 1,pig
>
> However, I believe it would be much more helpful if we could iterate over
> the tuples in the bag without flattening.
>
> Thank you for your help though.
>
>
> Regards,
> Amit
>
>
>
> ________________________________
>  From: Amit <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Sent: Tuesday, July 23, 2013 4:25 PM
> Subject: Re: pig 0.8.1 - Iterating contents of a Bag
>
>
> Thanks for the quick response.
> However, I do not want to flatten, because I plan to invoke a previously
> written UDF, which accepts a chararray, on each value in the bag.
>
> I am not sure whether this is possible at all with 0.8.1, but I thought I
> would seek the views of the experts on this mailing list.
>
>
> Regards,
> Amit
>
>  From: Serega Sheypak <[EMAIL PROTECTED]>
>
> To: [EMAIL PROTECTED]; Amit <[EMAIL PROTECTED]>
> Sent: Tuesday, July 23, 2013 4:23 PM
> Subject: Re: pig 0.8.1 - Iterating contents of a Bag
>
>
>
> Hi, I'm new to Pig, but I will try to help you.
> B = FOREACH A {
>     GENERATE FLATTEN(keywords.keyword) as keyword;
> };
>
>
> OR
> B = FOREACH A {
>     GENERATE FLATTEN(keywords.keyword) as (keyword);
> };
>
>
> You need to flatten the bag.
>
>
>
>
> 2013/7/24 Amit <[EMAIL PROTECTED]>
>
> Hello there,
> >I am loading data in the form of
> >
> >A1: {key: chararray,keywords: {keywords_tuple: (keyword: chararray)}}
> >
> >I believe the sample data would look like the following:
> >
> >(1, {('amit'),('yahoo'),('pig')})
> >
> >I am trying to write a FOREACH where I can loop through each keyword
> >in the bag.
> >
> >I tried writing this, but it does not dump the output the way I want
> >to see it:
> >
> >
> >B = FOREACH A {
> >    GENERATE keywords.keyword;
> >};
> >
> >I would like to see
> >
> >('amit')
> >('yahoo')
> >('pig')
> >
> >Instead it prints the entire bag at once like the one below.
> >
> >{('amit'),('yahoo'),('pig')}
> >
> >
> >
> >Please note that I do not want to flatten the bag, as I want to process
> >each keyword in the bag using a UDF later on.
> >
> >I would appreciate any input.
> >
> >Regards,
> >Amit
> >
>
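
The thread leaves open how to process each keyword without flattening. One possible approach, sketched below, is a UDF that accepts the whole bag and returns a new bag, so the per-keyword loop happens inside the UDF. This is only a sketch under assumptions: it assumes the Pig build in use supports scripting UDFs registered with Jython, and the file name keyword_udfs.py, the alias kw, and the functions process_keyword and process_keywords are hypothetical names invented for illustration. The uppercase transform is a placeholder for whatever the existing chararray UDF does; reusing an existing Java UDF this way would instead need a Java wrapper that accepts a bag.

```python
# keyword_udfs.py (hypothetical file name)
#
# Assumed usage from a Pig script (not taken from the thread):
#   REGISTER 'keyword_udfs.py' USING jython AS kw;
#   B = FOREACH A GENERATE key, kw.process_keywords(keywords);

try:
    outputSchema  # expected to be provided by Pig when run under Jython
except NameError:
    def outputSchema(schema):
        """Stand-in decorator so this file also runs outside Pig."""
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


def process_keyword(keyword):
    # Placeholder for the real per-keyword logic (an assumption).
    return keyword.upper()


@outputSchema("processed:{t:(keyword:chararray)}")
def process_keywords(keywords):
    # A bag reaches a Jython UDF as a list of tuples; returning a list
    # of tuples produces a bag, so the input is never flattened.
    if keywords is None:
        return None
    return [(process_keyword(t[0]),) for t in keywords]
```

With this shape, each input record keeps its key alongside a bag of processed keywords, which is the per-record grouping that FLATTEN discards.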