Pig, mail # user - how to operate a map type


Re: how to operate a map type
Jameson Li 2011-05-24, 02:07
And how can I filter on a map key or a map value? Is a UDF the only way for that too?

b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
c = filter b by m.key == 'aaa' or m.value> 0.2;

How could I write this code?
Is there any other way that does not involve writing a UDF?

I also wonder: if writing a UDF is the only way to operate on a map type, why
are there no built-in functions for the map type?

Thanks.
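
For reference, a minimal plain-Java sketch of the check such a filter could perform on one map: keep the row if the map contains a given key or if any value exceeds a threshold. The class name MapEntryFilter and the method matches are hypothetical and not part of Pig. In a real Pig filter UDF the same logic would presumably sit in the exec method of a class extending FilterFunc; that wrapper is left out so the sketch runs without the Pig jar. It also assumes string keys and numeric values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Pig: the core check a filter UDF could run on one map.
final class MapEntryFilter {
    // True if the map contains wantedKey, or if any value is strictly greater than threshold.
    static boolean matches(Map<String, ? extends Number> m, String wantedKey, double threshold) {
        if (m == null) {
            return false;
        }
        if (m.containsKey(wantedKey)) {
            return true;
        }
        for (Number v : m.values()) {
            if (v != null && v.doubleValue() > threshold) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // One row taken from the sample output quoted in this thread.
        Map<String, Double> row = new LinkedHashMap<>();
        row.put("193", 0.16985779);
        row.put("81", 0.10994483);
        System.out.println(matches(row, "aaa", 0.2)); // false: no key 'aaa', no value above 0.2
        System.out.println(matches(row, "81", 0.2));  // true: key '81' is present
    }
}
```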

2011/5/24 Daniel Dai <[EMAIL PROTECTED]>

> I cannot think of a way without writing a UDF. You can write two UDFs:
> * GetKey, input a map, output the key of the map
> * GetValues, input a bag of map, output a bag of map values
>
> The script is like:
> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
> c = foreach b generate GetKey(m) as key, m;
> d = group c by key;
> e = foreach d generate group, SUM(GetValues(c.m));
>
>
> Daniel
>
>
> On 05/23/2011 07:06 AM, Jameson Li wrote:
>
>> Hi all,
>>
>> I have the below pig code:
>>
>> register /home/uu/project/lib/pigudfs.jar
>> ruls = load 'testurl' as (url:chararray);
>>
>> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1);
>>
>> When I dump b, it returns:
>> ([4#0.1677963])
>> ([193#0.16985779,81#0.10994483])
>> ([418#0.14138427,9#0.1107544,282#0.18699136])
>>
>> I just want to group by the map key and sum the map values, something like:
>> c = group b by $0#key;
>> d = foreach c generate group,SUM(b.$0#value);
>>
>> How could I write the code?
>>
>> Thanks,
>> Jameson Li.
>>
>
>
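
The result asked for in the original question can be sketched in plain Java as follows. Note that the sample rows contain maps with more than one key, so this sketch flattens every map into (key, value) pairs before grouping, rather than taking a single key per map as the GetKey suggestion does. The class name MapKeySum and the method sumByKey are hypothetical and not part of Pig; the sketch only illustrates the logic that UDFs (or a flatten, group and SUM pipeline) would need to reproduce, assuming string keys and double values.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Pig: group all map entries by key and sum their values.
final class MapKeySum {
    static Map<String, Double> sumByKey(List<Map<String, Double>> rows) {
        Map<String, Double> totals = new TreeMap<>();
        for (Map<String, Double> row : rows) {
            for (Map.Entry<String, Double> e : row.entrySet()) {
                // Add the value to the running total for this key.
                totals.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Two rows from the sample output in the thread; each key occurs once there,
        // so the totals equal the individual values.
        Map<String, Double> r1 = new LinkedHashMap<>();
        r1.put("4", 0.1677963);
        Map<String, Double> r2 = new LinkedHashMap<>();
        r2.put("193", 0.16985779);
        r2.put("81", 0.10994483);
        System.out.println(sumByKey(Arrays.asList(r1, r2)));
    }
}
```

A key that appears in several rows would have its values added together, which corresponds to the SUM over each group in the script above.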