Pig, mail # user - how to operate a map type


Re: how to operate a map type
Jameson Li 2011-05-24, 10:05
OK, I understand: the answer is to write UDFs, so I will write them.
I still think the grammar should support map operations on both static
and dynamic keys.

Thanks.

2011/5/24 Daniel Dai <[EMAIL PROTECTED]>

> GetKey(m) already returns the key, so you can filter on it. For the value,
> you may need to put the check into a UDF.
>
> Grammar support for maps is based on static keys, e.g. m#'key1'. Your use
> case mostly deals with dynamic keys, which you currently have to handle yourself.
>
> Daniel
>
> -----Original Message----- From: Jameson Li
> Sent: Monday, May 23, 2011 7:07 PM
> To: Daniel Dai
> Cc: [EMAIL PROTECTED]
> Subject: Re: how to operate a map type
>
>
> And how do I filter on a map key or a map value? Is that also only possible with a UDF?
>
> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
> c = filter b by m.key == 'aaa' or m.value> 0.2;
>
> How could I write the code?
> Is there any other way without writing a UDF?
>
> I also wonder: if a map type can only be operated on by writing UDFs, why
> are there no official functions for the map type?
>
> Thanks.
>
> 2011/5/24 Daniel Dai <[EMAIL PROTECTED]>
>
>  I cannot think of a way without writing a UDF. You can write two UDFs:
>> * GetKey: takes a map, outputs the key of the map
>> * GetValues: takes a bag of maps, outputs a bag of the map values
>>
>> The script would look like:
>> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
>> c = foreach b generate GetKey(m) as key, m;
>> d = group c by key;
>> e = foreach d generate group, SUM(GetValues(c.m));
>>
>>
>> Daniel
>>
>>
>> On 05/23/2011 07:06 AM, Jameson Li wrote:
>>
>>  Hi all,
>>>
>>> I have the below pig code:
>>>
>>> register /home/uu/project/lib/pigudfs.jar
>>> ruls = load 'testurl' as (url:chararray);
>>>
>>> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1);
>>>
>>> When I dump b, it returns:
>>> ([4#0.1677963])
>>> ([193#0.16985779,81#0.10994483])
>>> ([418#0.14138427,9#0.1107544,282#0.18699136])
>>>
>>> I want to group by the map key and sum the map values, something like:
>>> c = group b by $0#key;
>>> d = foreach c generate group,SUM(b.$0#value);
>>>
>>> How could I write the code?
>>>
>>> Thanks,
>>> Jameson Li.
>>>
>>>
>>
>>
>
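
For readers who end up writing the UDFs discussed in this thread, here is a minimal sketch of the core logic in plain Java. It deliberately uses only java.util collections and no Pig classes, so it is not a drop-in UDF. The class name MapOps and the methods sumByKey and filterRows are hypothetical names chosen for illustration, and the filter shows only one possible reading of the question (keep a row if it has the given key or any value above a threshold).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class: plain-Java model of the map handling
// discussed in the thread. It does not use the Pig API.
public class MapOps {

    // Walks every entry of every map and adds each value to the running
    // total for its key (the "group by key, sum the values" step).
    static Map<String, Double> sumByKey(List<Map<String, Double>> rows) {
        Map<String, Double> totals = new TreeMap<>();
        for (Map<String, Double> row : rows) {
            for (Map.Entry<String, Double> entry : row.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Double::sum);
            }
        }
        return totals;
    }

    // Keeps a row if it contains the given key OR has any value strictly
    // greater than minValue. This is an assumed interpretation of the
    // "m.key == 'aaa' or m.value > 0.2" filter in the question.
    static List<Map<String, Double>> filterRows(
            List<Map<String, Double>> rows, String key, double minValue) {
        List<Map<String, Double>> kept = new ArrayList<>();
        for (Map<String, Double> row : rows) {
            boolean hasKey = row.containsKey(key);
            boolean hasLargeValue =
                    row.values().stream().anyMatch(v -> v > minValue);
            if (hasKey || hasLargeValue) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The three rows shown by "dump b" in the original question.
        List<Map<String, Double>> rows = List.of(
                Map.of("4", 0.1677963),
                Map.of("193", 0.16985779, "81", 0.10994483),
                Map.of("418", 0.14138427, "9", 0.1107544, "282", 0.18699136));

        System.out.println(sumByKey(rows));
        System.out.println(filterRows(rows, "4", 0.18));
    }
}
```

In an actual Pig UDF, the same entry loop would presumably sit inside the function's exec method and emit (key, value) tuples, so that the script itself could group on the key and apply SUM to the values.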