Pig >> mail # user >> how to operate a map type

Re: how to operate a map type
OK, OK. I understand: the answer is to write UDFs.
So I will have to write UDFs, then.
But I still think there should be grammar support for map operations with both
static keys and dynamic keys.

Thanks.

2011/5/24 Daniel Dai <[EMAIL PROTECTED]>

> GetKey(m) already gets the key, so you can filter on the key. For the value,
> you may need to put the logic into a UDF.
>
> Grammar support for maps is based on static keys, e.g. m#'key1'. Your use case
> mostly deals with dynamic keys, which you currently have to handle yourself.
>
> Daniel
>
> -----Original Message----- From: Jameson Li
> Sent: Monday, May 23, 2011 7:07 PM
> To: Daniel Dai
> Cc: [EMAIL PROTECTED]
> Subject: Re: how to operate a map type
>
>
> And how do I filter on a map key or a map value? Is that also only possible with a UDF?
>
> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
> c = filter b by m.key == 'aaa' or m.value> 0.2;
>
> How could I write the code?
> Is there any other way without writing a UDF?
>
> I also have a question: if a map type can only be operated on by writing a
> UDF, why are there no official functions for the map type?
>
> Thanks.
>
> 2011/5/24 Daniel Dai <[EMAIL PROTECTED]>
>
>  I cannot think of a way without writing a UDF. You can write two UDFs:
>> * GetKey: takes a map as input and outputs the key of the map
>> * GetValues: takes a bag of maps as input and outputs a bag of map values
>>
>> The script is like:
>> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
>> c = foreach b generate GetKey(m) as key, m;
>> d = group c by key;
>> e = foreach d generate group, SUM(GetValues(c.m));
>>
>>
>> Daniel
>>
>>
>> On 05/23/2011 07:06 AM, Jameson Li wrote:
>>
>>  Hi all,
>>>
>>> I have the below pig code:
>>>
>>> register /home/uu/project/lib/pigudfs.jar
>>> ruls = load 'testurl' as (url:chararray);
>>>
>>> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1);
>>>
>>> When I dump b, it returns:
>>> ([4#0.1677963])
>>> ([193#0.16985779,81#0.10994483])
>>> ([418#0.14138427,9#0.1107544,282#0.18699136])
>>>
>>> I just want to group by the map key and sum the map values, something like:
>>> c = group b by $0#key;
>>> d = foreach c generate group,SUM(b.$0#value);
>>>
>>> How could I write the code?
>>>
>>> Thanks,
>>> Jameson Li.
>>>
>>>
>>
>>
>
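
For readers who end up writing the UDFs discussed above, the following is a rough sketch of the logic they would need, written as plain Java with no Pig classes so it can be run on its own. It is not a Pig UDF and does not use the Pig API. The class name MapKeySum and the methods sumByKey and filterRows are hypothetical names chosen for illustration. The filter assumes that a row matches when any one of its map entries has the wanted key or a value above the threshold, which is only one possible reading of the filter asked about in the thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only: not a Pig UDF, no Pig classes involved.
// All names here are hypothetical.
class MapKeySum {

    // Group every map entry by its key across all rows and sum the values.
    static Map<String, Double> sumByKey(List<Map<String, Double>> rows) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (Map<String, Double> row : rows) {
            for (Map.Entry<String, Double> e : row.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        return totals;
    }

    // Keep a row if ANY of its entries has the wanted key or a value above
    // minValue (assumed semantics, see the note above the code).
    static List<Map<String, Double>> filterRows(
            List<Map<String, Double>> rows, String wantedKey, double minValue) {
        List<Map<String, Double>> kept = new ArrayList<>();
        for (Map<String, Double> row : rows) {
            boolean match = false;
            for (Map.Entry<String, Double> e : row.entrySet()) {
                if (e.getKey().equals(wantedKey) || e.getValue() > minValue) {
                    match = true;
                    break;
                }
            }
            if (match) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The three rows from the original "dump b" output.
        List<Map<String, Double>> rows = List.of(
            Map.of("4", 0.1677963),
            Map.of("193", 0.16985779, "81", 0.10994483),
            Map.of("418", 0.14138427, "9", 0.1107544, "282", 0.18699136));
        System.out.println(sumByKey(rows));
        System.out.println(filterRows(rows, "aaa", 0.2));
    }
}
```

In an actual Pig UDF the same loops would operate on Pig's own map, tuple and bag types rather than java.util collections, and the grouping step would normally be left to Pig's group operator as in the script quoted above.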