Pig >> mail # user >> how to operate a map type


Re: how to operate a map type
And how do I filter on a map key or a map value? Is a UDF the only way to do that as well?

b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
c = filter b by m.key == 'aaa' or m.value> 0.2;

How could I write this code?
Is there any way to do it without writing a UDF?

I also have a question: if a map type can only be operated on by writing
a UDF, why are there no built-in functions for the map type?

Thanks.
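
For reference, the filter asked about above can be read as "keep the map entries whose key matches or whose value exceeds a threshold". Pig's built-in `#` lookup only reaches a key that is known in advance, so this is the logic a UDF would have to carry. The following is a minimal sketch in plain Python, not Pig; the function name `filter_map_entries` and the sample map are hypothetical and do not come from the thread.

```python
def filter_map_entries(m, wanted_key, min_value):
    """Keep entries whose key equals wanted_key or whose value exceeds min_value."""
    return {k: v for k, v in m.items() if k == wanted_key or v > min_value}


# Hypothetical sample map, chosen only to exercise both conditions.
sample = {"aaa": 0.05, "bbb": 0.3, "ccc": 0.1}

# "aaa" is kept because of its key, "bbb" because its value is above 0.2.
kept = filter_map_entries(sample, "aaa", 0.2)
```

A real Pig UDF would wrap the same comparison and would also have to declare its output schema, which is left out here.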

2011/5/24 Daniel Dai <[EMAIL PROTECTED]>

> I cannot think of a way without writing a UDF. You can write two UDFs:
> * GetKey: takes a map and returns the key of the map
> * GetValues: takes a bag of maps and returns a bag of the map values
>
> The script would look like:
> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1) as m;
> c = foreach b generate GetKey(m) as key, m;
> d = group c by key;
> e = foreach d generate group, SUM(GetValues(c.m));
>
>
> Daniel
>
>
> On 05/23/2011 07:06 AM, Jameson Li wrote:
>
>> Hi all,
>>
>> I have the Pig code below:
>>
>> register /home/uu/project/lib/pigudfs.jar
>> ruls = load 'testurl' as (url:chararray);
>>
>> b = foreach ruls generate com.company.pig.GetURLContent($0,3,0.1);
>>
>> Here, when I dump b, it returns:
>> ([4#0.1677963])
>> ([193#0.16985779,81#0.10994483])
>> ([418#0.14138427,9#0.1107544,282#0.18699136])
>>
>> I just want to group by the map key and sum the map values, something like:
>> c = group b by $0#key;
>> d = foreach c generate group,SUM(b.$0#value);
>>
>> How could I write the code?
>>
>> Thanks,
>> Jameson Li.
>>
>
>
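
The result the original question is after (group by map key, sum the map values) can be sketched outside Pig as follows. This is plain Python meant only to show the expected semantics on the three rows from the `dump b` output; it is not a Pig UDF, and the helper names `flatten_entries` and `sum_by_key` are assumptions made for illustration. Keys are treated as strings, since Pig map keys are chararrays.

```python
from collections import defaultdict

# Rows copied from the `dump b` output in the thread; each row is one map.
rows = [
    {"4": 0.1677963},
    {"193": 0.16985779, "81": 0.10994483},
    {"418": 0.14138427, "9": 0.1107544, "282": 0.18699136},
]


def flatten_entries(maps):
    """Yield one (key, value) pair per map entry, across all maps."""
    for m in maps:
        for key, value in m.items():
            yield key, value


def sum_by_key(maps):
    """Group the flattened pairs by key and sum the values in each group."""
    totals = defaultdict(float)
    for key, value in flatten_entries(maps):
        totals[key] += value
    return dict(totals)


totals = sum_by_key(rows)
for key, total in sorted(totals.items()):
    print(key, total)
```

Because a single map can hold several entries, the sketch first splits every map into individual pairs and only then groups them. In the sample rows every key happens to be distinct, so each sum equals the single value for that key.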