Pig >> mail # user >> how to perform GROUP BY in PIG for this case:


Re: how to perform GROUP BY in PIG for this case:
You'll need to build Pig first. Assuming you have the source, run 'ant' in
the base directory and then again in contrib/piggybank/java; the second
build produces piggybank.jar.
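
Once both builds finish, the statements from earlier in this thread fit together roughly as below. This is only a sketch under assumptions: the install path /opt/pig-0.10.0 comes from your message, the joda-time jar is assumed to sit in build/ivy/lib/Pig/ after the base build, and the input file 'rates.csv' with its comma-delimited layout is hypothetical.

```pig
-- Assumed paths (taken from the thread); adjust to your own install.
register /opt/pig-0.10.0/contrib/piggybank/java/piggybank.jar;
register /opt/pig-0.10.0/build/ivy/lib/Pig/joda-time-1.6.jar;

-- Short alias for the Piggybank UDF that truncates an ISO 8601 date to its month.
define ISOToMonth org.apache.pig.piggybank.evaluation.datetime.truncate.ISOToMonth();

-- 'rates.csv' is a hypothetical file with lines such as: 2009-08-21,CLI,33.38
data = load 'rates.csv' using PigStorage(',')
       as (Date:chararray, symbol:chararray, rate:double);

-- Group by month and keep the largest rate in each group.
answer = foreach (group data by ISOToMonth(Date)) generate group as
         month, MAX(data.rate) as max_rate;

dump answer;
```

If the dates were in some other string format, CustomFormatToISO would have to be applied to Date before ISOToMonth, as noted further down the thread.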

Russell Jurney http://datasyndrome.com

On Sep 29, 2012, at 8:19 PM, yogesh dhari <[EMAIL PROTECTED]> wrote:

>
>
> Hi russell,
>
> I am using Pig 0.10.0 and I checked the directory /opt/pig-0.10.0/contrib/piggybank/java/
>
> There are no jar files in it. :-(
>
> grunt> register /opt/pig-0.10.0/contrib/piggybank/java/piggybank.jar
>
> [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: file '/opt/pig-0.10.0/contrib/piggybank/java/piggybank.jar' does not exist.
> Details at logfile: /opt/pig-0.10.0/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/datetime/convert/pig_1348974384533.log
>
> Similarly, there is no path /opt/build/ivy/lib/Pig/.
>
> Instead, /opt/pig-0.10.0/ivy is there, but it has no /lib/Pig/.
>
> Please suggest and help.
>
> Thanks & regards
> Yogesh Kumar
>
>
>
>
>> From: [EMAIL PROTECTED]
>> Date: Sat, 29 Sep 2012 19:21:17 -0700
>> Subject: Re: how to perform GROUP BY in PIG for this case:
>> To: [EMAIL PROTECTED]
>>
>> My bad - you will need to register the Piggybank and jodatime jars. Replace
>> /me/pig with your pig install path.
>>
>> register /me/pig/contrib/piggybank/java/piggybank.jar
>> register /me/pig/build/ivy/lib/Pig/joda-time-1.6.jar
>>
>> define CustomFormatToISO
>> org.apache.pig.piggybank.evaluation.datetime.convert.CustomFormatToISO();
>>
>> define ISOToMonth
>> org.apache.pig.piggybank.evaluation.datetime.truncate.ISOToMonth();
>>
>>
>> That should take care of the error.
>>
>> This example may help:
>> https://github.com/rjurney/Collecting-Data/blob/master/src/pig/rfc1123_to_iso8601.pig
>>
>> Russell Jurney http://datasyndrome.com
>>
>> On Sep 29, 2012, at 4:33 PM, yogesh dhari <[EMAIL PROTECTED]> wrote:
>>
>>
>> Thanks Russell,
>>
>> I am new to Pig. I tried this command and got this exception:
>>
>> 2012-09-30 04:53:22,995 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>> ERROR 1070: Could not resolve ISOToMonth using imports: [,
>> org.apache.pig.builtin., org.apache.pig.impl.builtin.]
>>
>> Is there something more I need to do, such as an import?
>>
>> Please suggest.
>>
>> Thanks & regards
>> Yogesh Kumar
>>
>> From: [EMAIL PROTECTED]
>> Date: Sat, 29 Sep 2012 16:15:18 -0700
>> Subject: Re: how to perform GROUP BY in PIG for this case:
>> To: [EMAIL PROTECTED]
>>
>> answer = foreach (group data by ISOToMonth(Date)) generate group as
>> month, MAX(data.rate) as max_rate;
>>
>> Note, you will need your date in ISO 8601 format. You can use
>> CustomFormatToISO to convert it if it is a string, or UnixToISO if
>> your date is a long.
>>
>> See:
>>
>> http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/evaluation/datetime/convert/CustomFormatToISO.html
>> http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/evaluation/datetime/convert/UnixToISO.html
>> http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/evaluation/datetime/truncate/ISOToMonth.html
>>
>> Russell Jurney http://datasyndrome.com
>>
>> On Sep 29, 2012, at 3:02 PM, yogesh dhari <[EMAIL PROTECTED]> wrote:
>>
>> Hi all,
>>
>> I have this data with the fields (Date, symbol, rate),
>> and I want to group it by month and find the maximum rate value
>> for each month,
>>
>> like: for month (08, 36.3), (09, 36.4), (10, 36.8), (11, 37.5) ..
>>
>> (2009-08-21,CLI,33.38)
>> (2009-08-24,CLI,33.03)
>> (2009-08-25,CLI,33.16)
>> (2009-08-26,CLI,32.78)
>> (2009-08-27,CLI,32.79)
>> (2009-08-28,CLI,33.37)
>> (2009-08-31,CLI,32.51)
>> (2009-09-11,CLI,34.08)
>> (2009-09-14,CLI,35.19)
>> (2009-09-15,CLI,35.82)
>> (2009-09-16,CLI,36.58)
>> (2009-09-24,CLI,33.98)
>> (2009-09-25,CLI,32.44)
>> (2009-09-28,CLI,33.34)
>> (2009-09-29,CLI,33.6)
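
For data shaped like the sample above, where Date is already a yyyy-MM-dd string, the month key could also be taken with Pig's builtin SUBSTRING rather than a Piggybank UDF. This is a swapped-in alternative, not the approach suggested in the replies, and it is only a sketch: the file name 'rates.csv' and its comma-delimited layout are hypothetical.

```pig
-- 'rates.csv' is a hypothetical file with lines such as: 2009-08-21,CLI,33.38
data = load 'rates.csv' using PigStorage(',')
       as (Date:chararray, symbol:chararray, rate:double);

-- SUBSTRING(str, start, stop) excludes the stop index,
-- so (5, 7) picks the two month digits out of yyyy-MM-dd.
by_month = group data by SUBSTRING(Date, 5, 7);

-- One row per month with the largest rate seen in that month.
answer = foreach by_month generate group as month, MAX(data.rate) as max_rate;

dump answer;
```

Note that this key merges the same month from different years; using SUBSTRING(Date, 0, 7) instead would keep the year in the key. No jars need to be registered for this variant.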