Pig >> mail # user >> how to perform GROUP BY in PIG for this case:


how to perform GROUP BY in PIG for this case:


Hi all,

I have the data below, with the fields (date, symbol, rate).

I want to group it by month and find the maximum rate value for each month.

The output should look like: (08, 36.3), (09, 36.4), (10, 36.8), (11, 37.5) ...

(2009-08-21,CLI,33.38)

(2009-08-24,CLI,33.03)

(2009-08-25,CLI,33.16)

(2009-08-26,CLI,32.78)

(2009-08-27,CLI,32.79)

(2009-08-28,CLI,33.37)

(2009-08-31,CLI,32.51)

(2009-09-11,CLI,34.08)

(2009-09-14,CLI,35.19)

(2009-09-15,CLI,35.82)

(2009-09-16,CLI,36.58)

(2009-09-24,CLI,33.98)

(2009-09-25,CLI,32.44)

(2009-09-28,CLI,33.34)

(2009-09-29,CLI,33.6)

(2009-09-30,CLI,33.24)

(2009-10-01,CLI,31.98)

(2009-10-02,CLI,31.21)

(2009-10-05,CLI,31.31)

(2009-10-21,CLI,32.86)

(2009-10-26,CLI,33.15)

(2009-10-27,CLI,32.71)

(2009-10-28,CLI,32.03)

(2009-10-29,CLI,32.05)

(2009-10-30,CLI,31.88)

(2009-11-02,CLI,31.88)

(2009-11-03,CLI,31.16)

(2009-11-04,CLI,31.47)

(2009-11-09,CLI,31.59)

(2009-11-25,CLI,30.58)

(2009-11-27,CLI,30.19)

(2009-11-30,CLI,30.86)

(2009-12-01,CLI,31.74)

(2009-12-02,CLI,32.62)

(2009-12-03,CLI,33.43)

(2009-12-04,CLI,34.12)

(2009-12-07,CLI,33.77)

(2009-12-08,CLI,33.8)

(2009-12-09,CLI,33.71)
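
To make the requested result concrete, here is a minimal sketch of the computation in plain Python rather than Pig. It assumes the date field is a string in YYYY-MM-DD form, so the month is the two characters at positions 5-6. The function name `max_rate_per_month` and the `sample` list (a handful of rows copied from the data above) are illustrative only.

```python
def max_rate_per_month(rows):
    """Return sorted (month, max_rate) pairs.

    rows: iterable of (date, symbol, rate) tuples, where date is assumed
    to be a 'YYYY-MM-DD' string.
    """
    best = {}
    for date, _symbol, rate in rows:
        # Assumption: group on the month only, as in the question.
        # If the data spanned several years, date[:7] ('YYYY-MM') would
        # be the safer key.
        month = date[5:7]
        if month not in best or rate > best[month]:
            best[month] = rate
    return sorted(best.items())


# A few rows taken from the data above, at least one per month.
sample = [
    ("2009-08-21", "CLI", 33.38),
    ("2009-08-31", "CLI", 32.51),
    ("2009-09-16", "CLI", 36.58),
    ("2009-09-30", "CLI", 33.24),
    ("2009-10-26", "CLI", 33.15),
    ("2009-11-02", "CLI", 31.88),
    ("2009-12-04", "CLI", 34.12),
]

result = max_rate_per_month(sample)
for month, rate in result:
    print((month, rate))
```

The same shape carries over to Pig in principle: derive a month key from the date, group on that key, then take the maximum rate within each group.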

Please help with suggestions.

Thanks & Regards

Yogesh Kumar
     