Re: How to combine multiple group by
Alan Gates 2012-05-16, 00:39
Pig will auto-combine these for you. In the script example you gave, Pig should already be combining both group bys into a single MR job. You can check this by running explain on it.
Alan.

On May 15, 2012, at 3:11 PM, shan s wrote:

> Thanks Bill.
>
> My objective is to improve performance. So I do want to combine the logic.
> If we were to do this in java, we could do this in single foreach.
>
> Will the macro help in this regard? Or will it just act as code generator?
>
> On Tue, May 15, 2012 at 8:30 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
>
>> You can combine multiple relations using the UNION operator. If you're
>> trying to combine logic, you can use a macro to do e2-e5 below that takes
>> (e1, empid) or (e1, group). See the example here:
>>
>> http://hortonworks.com/blog/new-apache-pig-features-part-1-macro/
>>
>> On Tue, May 15, 2012 at 6:50 AM, shan s <[EMAIL PROTECTED]> wrote:
>>
>>> How can I combine multiple group by that are performed on essentially
>>> same relation?
>>> In the case below, can I do this in single foreach?
>>>
>>> e1 = load 'emp' using PigStorage() as (empid, school, district, score);
>>>
>>> e2 = group e1 by empid;
>>> e3 = foreach e2 generate group, AVG(e1.score) as s;
>>> e4 = order e3 by s desc;
>>> e5 = limit e4 3;
>>> dump e5;
>>>
>>> e2 = group e1 by school;
>>> e3 = foreach e2 generate group, AVG(e1.score) as s;
>>> e4 = order e3 by s desc;
>>> e5 = limit e4 3;
>>> dump e5;
>>> Thank You,
>>> Prashant.
>>
>> --
>> *Note that I'm no longer using my Yahoo! email address. Please email me at
>> [EMAIL PROTECTED] going forward.*
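
For anyone wanting to try the explain check suggested in this thread, here is a minimal sketch of the same script arranged so the whole thing can be explained at once. It is not taken from the thread: the alias names (by_emp, emp_top, and so on), the output paths 'top_emp' and 'top_school', and the file name topn.pig are all made up for illustration. It also swaps dump for store, on the assumption that multi-query execution applies to store statements in a batch script, while each dump is run on its own.

```pig
-- Load once; both pipelines below read from this same relation.
e1 = load 'emp' using PigStorage() as (empid, school, district, score);

-- Top 3 employees by average score (alias names are hypothetical).
by_emp     = group e1 by empid;
emp_avg    = foreach by_emp generate group, AVG(e1.score) as s;
emp_sorted = order emp_avg by s desc;
emp_top    = limit emp_sorted 3;

-- Top 3 schools by average score.
by_school     = group e1 by school;
school_avg    = foreach by_school generate group, AVG(e1.score) as s;
school_sorted = order school_avg by s desc;
school_top    = limit school_sorted 3;

-- Output paths are assumptions; store is used here instead of dump.
store emp_top    into 'top_emp';
store school_top into 'top_school';

-- To inspect the plan for the whole script (saved as topn.pig, a
-- hypothetical name), run this from the Grunt shell:
--   explain -script topn.pig;
```

In the map-reduce plan that explain prints, the thing to look for is whether both group operations appear under one job reading 'emp', or under two separate jobs. The order and limit steps may still show up as additional jobs after that point.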