Pig user mailing list

Cumulative totals in an ORDERed relation.

Is there a mechanism by which I could make a value accumulate across the
rows of a relation? What I'd like to do is something along the lines of
having a long called accumulator and an outer bag called hourlyTotals
with a schema of (hour:int, collected:int), used roughly as follows:

accumulator = 0L; -- I know this line doesn't work
hourlyTotals = ORDER hourlyTotals BY collected;
cumulativeTotals = FOREACH hourlyTotals {
    accumulator += collected;
    GENERATE hour, accumulator AS collected;
}

Could something like this be made to work? Is there something similar that
I can do instead? Do I just need to pipe the relation through an
external script to get what I want?
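
For reference, the external-script route might look something like the
sketch below. It is not a tested Pig solution, only an illustration of the
running-total logic. It assumes the script receives the already ORDERed rows
as tab-separated "hour<TAB>collected" lines on stdin (for example through
Pig's STREAM ... THROUGH operator), and that every row passes through a
single instance of the script in order. If the stream is split across
several tasks, each instance would keep its own separate total. The function
name cumulative_totals is made up for this example.

```python
import sys


def cumulative_totals(lines):
    """Yield (hour, running_total) pairs for tab-separated input lines.

    Assumption: each line looks like 'hour<TAB>collected' and the lines
    arrive in the order in which the totals should accumulate.
    """
    running_total = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        hour, collected = line.split("\t")
        running_total += int(collected)
        yield hour, running_total


def main():
    # Read rows from stdin and write 'hour<TAB>running_total' to stdout.
    for hour, total in cumulative_totals(sys.stdin):
        sys.stdout.write(f"{hour}\t{total}\n")


if __name__ == "__main__":
    main()
```

The accumulator lives in ordinary script state here, which is exactly the
part that the FOREACH block above cannot express.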


Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3