Pig >> mail # user >> Cumulative totals in an ORDERed relation.


Cumulative totals in an ORDERed relation.
Hello,

Is there a mechanism by which I could make a value accumulate across the
rows of a relation? What I'd like is something along the lines of a long
called accumulator and an outer bag called hourlyTotals with a schema of
(hour:int, collected:int), used like this:

accumulator = 0L; -- I know this line doesn't work
hourlyTotals = ORDER hourlyTotals BY collected;
cumulativeTotals = FOREACH hourlyTotals {
    accumulator += collected; -- nor this one: a running sum carried from row to row
    GENERATE hour, accumulator AS collected;
}

Could something like this be made to work? Is there something similar that
I can do instead? Do I just need to pipe the relation through an
external script to get what I want?
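
To make that last option concrete, the kind of external script I have in
mind is sketched below. It is only an illustration: the function name
running_totals, the tab-separated "hour, collected" input format, and the
assumption that every row reaches one single process, already in the
desired order, are all assumptions of this sketch and not anything Pig
guarantees.

```python
import sys
from typing import Iterable, Iterator


def running_totals(lines: Iterable[str]) -> Iterator[str]:
    """Yield 'hour<TAB>cumulative' for each 'hour<TAB>collected' input line.

    Assumes (hypothetically) that the lines arrive already sorted and that
    all of them pass through this one process.
    """
    accumulator = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        hour, collected = line.split("\t")
        accumulator += int(collected)  # carry the sum from row to row
        yield f"{hour}\t{accumulator}"


if __name__ == "__main__":
    # Read rows from stdin and write the cumulative rows to stdout.
    for row in running_totals(sys.stdin):
        print(row)
```

Presumably something like this could be hooked in through Pig's STREAM
operator, but the totals would only be right if the whole relation went
through a single instance of the script, which is why I'd rather stay
inside Pig if there's a way.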

Thanks,
Kris

--
Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3