Re: Aggregation for chronologically ordered dataset
Hello everyone,
To clarify, it is like a local SUM operation over consecutive rows with the same activity.

Any pointers or hints would be much appreciated.
Let me know if any additional info is required.
Thanks,

On Fri, Mar 15, 2013 at 10:33 PM, pranjal rajput <[EMAIL PROTECTED]> wrote:

> Hi,
> I am new to Pig.
> I have a dataset from a time-tracker application.
> It records the time that users spend on various activities.
> For example:
> UserId | Activity    | Tool  | BeginTime | EndTime | DurationMinute
> 1      | development | tool1 | 10:00     | 10:15   | 15
> 1      | development | tool2 | 10:15     | 10:30   | 15
> 1      | other       | tool3 | 10:30     | 11:00   | 30
> 1      | development | tool1 | 11:00     | 11:20   | 20
> 1      | other       | tool4 | 11:20     | 12:00   | 40
> 1      | development | tool1 | 12:00     | 12:15   | 15
> 2      | other       | tool3 | 10:00     | 11:00   | 60
> 2      | development | tool1 | 11:00     | 11:20   | 20
> 2      | development | tool2 | 11:20     | 11:30   | 10
>
> I wish to find the uninterrupted time slots spent on
> Activity=development, like this:
>
> UserId | Activity    | SumDurationMinutes
> 1      | development | 30   /* notice that two slots are summed */
> 1      | other       | 30
> 1      | development | 20
> 1      | other       | 40
> 1      | development | 15
> 2      | other       | 60
> 2      | development | 30   /* again a sum */
>
> How can this be done in Pig?
> I am open to writing a UDF for this, or any other workaround.
> Thanks in anticipation,
>
> --
> Best Regards
> Pranjal Rajput
>
>
--
Best Regards
Pranjal Rajput
+91-81090-71747
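
The "local SUM" asked about here amounts to summing durations over runs of consecutive rows that share the same UserId and Activity. Below is a minimal sketch of that logic in plain Python, not a Pig solution from the thread. It assumes the rows are already sorted by user and then by begin time; the names `ROWS` and `sum_uninterrupted` are hypothetical and chosen only for illustration.

```python
from itertools import groupby

# Sample rows copied from the question:
# (user_id, activity, tool, begin_time, end_time, duration_minutes).
# Assumption: rows are already sorted by user_id, then by begin_time.
ROWS = [
    (1, "development", "tool1", "10:00", "10:15", 15),
    (1, "development", "tool2", "10:15", "10:30", 15),
    (1, "other",       "tool3", "10:30", "11:00", 30),
    (1, "development", "tool1", "11:00", "11:20", 20),
    (1, "other",       "tool4", "11:20", "12:00", 40),
    (1, "development", "tool1", "12:00", "12:15", 15),
    (2, "other",       "tool3", "10:00", "11:00", 60),
    (2, "development", "tool1", "11:00", "11:20", 20),
    (2, "development", "tool2", "11:20", "11:30", 10),
]


def sum_uninterrupted(rows):
    """Sum durations over consecutive rows with the same user and activity."""
    result = []
    # groupby only merges *adjacent* rows with equal keys, so a later run of
    # the same activity starts a new group instead of joining an earlier one.
    for (user_id, activity), run in groupby(rows, key=lambda r: (r[0], r[1])):
        result.append((user_id, activity, sum(r[5] for r in run)))
    return result


if __name__ == "__main__":
    for row in sum_uninterrupted(ROWS):
        print(row)
```

On the sample data this yields the seven rows listed in the desired output above. One possible way to carry the idea over to Pig would be to group by user, order each user's bag by begin time, and apply the same run-merging logic inside a UDF; that arrangement is an assumption here, not something confirmed in the thread.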