Pig >> mail # user >> creating a graph over time


Re: creating a graph over time
You can loop from Python. I've never tried it, but there is a pretty good
explanation here (
http://ofps.oreilly.com/titles/9781449302641/embedding.html ).
Recently I had to analyze some log files and needed to loop (to
calculate some stats), and I used a UDF for that.

In your case, I would go with Bill's proposal.
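
For reference, the looping part that a UDF would have to do can be sketched in plain Python. This is only an illustration of the idea, not Bill's proposal and not tested inside Pig. It assumes that start-time and duration are both in seconds, that a session covers the half-open range [start, start + duration), and that an interval is identified by its first second. The names `buckets` and `students_per_interval` are made up for this sketch.

```python
from collections import defaultdict

INTERVAL = 30  # seconds, the interval size asked for in the question


def buckets(start, duration, interval=INTERVAL):
    """Return the start timestamp of every interval a session overlaps.

    Assumes the session covers [start, start + duration) in seconds.
    """
    if duration <= 0:
        return []
    first = start - start % interval          # interval containing the first second
    last_second = start + duration - 1        # last second the session is active
    last = last_second - last_second % interval
    return list(range(first, last + 1, interval))


def students_per_interval(rows, interval=INTERVAL):
    """Count distinct students per (course, interval start).

    rows: iterable of (student_id, name, start, duration, course) tuples,
    in the same field order as the sample data in the question.
    """
    seen = defaultdict(set)
    for student_id, _name, start, duration, course in rows:
        for b in buckets(start, duration, interval):
            seen[(course, b)].add(student_id)
    return {key: len(ids) for key, ids in seen.items()}


if __name__ == "__main__":
    sample = [
        (1, "marco", 1319708213, 500, "math"),
        (8, "laura", 1319708233, 123, "math"),
        (2, "ralf", 1319708111, 112, "english"),
    ]
    for (course, bucket), n in sorted(students_per_interval(sample).items()):
        print(course, bucket, n)
```

In Pig terms, `buckets` is the piece that would live in the UDF (returning a bag of interval starts to FLATTEN), and the counting would then be an ordinary GROUP BY on (course, interval). How exactly to wire that up depends on the Pig version, so treat the above as the logic only.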

On Thu, Oct 27, 2011 at 5:56 AM, Marco Cadetg <[EMAIL PROTECTED]> wrote:

> I have a problem, and I don't know how, or even whether, Pig is suitable
> for solving it.
>
> I have a schema like this:
>
> student-id,student-name,start-time,duration,course
> 1,marco,1319708213,500,math
> 2,ralf,1319708111,112,english
> 3,greg,1319708321,333,french
> 4,diva,1319708444,80,english
> 5,susanne,1319708123,2000,math
> 1,marco,1319708564,500,french
> 2,ralf,1319708789,123,french
> 7,fred,1319708213,5675,french
> 8,laura,1319708233,123,math
> 10,sab,1319708999,777,math
> 11,fibo,1319708789,565,math
> 6,dan,1319708456,50,english
> 9,marco,1319708123,60,english
> 12,bo,1319708456,345,math
> 1,marco,1319708789,673,math
> ...
> ...
>
> I would like to produce a graph (interpolation) over time, grouped by
> course: that is, how many students are studying for each course in each
> 30-second interval.
> The grouping by course is easy, but from there I have no clue how to
> achieve the rest. I guess the rest needs to be done via some UDF,
> or is there a way to do this in Pig? I often think that I need a "for
> loop" or something similar in Pig.
>
> Thanks for your help!
> -Marco
>