Pig >> mail # dev >> Put a "Google summer of code 2013" cwiki page


Re: Put a "Google summer of code 2013" cwiki page
This is a little different than how we've done such things before, but how
about a project to get Pig to run on Spark (aka, Spork)? The Twitter Pig
folks have some code we'd love to share that got us half-way there, and it
was looking pretty promising (if anyone is curious, it's the "spork" branch
on my github fork of pig: https://github.com/dvryaboy/pig )

D

On Thu, Mar 21, 2013 at 2:05 PM, Prasanth J <[EMAIL PROTECTED]> wrote:

> One more idea for GSoC project.
>
> YSmart uses correlations between multiple MR jobs to reduce the number of
> MR jobs generated. I remember Dmitriy bringing this up earlier. The
> techniques specified in this paper (Input, Job Flow, Transit correlations)
> have been patched into Hive. If Pig doesn't use these optimizations then I
> think it will be good to have them in Pig as well.
>
> Here is the link to the paper
> http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf
>
> I think this can be a good candidate project for GSoC.
>
> Thanks
> -- Prasanth
>
> On Mar 21, 2013, at 3:51 PM, Olga Natkovich <[EMAIL PROTECTED]> wrote:
>
> > +1 on that
> >
> >
> > ________________________________
> > From: Russell Jurney <[EMAIL PROTECTED]>
> > To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> > Sent: Thursday, March 21, 2013 11:54 AM
> > Subject: Re: Put a "Google summer of code 2013" cwiki page
> >
> > Make Grunt use Antlr - high priority one for me. Once Grunt uses Antlr,
> > macros will flourish.
> >
> >
> > On Wed, Mar 20, 2013 at 6:25 PM, Daniel Dai <[EMAIL PROTECTED]> wrote:
> >
> >> https://cwiki.apache.org/confluence/display/PIG/GSoc2013
> >>
> >> Feel free to add more projects that could fit in the timeline of a
> >> student summer project.
> >>
> >> I remember there are several projects we discussed in our last meetup:
> >> * Allow Pig to use Hive UDFs. Alan, do we have a ticket for that?
> >> * A general framework for Pig performance tests. Rohini, do we have a
> >> ticket?
> >>
> >> Thanks,
> >> Daniel
> >>
> >
> >
> >
> > --
> > Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
>
>