This is a little different than how we've done such things before, but how
about a project to get Pig to run on Spark (aka Spork)? The Twitter Pig
folks have some code we'd love to share that got us halfway there, and it was
looking pretty promising (if anyone is curious, it's the "spork" branch on
my GitHub fork of Pig: https://github.com/dvryaboy/pig ).
On Thu, Mar 21, 2013 at 2:05 PM, Prasanth J <[EMAIL PROTECTED]> wrote:
> One more idea for GSoC project.
> YSmart uses correlation between multiple MR jobs to reduce the number of
> MR jobs generated. I remember Dmitriy bringing this up earlier. The
> techniques specified in this paper (Input, Job Flow, Transit correlations)
> have been patched into Hive. If Pig doesn't use these optimizations then I
> think it will be good to have them in Pig as well.
> Here is the link to the paper
> I think this can be a good candidate project for GSoC.
> -- Prasanth
> On Mar 21, 2013, at 3:51 PM, Olga Natkovich <[EMAIL PROTECTED]> wrote:
> > +1 on that
> > ________________________________
> > From: Russell Jurney <[EMAIL PROTECTED]>
> > To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> > Sent: Thursday, March 21, 2013 11:54 AM
> > Subject: Re: Put a "Google summer of code 2013" cwiki page
> > Make Grunt use Antlr - high priority one for me. Once Grunt uses Antlr,
> > macros will flourish.
> > On Wed, Mar 20, 2013 at 6:25 PM, Daniel Dai <[EMAIL PROTECTED]> wrote:
> >> https://cwiki.apache.org/confluence/display/PIG/GSoc2013
> >> Feel free to add more projects that could fit in the timeline of a
> >> student summer project.
> >> I remember there are several projects we discussed in our last meetup:
> >> * Allow Pig to use Hive UDFs. Alan, do we have a ticket for that?
> >> * A general framework for Pig performance testing. Rohini, do we have a
> >> ticket?
> >> Thanks,
> >> Daniel
> > --
> > Russell Jurney twitter.com/rjurney [EMAIL PROTECTED]