Re: Put a "Google summer of code 2013" cwiki page
One more idea for a GSoC project.

YSmart uses correlations between multiple MR jobs to reduce the number of MR jobs generated. I remember Dmitriy bringing this up earlier. The techniques described in the YSmart paper (input, job flow, and transit correlations) have been patched into Hive. If Pig doesn't already use these optimizations, I think it would be good to have them in Pig as well.

Here is the link to the paper: http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf

I think this can be a good candidate project for GSoC.

-- Prasanth

On Mar 21, 2013, at 3:51 PM, Olga Natkovich <[EMAIL PROTECTED]> wrote:

> +1 on that
> ________________________________
> From: Russell Jurney <[EMAIL PROTECTED]>
> Sent: Thursday, March 21, 2013 11:54 AM
> Subject: Re: Put a "Google summer of code 2013" cwiki page
> Make Grunt use Antlr - a high-priority one for me. Once Grunt uses Antlr,
> macros will flourish.
> On Wed, Mar 20, 2013 at 6:25 PM, Daniel Dai <[EMAIL PROTECTED]> wrote:
>> https://cwiki.apache.org/confluence/display/PIG/GSoc2013
>> Feel free to add more projects that could fit in the timeline of a
>> student summer project.
>> I remember there are several projects we discussed in our last meetup:
>> * Allow Pig to use Hive UDFs. Alan, do we have a ticket for that?
>> * A general framework for Pig performance tests. Rohini, do we have a
>> ticket?
>> Thanks,
>> Daniel
> --
> Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com