Daniel Dai
2013-03-21, 01:25
Russell Jurney
2013-03-21, 18:54
Olga Natkovich
2013-03-21, 19:51
Prasanth J
2013-03-21, 21:05
Dmitriy Ryaboy
2013-03-22, 17:04
Johnny Zhang
2013-03-26, 20:40
Daniel Dai
2013-04-24, 19:00
Daniel Dai
2013-04-24, 19:05
Johnny Zhang
2013-04-24, 22:33
-
Put a "Google summer of code 2013" cwiki page
Daniel Dai 2013-03-21, 01:25
https://cwiki.apache.org/confluence/display/PIG/GSoc2013
Feel free to add more projects that could fit in the timeline of a student summer project.
I remember there are several projects we discussed at our last meetup:
* Allow Pig to use Hive UDFs. Alan, do we have a ticket for that?
* A general framework for Pig performance tests. Rohini, do we have a ticket?
Thanks,
Daniel
-
Re: Put a "Google summer of code 2013" cwiki page
Russell Jurney 2013-03-21, 18:54
Make Grunt use Antlr: a high-priority one for me. Once Grunt uses Antlr, macros will flourish.
--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
-
Re: Put a "Google summer of code 2013" cwiki page
Olga Natkovich 2013-03-21, 19:51
+1 on that
-
Re: Put a "Google summer of code 2013" cwiki page
Prasanth J 2013-03-21, 21:05
One more idea for a GSoC project.
YSmart uses correlations between multiple MR jobs to reduce the number of MR jobs generated. I remember Dmitriy bringing this up earlier. The techniques specified in this paper (Input, Job Flow, and Transit correlations) have been patched into Hive. If Pig doesn't use these optimizations, then I think it would be good to have them in Pig as well.
Here is the link to the paper:
http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf
I think this can be a good candidate project for GSoC.
Thanks
-- Prasanth
-
Re: Put a "Google summer of code 2013" cwiki page
Dmitriy Ryaboy 2013-03-22, 17:04
This is a little different from how we've done such things before, but how about a project to get Pig to run on Spark (aka Spork)? The Twitter Pig folks have some code we'd love to share that got us halfway there. It was looking pretty promising (if anyone is curious, it's the "spork" branch on my GitHub fork of Pig: https://github.com/dvryaboy/pig ).
D
-
Re: Put a "Google summer of code 2013" cwiki page
Johnny Zhang 2013-03-26, 20:40
I have another idea for a GSoC project: running the unit tests in parallel. I think several people mentioned this at the last Pig meetup. The objective is to enable us to run the whole unit test suite before committing any patch. The fix should include two parts:
(1) unit tests must not interfere with each other (e.g. moving the test dir from /tmp to build/test/tmp so that one test doesn't delete another test's dir)
(2) we need to make sure Pig is thread safe
Johnny
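As an illustration of point (1), here is a minimal sketch of giving each test its own scratch directory under build/test/tmp instead of a shared location in /tmp. It is written in Python for brevity and is not Pig's actual Java/JUnit test code; `make_test_dir` and `TEST_TMP_ROOT` are hypothetical names introduced only for this example.

```python
import tempfile
from pathlib import Path

# Assumed base directory, taken from the path suggested in the message above.
TEST_TMP_ROOT = Path("build") / "test" / "tmp"


def make_test_dir(test_name: str, root: Path = TEST_TMP_ROOT) -> Path:
    """Create and return a fresh scratch directory for a single test."""
    root.mkdir(parents=True, exist_ok=True)
    # mkdtemp appends a random suffix, so two tests (or two parallel runs
    # of the same test) never receive the same directory.
    return Path(tempfile.mkdtemp(prefix=f"{test_name}-", dir=root))


if __name__ == "__main__":
    first = make_test_dir("TestExample")
    second = make_test_dir("TestExample")
    # Each call yields a distinct directory, so cleaning up one test's
    # directory cannot delete files that belong to another test.
    print(first)
    print(second)
```

Because every test cleans up only the directory it was handed, tests running at the same time no longer race on a shared path. This addresses only the file-system half of the proposal; thread safety inside Pig itself (point 2) is a separate problem.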
-
Re: Put a "Google summer of code 2013" cwiki page
Daniel Dai 2013-04-24, 19:00
Hi, Johnny,
If you want to mentor this in GSoC, please create a Jira ticket and label it gsoc2013.
Thanks,
Daniel
-
Re: Put a "Google summer of code 2013" cwiki page
Daniel Dai 2013-04-24, 19:05
Oops, sorry, only committers can mentor. But you can create the ticket anyway. Thanks!
-
Re: Put a "Google summer of code 2013" cwiki page
Johnny Zhang 2013-04-24, 22:33
Daniel, thanks for the reminder! I just created a ticket:
https://issues.apache.org/jira/browse/PIG-3296
Thanks,
Johnny