REMINDER: Pig developer meeting in February
Olga Natkovich 2011-02-03, 18:42

Hi guys,

This is just a reminder that the meeting will be held next Wednesday, 2/9 4-6 pm at Yahoo campus.

If you have not yet responded but are planning to attend, please let me know.

Olga

-----Original Message-----
From: Santhosh Srinivasan [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 28, 2011 3:36 PM
To: [EMAIL PROTECTED]
Subject: RE: Pig developer meeting in February

I am planning to attend.

-----Original Message-----
From: Olga Natkovich [mailto:[EMAIL PROTECTED]]
Sent: Friday, January 28, 2011 12:58 PM
To: [EMAIL PROTECTED]
Subject: RE: Pig developer meeting in February

I believe we have critical mass so the meeting is on!

If you have not responded yet but are planning to attend, please let me know.

Thanks,

Olga

-----Original Message-----
From: Julien Le Dem [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 27, 2011 5:21 PM
To: [EMAIL PROTECTED]
Subject: Re: Pig developer meeting in February

Me too.
Julien

On 1/27/11 4:09 PM, "Dmitriy Ryaboy" <[EMAIL PROTECTED]> wrote:

Ok yeah I'll come :).

On Thu, Jan 27, 2011 at 3:17 PM, Olga Natkovich <[EMAIL PROTECTED]> wrote:

> While there is a lively discussion on this thread, I have not actually
> gotten any responses to having the meeting with exception of 1 person :).
>
> Please, let me know by the end of the week if you are planning to attend.
> If we don't get at least a few more responses I suggest we postpone
> the meeting.
>
> Thanks,
>
> Olga
>
> -----Original Message-----
> From: Dmitriy Ryaboy [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, January 26, 2011 6:04 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Pig developer meeting in February
>
> Right, we do partition filtering, but not true predicate pushdown.
>
> On Wed, Jan 26, 2011 at 5:59 PM, Daniel Dai <[EMAIL PROTECTED]> wrote:
>
> > Are you talking about LoadMetadata.setPartitionFilter?
> > PartitionFilterOptimizer will do that.
> >
> > Daniel
> >
> > Dmitriy Ryaboy wrote:
> >
> >> I may be wrong but I think predicate pushdown is designed for, but
> >> not actually implemented in the current LoadPushdown interface (you
> >> can only push projections). If I am wrong, that's great.. but if
> >> not, that would be an important feature to add, as people are trying
> >> to connect Pig to "smart" storage systems like rdbmses, HBase, and
> >> Cassandra more and more. I think we only kind of simulate this with
> >> partition keys info, which is not always sufficient
> >>
> >> D
> >>
> >> On Wed, Jan 26, 2011 at 2:41 PM, Julien Le Dem <[EMAIL PROTECTED]> wrote:
> >>
> >>> If making Pig Thread safe (i.e.: two threads running a different pig
> >>> script) is important then we need to change some of the APIs from
> >>> static singleton access to a dependency injection pattern.
> >>> In that case, this should probably be done before 1.0
> >>> For example: UDFContext should be passed to the UDF after construction
> >>> (similar to the ServletContext in Servlet or the way Hadoop passes the
> >>> context to tasks)
> >>> Also a clearly separated API that does not depend on the Pig
> >>> implementation would help.
> >>> For example UDFContext is in org.apache.pig.impl.util when it
> >>> would be better in org.apache.pig.api (Or at least an interface
> >>> defining it)
> >>>
> >>> Julien
> >>>
> >>> On 1/24/11 10:14 AM, "Olga Natkovich" <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Hi Guys,
> >>>
> >>> I think it is time for us to have another meeting. Yahoo would be
> >>> happy to host if this works for everybody. How about Wednesday,
> >>> 2/9 4-6 pm.
> >>> Please, let us know if you are planning to attend and if the date/time
> >>> works for you.
> >>>
> >>> Things that come to mind to discuss and as always feel free to
> >>> suggest others:
> >>>
> >>> - Error handling proposal - this might be easier to finalize
> >>>   face-to-face
> >>> - Pig 0.9 plan
> >>> - Pig Roadmap beyond 0.9
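
The suggestion quoted in this message, handing the context to each UDF after construction instead of having the UDF reach for a static singleton, can be sketched roughly as follows. This is only an illustration of the pattern under stated assumptions, not Pig's actual API: ScriptContext, ContextAwareUdf, TagWithScriptName and the "script.name" key are all hypothetical names made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-script context. This is NOT Pig's real UDFContext;
// it only stands in for "state that belongs to one running script".
final class ScriptContext {
    private final Map<String, String> properties = new HashMap<>();

    void set(String key, String value) { properties.put(key, value); }

    String get(String key) { return properties.get(key); }
}

// Hypothetical UDF shape: the runtime injects the context after
// construction, so the UDF never touches a static singleton.
interface ContextAwareUdf {
    void setContext(ScriptContext context);

    String exec(String input);
}

// Example UDF that prefixes each record with the name of the script it runs in.
final class TagWithScriptName implements ContextAwareUdf {
    private ScriptContext context;

    @Override
    public void setContext(ScriptContext context) { this.context = context; }

    @Override
    public String exec(String input) {
        return context.get("script.name") + ":" + input;
    }
}

public class InjectionSketch {
    // Runs one "script": builds its own context, injects it, calls the UDF.
    static String runScript(String scriptName, String record) {
        ScriptContext ctx = new ScriptContext();
        ctx.set("script.name", scriptName);
        ContextAwareUdf udf = new TagWithScriptName();
        udf.setContext(ctx);
        return udf.exec(record);
    }

    public static void main(String[] args) throws InterruptedException {
        String[] results = new String[2];
        // Two threads run different scripts; each has its own context,
        // so neither can overwrite the other's state.
        Thread a = new Thread(() -> results[0] = runScript("scriptA", "row1"));
        Thread b = new Thread(() -> results[1] = runScript("scriptB", "row2"));
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(results[0]);
        System.out.println(results[1]);
    }
}
```

Because each script builds and injects its own context, two threads running different scripts share no mutable state, which is the property the static-singleton approach cannot guarantee.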
-
Re: REMINDER: Pig developer meeting in February
Benjamin Reed 2011-02-03, 19:24

i'll be there.

ben
-
Re: REMINDER: Pig developer meeting in February
Ashutosh Chauhan 2011-02-03, 19:28

I'll be there.

Ashutosh
-
RE: REMINDER: Pig developer meeting in February
Olga Natkovich 2011-02-08, 18:29

Hi Guys,

We are looking forward to seeing you tomorrow at 4 pm at Yahoo campus in Sunnyvale.

Yahoo address is

701 First Ave.
Sunnyvale, CA 94089

We are in building E. Please ask for Alan or me at the reception.

Olga
-
Re: REMINDER: Pig developer meeting in February
Dmitriy Ryaboy 2011-02-09, 05:38

Hi All,

I got sick and won't be able to make it. Would love to see some notes after the meeting :).

D
-
Re: REMINDER: Pig developer meeting in February
Dmitriy Ryaboy 2011-02-12, 01:33

Hi folks,

Any chance someone took notes? :)

D
-
RE: REMINDER: Pig developer meeting in February
Santhosh Srinivasan 2011-02-12, 02:30

Arvind from Cloudera took excellent notes. You should see them next week after Alan gets a chance to review them.

Santhosh
-
Re: REMINDER: Pig developer meeting in February
arvind@...) 2011-02-14, 20:55

Hi,

Sorry for the delay in sending this. Following are the notes from the last developer's meeting.

Arvind
-----------
*Attendees*

- From Y!: Alan, Santosh, Romain, Daniel, Richard, Ashutosh, Ben, Julian
- From Cloudera: Arvind

*Agenda*

- Error Handling
- Brainstorming Ideas For 0.9
- Brainstorming Ideas Beyond 0.9

*Error Handling Suggestions/Proposal Discussion:*

- Allow each statement to declare an ONERROR clause with a UDF to handle the control in case of error.
   - This would be better than the current behavior of exiting on error.
- Alternatively, allow ONERROR to be declared for an entire script/session, which would allow individual statements to override it and provide a more specialized UDF for error handling.
- Yet another alternative - allow the specification of a threshold number of errors that Pig ignores before exiting.
- Key idea is to ensure that the error handling is focused on data error handling and not control-flow.
- Action Item: Post the key proposal on the Wiki.

*Brainstorming Ideas For 0.9:*

- Internal development done by March
- Release tentatively by May
- Support for ILLUSTRATE.
- Current status:
   - Parser rewrite almost complete
   - Working on loading data according to schema - support for padding missing values
   - No support for Boolean type planned yet.
- Big features in 0.9
   - Parser change
   - Macro support
   - Jython/Script support
- Penny (formerly Inspector Gadget): framework to instrument scripts. Allows detection of bad records that cause failures, and implementation of constraints.
   - Works by integrating with the optimizer to produce wrappers for key UDFs of interest.
   - Agents can be added in different parts of the query
   - Prepackaged agents available, but framework allows the creation of custom agents as needed.
   - Pending work - implementation of unit tests, and turning this into a patch.

*Brainstorming Ideas Beyond 0.9:*

- Support for different backends for Pig (MR, Piranha, Local, Oozie)
   - Execution engine that can generate plans specific to the underlying architecture and allow controlling routines to rewrite/re-optimize the plan mid-execution.
- Thread safety when running local jobs - to allow better embedding of Pig as a light-weight tool in web-applications and other multi-threaded environments.
   - Work includes making UDF context thread-safe and removing statics from the implementation.
   - Will benefit Oozie and other systems that embed Pig without having to worry about side-effects.
- Allow execution to resume from where it left off after a runtime failure.
   - May be done by allowing Oozie as a backend where the plan is converted into an Oozie workflow.
   - Alternatively Pig could delegate blocks of execution to Oozie.
- Scalability: Pig should support users who may not know the intricate details of the job/architecture. Things such as memory allocation and skew handling should happen automatically, without user involvement.
- Allow Pig to kill jobs already submitted if the shell exits due to a Control+C or other failures.
- UDF 2.0 - simplify UDF interfaces, along with support for multiple versions of the UDF at the same time.

*General*

- Loops in Pig: No direct support, but available indirectly by integration with scripting environments.
- Would be good to allow Pig to be provisioned across the cluster for faster job startup.
- Pig-pen: not under active development and not supported.
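
One of the error handling alternatives in the notes, a threshold number of errors that are ignored before exiting, can be sketched in plain Java as below. This is a rough illustration of the idea only, not a proposed Pig implementation: ErrorThresholdSketch, mapWithThreshold and maxErrors are hypothetical names, and the notes do not say how Pig would count or report the skipped records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ErrorThresholdSketch {
    /**
     * Applies fn to every record. A record whose processing throws is
     * skipped and counted; once more than maxErrors records have failed,
     * the whole run is aborted. (Hypothetical helper, not a Pig API.)
     */
    static <T, R> List<R> mapWithThreshold(List<T> records, Function<T, R> fn, int maxErrors) {
        List<R> out = new ArrayList<>();
        int errors = 0;
        for (T record : records) {
            try {
                out.add(fn.apply(record));
            } catch (RuntimeException e) {
                errors++;
                if (errors > maxErrors) {
                    throw new IllegalStateException(
                            "error threshold of " + maxErrors + " exceeded", e);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("1", "two", "3", "4x", "5");
        // Two bad records are within the threshold of 2, so the run completes.
        List<Integer> parsed = mapWithThreshold(raw, s -> Integer.parseInt(s), 2);
        System.out.println(parsed); // prints [1, 3, 5]
    }
}
```

With a threshold of 1 the same input would abort on the second bad record ("4x"), which matches the "ignore up to N errors, then exit" behavior described in the notes.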
-
Re: REMINDER: Pig developer meeting in February
Dmitriy Ryaboy 2011-02-14, 23:48
Thanks for that, Arvind.

Y! folks, is there any public documentation for Penny?
Is there overlap there with the error handling proposal?

Also: think error handling can make it into 0.9 or are we thinking 0.10?

D
-
RE: REMINDER: Pig developer meeting in February
Olga Natkovich 2011-02-15, 00:11
We do not have anything public about Penny yet; we are still trying to figure out when/if it is going out. I don't think there is a whole lot of interaction with the error handling proposal, but I will let Alan comment on that.

Given that the error handling proposal is still not finalized and 0.9 already has lots of changes and little time left, I would suggest delaying it to the release after 0.9.

Olga
-
Re: REMINDER: Pig developer meeting in February
Renato Marroquín Mogrovej... 2011-02-15, 00:53
Hey, there are slides from Chris Olston's talk.

http://infolab.stanford.edu/infoseminar/olston.txt
http://infolab.stanford.edu/infoseminar/olston-slides.pdf

But more formal documentation about Penny/InspectorGadget (cool name btw) would be awesome.

Renato M.
-
Re: REMINDER: Pig developer meeting in February
Alan Gates 2011-02-15, 08:54
On Feb 15, 2011, at 5:18 AM, Dmitriy Ryaboy wrote:

> Is there overlap there with the error handling proposal?

I don't think so. The error handling proposal is about how to handle errors that happen when you are running Pig jobs. Penny is a way to instrument your scripts so that you can do things like crash analysis, etc. In that way it's similar to the interface that Java makes available to tools like gcov for doing byte code insertion. Instrumenting code you're using in a production environment would be prohibitively expensive.

> Also: think error handling can make it into 0.9 or are we thinking
> 0.10?

Given that no one is actively working on it and we're hoping to end feature development on 0.9 in the next month, I don't see how there will be time.

Alan.
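For readers trying to picture the difference Alan describes, here is a rough Python sketch of the instrumentation side: a wrapper around a function of interest that records the offending record when it crashes, then lets the failure propagate unchanged. Nothing here reflects Penny's real design (no public documentation exists per this thread); `instrument`, `crash_log` and `parse_age` are hypothetical names chosen for the illustration.

```python
import functools
from typing import Any, Callable, List, Tuple


def instrument(udf: Callable[[Any], Any],
               crash_log: List[Tuple[str, Any, str]]) -> Callable[[Any], Any]:
    """Wrap a function so that any crash records which input caused it."""

    @functools.wraps(udf)
    def wrapper(record: Any) -> Any:
        try:
            return udf(record)
        except Exception as exc:
            # Keep the function name, the bad record and the error for later analysis.
            crash_log.append((udf.__name__, record, repr(exc)))
            # Re-raise: the wrapper observes the failure, it does not handle it.
            raise

    return wrapper


def parse_age(record: dict) -> int:
    # Hypothetical function of interest; fails on non-numeric ages.
    return int(record["age"])
```

The wrapper re-raises on purpose: it is there for crash analysis, while deciding what a job should do about a bad record is the separate concern of the error handling proposal. The extra try/except and logging on every call also hints at why such instrumentation is something you would switch on for debugging rather than leave enabled everywhere.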
-
Re: REMINDER: Pig developer meeting in February
Ashutosh Chauhan 2011-02-15, 09:05
There is related, overlapping work, though with (slightly) different goals and implementations:

http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper37.pdf
http://www.cidrdb.org/cidr2011/Talks/CIDR11_Ikeda.ppt

Ashutosh
-
Re: REMINDER: Pig developer meeting in February
Milind Bhandarkar 2011-02-23, 02:08
On Feb 14, 2011, at 12:55 PM, [EMAIL PROTECTED] wrote:

> - Support for different backends for Pig (MR, Piranha, Local, Oozie)
> - Execution engine that can generate plans specific to the underlying
> architecture and allow controlling routines to
> rewrite/re-optimize the plan
> mid-execution.

+1 for Oozie backend.

> - Allow execution to resume from where it left off after due to runtime
> failure.
> - May be done by allowing Oozie as a backend where the plan is
> converted into an Oozie workflow.
> - Alternatively Pig could delegate blocks of execution to Oozie.

See above. This would be very beneficial for oozie (even more than pig), because no one should be made to program in xml! The current workflow specification is a cruelty!

Pig has constructs to invoke hdfs commands, arbitrary jars, mapreduce codes already. If it has a hive execution mode, e.g.

A = hive("select xyz from etc...");

all of these could be handed off to oozie (I believe alejandro has a hive action for oozie already). Then there would be no "hive vs pig", instead pigs will really eat anything.

- milind

---
Milind Bhandarkar
[EMAIL PROTECTED]