Olga Natkovich
2009-08-17, 18:46
Dmitriy Ryaboy
2009-08-17, 19:03
Olga Natkovich
2009-08-17, 19:57
Santhosh Srinivasan
2009-08-17, 20:36
Olga Natkovich
2009-08-17, 20:42
Santhosh Srinivasan
2009-08-17, 20:47
Santhosh Srinivasan
2009-08-17, 22:06
Olga Natkovich
2009-08-17, 22:26
Alan Gates
2009-08-18, 16:56
Dmitriy Ryaboy
2009-08-18, 17:05
Alan Gates
2009-08-18, 17:40
-
Pig 0.4.0 release
Olga Natkovich 2009-08-17, 18:46
Pig Developers,
We have made several significant performance and other improvements over the last couple of months:

(1) Added an optimizer with several rules
(2) Introduced skew and merge joins
(3) Cleaned up COUNT and AVG semantics

I think it is time for another release to make this functionality available to users.

I propose that Pig 0.4.0 be released against Hadoop 0.18, since most users are still using this version. Once Hadoop 0.20.1 is released, we will roll Pig 0.5.0 based on Hadoop 0.20.

Please vote on the proposal by Thursday.

Olga
-
Re: Pig 0.4.0 release
Dmitriy Ryaboy 2009-08-17, 19:03
Olga,
Do non-committers get a vote?

Zebra is in trunk but relies on Hadoop 0.20, which is somewhat inconsistent even if it's in contrib/.

I would love to see dynamic (or at least static) shims incorporated into the 0.4 release (see PIG-660, PIG-924).

There are a couple of bugs still outstanding that I think would need to get fixed before a release:

https://issues.apache.org/jira/browse/PIG-859
https://issues.apache.org/jira/browse/PIG-925

I think all of these can be solved within a week; assuming we are talking about a release after these go into trunk, +1.

-D
-
RE: Pig 0.4.0 release
Olga Natkovich 2009-08-17, 19:57
Hi Dmitriy,

Non-committers get a non-binding vote.

Zebra needs Hadoop 0.20.1 because it relies on TFile functionality that is not available in Hadoop 0.20. In general, the recommendation from the Hadoop team is to wait until Hadoop 0.20.1 is released.

As for the remaining issues, while I see that it would be nice to resolve them, I don't see them as blockers for Pig 0.4.0. My plan was to release what's currently in trunk and follow up with patch releases if needed.

Olga
-
RE: Pig 0.4.0 release
Santhosh Srinivasan 2009-08-17, 20:36
I have a question:
Will we be able to fix piggybank sources given that Zebra needs Hadoop 0.20 and the rest of Pig requires 0.18? If the answer is yes, then +1 for the release.

I agree with the plan of making 0.4.0 with Hadoop-0.18 and a later release (0.5.0) for Hadoop-0.20.1.

Thanks,
Santhosh
-
RE: Pig 0.4.0 release
Olga Natkovich 2009-08-17, 20:42
Hi Santhosh,
What do you mean by "fixing piggybank"?

Olga
-
RE: Pig 0.4.0 release
Santhosh Srinivasan 2009-08-17, 20:47
Till we release 0.5.0, will zebra's requirement on 0.20 prevent any bugs/issues with Piggybank?
Santhosh
-
RE: Pig 0.4.0 release
Santhosh Srinivasan 2009-08-17, 22:06
Rephrasing my question:
Until we release 0.5.0, will Zebra's requirement on hadoop-0.20 prevent us from fixing any bugs/issues with Piggybank?

Santhosh
-
RE: Pig 0.4.0 release
Olga Natkovich 2009-08-17, 22:26
I don't think so. Each contrib project for now has a separate build.xml.
Olga
-
Re: Pig 0.4.0 release
Alan Gates 2009-08-18, 16:56
Non-committers certainly get a vote, it just isn't binding.
I agree on PIG-925 as a blocker. I don't see PIG-859 as a blocker, since there is a simple workaround.

If we want to release 0.4.0 within a week or so, dynamic shims won't be an option, because we won't be able to solve the bundled Hadoop lib problem in that amount of time. I agree that we are not making life easy enough for users who want to build with Hadoop 0.20. Based on comments on the JIRA, I'm not sure the patch for the static shims is ready. What if we instead checked in a version of hadoop20.jar that will work for users who want to build with 0.20? This way users can still build against 0.20 if they want, and our release isn't blocked on the patch.

Alan.
-
Re: Pig 0.4.0 release
Dmitriy Ryaboy 2009-08-18, 17:05
I am about to submit a cleaned up patch for 924.
It works fine as a static patch (in fact, I can attach it to 660 as well): compiling with -Dhadoop.version=XX works as proposed for the static shims. It does the necessary prep for the code to be able to switch based on what's in its classpath, but it does not require unbundling to work statically.

The hadoop20 jar attached to the Zebra ticket is built differently from the 18 and 19 jars; it does not report its version (18 and 19 do). Right now I get around it by hard-coding a special case ("Unknown" => 20), but that's obviously suboptimal. Could someone rebuild hadoop20.jar the way Pig wants it, with the proper version identification?

If that happens, 924/660 can go in together with hadoop20.jar, and users will at least be able to build against a static version of Hadoop without requiring a patch.

-Dmitriy
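For readers following the shim discussion, here is a minimal sketch of the version-to-shim mapping described in the message above, including the "Unknown" => 20 special case. It is not the code from the PIG-924 patch: the class name ShimSelector and the methods shimVersion and reportedVersion are hypothetical names chosen for illustration. The reflective lookup assumes that Hadoop exposes its version string through org.apache.hadoop.util.VersionInfo.getVersion(); when Hadoop is not on the classpath, the sketch simply returns null.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: maps a Hadoop version string to a shim number.
final class ShimSelector {
    // Accepts strings such as "0.18.3", "0.20" or "0.20.1-dev" and captures 18 or 20.
    private static final Pattern VERSION = Pattern.compile("^0\\.(\\d+)(?:[.-].*)?$");

    // Returns the shim number (18, 19, 20, ...) for a reported version string.
    static int shimVersion(String reported) {
        if (reported == null) {
            throw new IllegalArgumentException("no Hadoop version reported");
        }
        String v = reported.trim();
        if (v.equalsIgnoreCase("Unknown")) {
            // Special case from the thread: the jar that does not report
            // its version is assumed to be the 0.20 build.
            return 20;
        }
        Matcher m = VERSION.matcher(v);
        if (!m.matches()) {
            throw new IllegalArgumentException("unrecognized Hadoop version: " + reported);
        }
        return Integer.parseInt(m.group(1));
    }

    // Asks whatever Hadoop is on the classpath for its version, via reflection,
    // so that this sketch compiles and runs without a Hadoop dependency.
    // Returns null when Hadoop is absent or the lookup fails.
    static String reportedVersion() {
        try {
            Class<?> info = Class.forName("org.apache.hadoop.util.VersionInfo");
            return (String) info.getMethod("getVersion").invoke(null);
        } catch (ReflectiveOperationException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        for (String v : new String[] {"0.18.3", "0.19.1", "0.20.0", "Unknown"}) {
            System.out.println(v + " -> shim " + shimVersion(v));
        }
        String onClasspath = reportedVersion();
        System.out.println("Hadoop on classpath: "
                + (onClasspath == null ? "none found" : onClasspath));
    }
}
```

A static build would fix the shim number at compile time (for example from -Dhadoop.version=XX), while a dynamic shim layer would call something like reportedVersion() at startup. The "Unknown" branch is the part a properly versioned hadoop20.jar would make unnecessary.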
-
Re: Pig 0.4.0 release
Alan Gates 2009-08-18, 17:40
On Aug 18, 2009, at 10:05 AM, Dmitriy Ryaboy wrote:

> I am about to submit a cleaned up patch for 924.
> It works fine as a static patch (in fact I can attach it to 660 as
> well) -- compiling with -Dhadoop.version=XX works as proposed for the
> static shims. It does the necessary prep for the code to be able to
> switch based on what's in its classpath, but it does not require
> unbundling to work statically.

Ok, we'll take a look.

> The hadoop20 jar attached to the zebra ticket is built in a different
> way than 18 and 19; it does not report its version (18 and 19 do).
> Right now I get around it by hard-coding a special case ("Unknown" =>
> 20), but that's obviously suboptimal. Could someone rebuild
> hadoop20.jar the way Pig wants it, and with the proper version
> identification? If that happens, 924/660 can go in together with
> hadoop20.jar and users will at least be able to build against a static
> version of hadoop without requiring a patch.

The Hadoop 0.20 jar submitted with Zebra is not a standard jar. It has extra TFile functionality that was not in 0.20 but will be in 0.20.1. It isn't something we should publish. If we put a hadoop20.jar into Pig's lib, it should be from 0.20 (or, when available, 0.20.1).

Alan.