-
DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Stack 2010-12-22, 23:30
I propose cutting a release from the tip of the branch-0.20-append branch [1]. I suggest the release be called hadoop-0.20.0-append. I volunteer to run the release process. Are folks OK with this?

Here's some background.

The branch-0.20-append was forked from branch-0.20 a few months ago by Dhruba to add an append/sync to 0.20.x era HDFS. The added append facility is made of the patches attached to HDFS-200 and then a bunch of fixup patches done by Dhruba, Hairong, Nicolas, Todd, and others. For a complete list of differences from the tip of the Hadoop branch-0.20, see the CHANGES.txt file in branch-0.20-append [2]. The HDFS-200 append/sync is not the same as the append/sync implementation that is in hadoop 0.21.x and hadoop TRUNK.

The branch-0.20-append is a relatively small deviation from hadoop 0.20.x for those who want an append/sync in an (Apache) hadoop 0.20.x [3]. It's for those unwilling to upgrade their clusters to hadoop 0.21.0 and for those who can't wait on the coming hadoop 0.22.0. For applications like HBase [4], which runs on HDFS and "loses data" without a working append/sync, it's critical that there is an Apache release with a working append/sync.

A few of us have been playing with this branch for a while and it seems to do the right thing. It's fairly close to what FB runs internally (correct me if I'm wrong in this last statement Dhruba).

Thanks,
St.Ack

1. http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-append/
2. http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-append/CHANGES.txt?view=markup
3. Cloudera's CDH3Beta2/3 already include an append/sync based off the HDFS-200++ work. There is no 'official' Apache hadoop 0.20.x with a working append/sync.
4. http://hbase.apache.org
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Ian Holsman 2010-12-23, 00:05
Hi St.Ack.

In general I'm opposed to such a thing.

There are already 5 Hadoop 20.x releases out there, I don't think there is a need for another. (personal opinion, not a veto or speaking as the chair)

Is there a reason why we couldn't create a hadoop 0.20.3 release that has this patch inside of it, as well as other fixes that have been applied since 0.20.2 (~26 patches)? Would this be too much effort for you to RM? I understand there is a large QA effort you would be taking on if you do.

Do the Append/Sync semantics break/deviate too much from 0.20.2?

I really don't want to come to a^h^h^h^hget out of the situation where we have multiple releases of 0.20 each with a unique feature.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Stack 2010-12-23, 00:33
On Wed, Dec 22, 2010 at 4:05 PM, Ian Holsman <[EMAIL PROTECTED]> wrote:
> There are already 5 Hadoop 20.x releases out there, I don't think there is a need for another. (personal opinion, not a veto or speaking as the chair)

Are you counting other than Apache releases? (I see only 4 here, two of which probably should be removed: http://www.gtlib.gatech.edu/pub/apache//hadoop/core/.)

> Is there a reason why we couldn't create a hadoop 0.20.3 release that has this patch inside of it, as well as other fixes that have been applied since 0.20.2 (~26 patches)? Would this be too much effort for you to RM?..

I'd like that but my sense is the general populace of hadoopers would think the append/sync suite of patches destabilizing -- append/sync has a long 'history' in hadoop -- and a violation of the general principle that only bug fixes are added on a branch.

> I really don't want to come to a^h^h^h^hget out of the situation where we have multiple releases of 0.20 each with a unique feature.

Sure. The notion has been broached before up on these lists -- e.g. there was talk of a 0.20 Apache release that had security in it -- and at the time folks seemed amenable.

Thanks for getting the discussion off the ground,
St.Ack
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Ian Holsman 2010-12-23, 01:03
On Dec 23, 2010, at 11:33 AM, Stack wrote:
> On Wed, Dec 22, 2010 at 4:05 PM, Ian Holsman <[EMAIL PROTECTED]> wrote:
>> There are already 5 Hadoop 20.x releases out there, I don't think there is a need for another. (personal opinion, not a veto or speaking as the chair)
>
> Are you counting other than Apache releases? (I see only 4 here, two of which probably should be removed: http://www.gtlib.gatech.edu/pub/apache//hadoop/core/.)

Yes, I was referring to the external companies who have decided to release their own version, for their own business purposes. (please don't take that as a negative).

>> Is there a reason why we couldn't create a hadoop 0.20.3 release that has this patch inside of it, as well as other fixes that have been applied since 0.20.2 (~26 patches)? Would this be too much effort for you to RM?..
>
> I'd like that but my sense is the general populace of hadoopers would think the append/sync suite of patches destabilizing -- append/sync has a long 'history' in hadoop -- and a violation of the general principal that bug fixes only are added on a branch.

I'm open to adding it, as lack of append/sync could be seen as a bug to some. (yes I'm playing with words)

>> I really don't want to come to a^h^h^h^hget out of the situation where we have multiple releases of 0.20 each with a unique feature.
>
> Sure. The notion has been broached before up on these lists -- e.g. there was talk of a 0.20 Apache release that had security in it -- and at the time folks seemed amenable.

I think that approach encourages groups of individuals/companies to huddle up together to build large features without taking the larger group into account and then 'drop' the feature off and wait for others to thank them & port it to their releases. We then become multiple communities instead of a single one.

We will end up with Apache+Security release vs Apache+Append release vs Apache+Avatar release, with various bug-fixes sprinkled into each. And I'm not sure which release Pig or HBase would target to develop against.

That's why I think we should go to 0.22 ASAP and get companies to build their new features on trunk against that.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Ted Yu 2010-12-23, 01:46
> Thats why I think we should go to 0.22 ASAP and get companies to build their new features on trunk against that.

There was a thread in Nov - 'Caution using Hadoop 0.21'
It would be helpful to see the response to 0.22.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Andrew Purtell 2010-12-23, 01:47
I'm on the HBase PMC.

> We will end up with Apache+Security release vs Apache+Append release vs Apache+Avatar release,

The current situation is pretty close to this.

HBase has no suitable binary ASF Hadoop release to work against, currently. Vanilla version 0.20 does not have sync/append support. We recommend users adopt Cloudera's CDH3 beta 2, or compile the 0.20-append branch from source. Version 0.21 is marked as unstable, was not tested at scale by Yahoo (unlike 0.20), and has been panned by many would-be adopters, if the various tweets and blog posts I have seen in that regard are any indication.

> Thats why I think we should go to 0.22 ASAP and get companies to build their new features on trunk against that.

If Hadoop 0.22 is not vetted at high scale as was 0.20 -- this is the current situation with 0.21 -- then I fear the current situation will not change and HBase will still have to refer would-be users to a non-ASF release or a source-only branch.

Best regards,

- Andy

Problems worthy of attack prove their worth by hitting back.
- Piet Hein (via Tom White)
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Ian Holsman 2010-12-23, 03:31
On Dec 23, 2010, at 12:47 PM, Andrew Purtell wrote:
> I'm on the HBase PMC.
>
>> We will end up with Apache+Security release vs Apache+Append release vs Apache+Avatar release,
>
> The current situation is pretty close to this.

Agreed, and I would like to make it better.

> HBase has no suitable binary ASF Hadoop release to work against, currently. Vanilla version 0.20 does not have sync/append support. We recommend users adopt Cloudera's CDH3 beta 2, or compile the 0.20-append branch from source. Version 0.21 is marked as unstable, was not tested at scale by Yahoo (unlike 0.20), and has been panned by many would be adopters, if the various tweets and blog posts I have seen in that regard are any indication.

I'd like to make two points here:

1. There is no substitute for your own QA team. You can never rely on a single company to do your testing for you. While it was great Yahoo tested the initial releases, you can see by their own distribution that what they were/are running is different to what other people are running.

It is better for people not to blindly trust that something will work for them just because company X claims to be running it, and we cannot rely on an individual or a single company to provide that service going forward. Communities don't work that way. And reliance on a single company to provide your core infrastructure gratis isn't really going to end well either.

Saying that, we are very lucky that Yahoo has chosen to openly contribute as much as they have, and I look forward to the contributions and participation of them and other large installations going forward.

2. Hadoop is only one piece of the puzzle for most installations. One of the other issues with 0.21 (and with future releases going forward) is that 3rd parties did not port/upgrade their software to run with our new APIs. Without major software like HBase, Pig, Hive being able to run on the platform, major installations won't even bother looking at it.

I don't expect people to immediately upgrade to 0.22 when we release it. I expect it will take a good 3-6 months until people have the software they run available on it, and possibly a point release with some of the problems people have found in their own testing fixed in our and other software.

Like I said, I don't mind getting 0.20.3 released with the append/sync patch applied to it (with the other 20 or so patches), but I don't think the Hadoop team is large enough to support all the different releases as-is, let alone another one.

--Ian
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Owen O'Malley 2010-12-23, 05:31
On Wed, Dec 22, 2010 at 4:05 PM, Ian Holsman <[EMAIL PROTECTED]> wrote:
> Hi St.Ack.
>
> In general I'm opposed to such a thing.
>
> There are already 5 Hadoop 20.x releases out there, I don't think there is a need for another.

Stack is trying to get an Apache version of Hadoop that solves his problem. He's been asking for it for a year now. If the 20-append branch is stable, we should release it.

> Is there a reason why we couldn't create a hadoop 0.20.3 release that has this patch inside of it, as well as other fixes that have been applied since 0.20.2 (~26 patches)?

It is of course possible, but no one has done the work to build the version and test it out.

As to what to call it, I'm a little hesitant to call it 0.20.3 at this point, since the last time I asked it was considered a fairly risky change. If it goes badly, it would mean an incompatible revert onto a branch that has been stable for 1.5 years. I'd much rather call it 0.20-append.0 and use 0.20.3 for straight bugfixes.

-- Owen
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Stack 2010-12-23, 06:16
On Wed, Dec 22, 2010 at 5:03 PM, Ian Holsman <[EMAIL PROTECTED]> wrote:
>> Are you counting other than Apache releases? (I see only 4 here, two of which probably should be removed: http://www.gtlib.gatech.edu/pub/apache//hadoop/core/.)
>
> yes.. I was referring to the external companies who have decided to release their own version, for their own business purposes. (please don't take that as a negative).

Oh. I was not counting those at all.

Currently, over in HBase we tell users to build and 'trust' their own Hadoop binary from the tip of what to them probably looks like some random Hadoop branch OR go get the slick Cloudera pre-builts since Cloudera's CDH3s have append/sync. By offering to release a hadoop-0.20.0-append, I was just trying to make amends for a gaping hole in the Apache Hadoop offering.

>>> Is there a reason why we couldn't create a hadoop 0.20.3 release that has this patch inside of it, as well as other fixes that have been applied since 0.20.2 (~26 patches)? Would this be too much effort for you to RM?..
>> ...
>
> I'm open with adding it, as lack of append/sync could be seen as a bug to some. (yes i'm playing with words)

My guess is that few would see it the way you do. Append/sync has had a long torturous history. HADOOP-1700, the original append issue, was originally opened in August 2007. There have been two implementations. The one in branch-0.20-append is the 'deprecated' implementation; i.e. it's not the append that is in Hadoop TRUNK (though IIUC the 'deprecated' append runs on the largest 'known' HDFS cluster). At least once, append was part of a release and then pulled because it was 'destabilizing'. It might be hard getting such a storied, scarred feature in as a 'bug fix'. If it did go in, the append/sync is of such a reputation that it might sully the current good standing hadoop 0.20 branch releases hold.

That said, I'm cavalier and if others are game, I'd be up for running a 0.20.3 release that included it.

> Thats why I think we should go to 0.22 ASAP and get companies to build their new features on trunk against that.

Waiting on 0.22 and its adoption is not going to work for HBase. The HBase project would be long dead if waiting on 0.22 were the only option available to us. In fact we'd be dead already if it wasn't for the lifeline thrown us by the folks who hooked us up with branch-0.20-append.

St.Ack
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Ian Holsman 2010-12-23, 06:24
In that case, I'm +1 on releasing a 20+append branch, but am nervous about how much effort will be put into testing it. But this option is better than the current Apache alternative out there, as you and Owen mentioned.

---
Ian Holsman - 703 879-3128
I saw the angel in the marble and carved until I set him free -- Michelangelo
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Roy T. Fielding 2010-12-23, 07:07
On Dec 22, 2010, at 10:24 PM, Ian Holsman wrote:
> In that case, I'm +1 on releasing a 20+append branch, but am nervous on how much effort will be put into testing it. But this option is better than the current apache alternative out there as you and Owen mentioned.

Features are not release version tags. If there is a security bug found then we would have to release a new version of the append version, and a round of severe trout slapping would result.

If someone builds the source package and there are enough votes to release it, then the version number is one of 0.22.0 (assuming trunk hasn't been released yet), 0.23.0 (if 0.22.x is already published), or 1.0.0.

That is, unless the PMC decides to make it a separate product, in which case it would be hadoopend-0.20.0 (or something like that) and will either die a slow and painful death or someone else will pick it up and fork the project.

....Roy
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Owen O'Malley 2010-12-23, 08:00
On Wed, Dec 22, 2010 at 11:07 PM, Roy T. Fielding <[EMAIL PROTECTED]> wrote:
> Features are not release version tags. If there is a security bug found then we would have to release a new version of the append version, and a round of severe trout slapping would result.

Yeah, it isn't a perfect solution and it doesn't scale to a second tag, but the problem is that this is effectively a release branch between 0.20 and 0.21. Of course I agree that any critical bugs would need to be fixed in the append branch as well as the 0.20 and 0.21 branches.

If you want to stick to pure numbers and we want to leave ourselves a way to bugfix the 0.20 branch without append, we could use a version string like 0.20.100, etc. Not pretty, but it does preserve the numeric ordering and suggest a version jump.

If I remember right, there were also protocol changes in the append branch, which was another reason we didn't want to put it directly into the 0.20 branch.

-- Owen
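
A minimal sketch of the ordering argument for a version string like 0.20.100, assuming versions are compared component by component as integers. The helper name parse_version is illustrative only and is not part of any Hadoop tooling.

```python
def parse_version(text):
    """Split a dotted version string such as '0.20.100' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


versions = ["0.21.0", "0.20.100", "0.20.3", "0.20.2"]

# Numeric, component-wise ordering: 0.20.100 falls after the plain
# 0.20.x bugfix releases and before 0.21.0.
ordered = sorted(versions, key=parse_version)
print(ordered)  # ['0.20.2', '0.20.3', '0.20.100', '0.21.0']

# Plain string ordering would misplace it, because '1' sorts before '2'.
print(sorted(versions))  # ['0.20.100', '0.20.2', '0.20.3', '0.21.0']
```

The gap between 3 and 100 leaves room for ordinary bugfix releases (0.20.3, 0.20.4, and so on) to keep sorting below the append release.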
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Stack 2010-12-23, 17:27
On Thu, Dec 23, 2010 at 12:00 AM, Owen O'Malley <[EMAIL PROTECTED]> wrote:
> If I remember right, there were also protocol changes in the append branch, which was another reason we didn't want to put it directly into the 0.20 branch.

That is indeed the case Owen.

St.Ack
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? M. C. Srivas 2010-12-23, 17:33
[ Sorry if this is belaboring the obvious ]

There are two append solutions floating around, and they are incompatible with each other. Thus, the two "branches" will forever remain incompatible with each other, regardless of how they are numbered (0.22, 0.23, 0.20.3, etc.)

Unless both are merged into one branch, and a switch provided to "use HDFS-200 append" or "use 0.22 append", we have effectively split Hadoop into two.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? Jakob Homan 2010-12-23, 17:38
It's difficult to support this proposal knowing how much time would be spent preparing an official release, continuing to support it and continuing to support two separate implementations of append. I believe that effort would be better spent getting out a kick-ass 22 (or, barring that, a *really* kick-ass 23).

The Promised Land that we say we're all trying to get to is regular, timely, feature-complete, tested, innovative but stable releases of new versions of Apache Hadoop. Missing any one of those criteria will continue (and has continued) the current situation where quasi-official branches and outside distributions fill the void such a release should. The effort to maintain this official branch and fix the bugs that will be discovered could be better spent moving us closer to that goal.

I'm certainly sympathetic to the difficult position our quagmire has placed HBase into. However, the current proposal would hurt HDFS to help HBase. The best solution for that project, as well as for HDFS, is to get HDFS back to a healthy release cycle; not prolong or codify the current ad-hoc state of affairs. Let's stop digging this hole.

-jakob
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch? M. C. Srivas 2010-12-23, 18:15
On Thu, Dec 23, 2010 at 9:38 AM, Jakob Homan <[EMAIL PROTECTED]> wrote:
> It's difficult to support this proposal knowing how much time would be spent preparing an official release, continuing to support it and continuing to support two separate implementations of append. I believe that effort would be better spent getting out a kick-ass 22 (or, barring that, a *really* kick-ass 23).

Regardless, there will still be 2 incompatible "branches". And that is only the beginning. Some future features will be done only on branch 1 (since company 1 uses that), and other features on branch 2 (by company 2, since they prefer branch 2), thereby further separating the two branches.

If the goal is to avoid the split, then there are only 2 choices:
(a) merge both
(b) abandon one or the other.

Which one is one willing to stomach?
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Ryan Rawson 2010-12-23, 19:40
The append solution in 0.22 that you are referring to was supposed to be out 13-15 months ago. Pardon if I look for solutions that deployed 4 months ago (as the 0.20 append branch did).

Another 12-15 months of delay is not exactly helping HDFS either.

-ryan

On Thu, Dec 23, 2010 at 9:38 AM, Jakob Homan <[EMAIL PROTECTED]> wrote:
> It's difficult to support this proposal knowing how much time would be
> spent preparing an official release, continuing to support it and
> continuing to two support two separate implementations of append. I
> believe that effort would be better spent getting out a kick-ass 22
> (or, barring that, a *really* kick-ass 23).
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Todd Lipcon 2010-12-23, 20:00
On Thu, Dec 23, 2010 at 10:15 AM, M. C. Srivas <[EMAIL PROTECTED]> wrote:
> Regardless, there will still be 2 incompatible "branches". And that is only
> the beginning.
>
> Some future features will be done only on branch 1 (since company 1 uses
> that), and other features on branch 2 (by company 2, since they prefer
> branch 2), thereby further separating the two branches.
>
> If the goal is to avoid the split, then there are only 2 choices:
> (a) merge both
> (b) abandon one or the other.

The 0.20 append solution has never been seen as a fork. It's a stop-gap fixup of the 0.20 append feature, but we don't intend to forward-port that append implementation into trunk. From an API perspective it's very close to the 0.22 version, and I think everyone fully intends to abandon the 0.20-append work once 0.22 append has been heavily tested for HBase workloads.

> > The Promised Land that we say we're all trying to get to is regular,
> > timely, feature-complete, tested, innovative but stable releases of
> > new versions of Apache Hadoop. Missing out any one of those criteria
> > discovered will continue (and has continued) the current situation
> > where quasi-official branches and outside distributions fill the void
> > such a release should. The effort to maintain this offical branch and
> > fix the bugs that will be discovered could be better spent moving us
> > closer to that goal.

+1. Interestingly, the work on 0.20-append uncovered a number of bugs that also will apply to 0.22's implementation. So it wasn't all a wasted effort ;-)

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Konstantin Shvachko 2010-12-23, 22:18
I also think building 0.20-append will be a major distraction from moving 0.22 forward with all the great new features, including the new append implementation, sitting on the bench because we are delaying the release. It seems to be beneficial for the entire community to focus on 0.22 rather than chasing both birds.

I hear a concern that 0.22 will lack large scale testing as was the case with 0.21. I'd like to volunteer to put as many large scale resources as I can grasp into stabilizing 0.22. Under Nigel's management of course. This should get us to production quality in 3-6 months rather than "another 12-15". I also hope it can go even faster/better if others could join the effort. I see > 100 companies claiming they are powered by Apache Hadoop.

I also hope with this effort HBase will be able to start moving to the new append implementation in the next 2-3 months, which in turn will help 0.22 HDFS rather than divert resources from it as it would have been with 0.20-append.

Stack, will this plan work for HBase survival?

One other thought. The Apache Hadoop community is not in control of external releases and distributions, but we should not fork our own releases by introducing competing APIs. If we can keep the dev line relatively straight the external releases will follow.

Thanks,
--Konstantin

On Thu, Dec 23, 2010 at 11:40 AM, Ryan Rawson <[EMAIL PROTECTED]> wrote:
> The append solution in 0.22 that you are referring to was supposed to
> be out 13-15 months ago. Pardon if I look for solutions that deploy 4
> months ago (as the 0.20 append branch did).
>
> Another 12-15 months of delay is not exactly helping HDFS either.
>
> -ryan
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Ryan Rawson 2010-12-23, 22:39
How does Stack volunteering his time to release an existing branch divert resources?

Without an ASF release of 0.20-append I will keep having to recommend an external vendor's release of Hadoop.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Ian Holsman 2010-12-23, 23:33
The release is one issue, but ongoing maintenance of it is another, which is the point Roy raised.

It's a concern if we have a security issue, and who will patch it (and test it) going forward.

---
Ian Holsman - 703 879-3128

I saw the angel in the marble and carved until I set him free -- Michelangelo
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Todd Lipcon 2010-12-23, 23:52
On Thu, Dec 23, 2010 at 3:33 PM, Ian Holsman <[EMAIL PROTECTED]> wrote:
> The release is one issue, but ongoing maintenance of it is another, which
> is the point roy raised.
>
> It's a concern if we have a security issue, and who will patch it (and test
> it) going forward.

The nice thing is that Hadoop 0.20.x (without security patches) has no guarantees as to security. So, I can't imagine any security issue that could possibly exist that would be worth addressing. It doesn't run as root so root escalation is only possible with a JVM bug, and it's trivially possible to read anyone's data since 0.20 has no strong authentication.

Thanks
-Todd

Todd Lipcon
Software Engineer, Cloudera
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Andrew Purtell 2010-12-24, 00:36
I hope that 22 will be an answer. I think I would be more comfortable with that answer if Hadoop Core were not so obviously internally conflicted and sclerotic. Potential HBase/Hadoop adopters have confidence in 20, seeing the production deployments of it. 21 was, to all indications I have seen, a dud. There is no reasonable basis as of yet to presume 22 will be "kick ass".

I, at least, was hoping that promoting 0.20-append from its de-facto status to something official could be a fig leaf for HBase while Hadoop Core gets its house in order.

Best regards,

- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

--- On Thu, 12/23/10, Ryan Rawson <[EMAIL PROTECTED]> wrote:
> How does stack volunteering his time to release an existing branch
> divert resources?
>
> Without an ASF release of 0.20-append I will keep having to
> recommend an external vendor's release of Hadoop.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Jeff Hammerbacher 2010-12-24, 01:32
After reading through the reasoning on both sides of this issue, I agree with Ian, Konstantin, and Jakob. Nigel has already volunteered to run the 0.22 release process; let's put our energy there.

Stack, the energy you would have put into the 0.20-append release could help ensure the 0.22 release makes it out in short order. That way HBase will be able to take advantage of both append (don't lose data) and security (don't give it away), and we won't derail the Hadoop Core release process, which has actually been regaining some momentum over the past several months: we got 0.21 out the door! We have a release manager for 0.22!

As Roy points out, the Apache Hadoop release train has already passed 0.20; for those that require a 0.20-based HDFS with append, there are multiple places in the open source world to retrieve such bits, including the 0.20-append branch of the HDFS project. If the HBase community requires an ASF project to release such an artifact, as Roy points out, it can certainly be done as a new project separate from HDFS.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Konstantin Boudnik 2010-12-24, 05:17
On Thu, Dec 23, 2010 at 14:18, Konstantin Shvachko <[EMAIL PROTECTED]> wrote:
> I also think building 0.20-append will be a major distraction from moving
> 0.22 forward with all the great new features, including the new append
> implementation, sitting on the bench because we are delaying the release.
> It seems to be beneficial for the entire community to focus on 0.22 rather
> than chasing both birds.
>
> I hear a concern that 0.22 will lack large scale testing as was the case
> with 0.21.
> I'd like to volunteer to put as many large scale resources, as I can grasp,
> into stabilizing of 0.22. Under Nigel's management of course.
> This should get us to production quality in 3-6 months rather than
> "another 12-15". I also hope it can go even faster/better if others
> could join the effort. I see > 100 companies claiming they are powered by
> Apache Hadoop.

On a similar note, I'd like to emphasize that a significant part of my time is going to be devoted to building system & scale testing infrastructure which would be usable out of the box by any of those 100+ companies if they are willing to put any effort into testing of 0.22.

Cos
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Stack 2010-12-24, 07:52
The intent of the proposed release off the branch-0.20-append was never to derail, “hurt”, or distract from the Hadoop 0.22 effort. The HBase crew are up for helping out with testing and debugging, and the intent is to run atop the 0.22 version of append as well as 0.20’s append. A release off the branch-0.20-append branch was more about a ‘stop-gap’ (see Todd’s explication above), or a ‘fig-leaf’ as Andrew describes it, while 0.22 is stabilizing.

Suggestions that projects like HBase hibernate until 0.22 don’t help (see Ryan’s comments for a sense of why). We can just keep on with what we’ve been doing up to this if the feeling is that an append release could somehow jeopardize the 0.22 effort. It’s kinda hokey having to point users at some random-looking branch [1] and tell them to build their own, but thankfully this is not their only option.

I’ve enjoyed the healthy back and forth,
St.Ack

1. To be clear, the 'random looking branch' is the fruit of a bunch of hard work by Facebookers and Clouderians.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Chris Douglas 2010-12-24, 18:57
Does anything go wrong if HBase were to release the 0.20-append branch as its own product? Even if it were short-lived, it sounds like that would give HBase users a working append, the HBase project could decide when to retire that work (and support it concurrently with post-0.20 append), and it sidesteps the versioning issue. -C

On Thu, Dec 23, 2010 at 11:52 PM, Stack <[EMAIL PROTECTED]> wrote:
> The intent of the proposed release off the branch-0.20-append was
> never to derail, “hurt”, or distract from the Hadoop 0.22 effort. The
> HBase crew are up for helping out testing and debugging and the intent
> is to run atop the 0.22 version of append as well as 0.20’s append. A
> release off the branch-0.20-append branch was more about a ‘stop-gap’,
> see Todd’s explication above, or a ‘fig-leaf’ as Andrew describes it
> while 0.22 is stabilizing.
>
> Suggestions that projects like HBase hibernate until 0.22 don’t help
> (See Ryan’s comments for a sense of why). We can just keep on with
> what we’ve been doing up to this if the feeling is that an append
> release could somehow jeopardize the 0.22 effort. Its kinda hokey
> having to point users at some random looking branch [1] telling them
> build their own but thankfully this is not their only option.
>
> I’ve enjoyed the healthy back and forth,
>
> St.Ack
> 1. To be clear, 'random looking branch' is fruit of a bunch of
> hardwork by Facebookers and Clouderians.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Stack 2010-12-24, 19:06
On Fri, Dec 24, 2010 at 10:57 AM, Chris Douglas <[EMAIL PROTECTED]> wrote:
> Does anything go wrong if HBase were to release the 0.20-append branch
> as its own product?

This is an interesting notion. We'd host it at hbase.apache.org alongside our download? Would that be OK with others?
St.Ack
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Jeff Hammerbacher 2010-12-24, 19:28
That sounds like a reasonable solution to me: the HBase team bears the burden of cutting and maintaining the release, while Hadoop Core can proceed with 0.22. HBase had its own version of ZooKeeper in there for a while, if I recall correctly, so it's not without precedent. No funky version numbers have to be floating around Hadoop-land, and hopefully HBase can move back to HDFS when 0.22 is released. It's not ideal, but potentially the best solution given the current constraints.

On Fri, Dec 24, 2010 at 11:06 AM, Stack <[EMAIL PROTECTED]> wrote:
> On Fri, Dec 24, 2010 at 10:57 AM, Chris Douglas <[EMAIL PROTECTED]> wrote:
> > Does anything go wrong if HBase were to release the 0.20-append branch
> > as its own product?
> >
>
> This is an interesting notion. We'd host it at hbase.apache.org
> alongside our download? Would that be OK with others?
> St.Ack
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Chris Douglas 2010-12-24, 19:31
Calling it something other than Hadoop would avoid confusing users (and HBase could then release bug fixes, etc. on its own schedule), but from how it's been described: this is acknowledging the reality of the situation, not proposing something radical.

HBase can be backed by the HBase FooFS and HDFS. If the former can be retired as a legacy platform that'd be ideal, but Hadoop will have to earn it. -C

On Fri, Dec 24, 2010 at 11:06 AM, Stack <[EMAIL PROTECTED]> wrote:
> On Fri, Dec 24, 2010 at 10:57 AM, Chris Douglas <[EMAIL PROTECTED]> wrote:
>> Does anything go wrong if HBase were to release the 0.20-append branch
>> as its own product?
>>
>
> This is an interesting notion. We'd host it at hbase.apache.org
> alongside our download? Would that be OK with others?
> St.Ack
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Arun C Murthy 2010-12-25, 06:41
I have a feeling we are thinking too much here.

The reality is that the Hadoop community was in favor of porting append (fixes?) back to a branch based off hadoop-0.20 a while ago (Dhruba proposed the branch). I see no reason we can't release it now, under a reasonable release name.

As Stack and Ryan have pointed out, we have severely hampered HBase so far. It behooves us to facilitate users of the entire stack to easily access Apache releases. Plus, Stack and co. are volunteering their own time (thanks!).

Arun

Sent from my iPhone

On Dec 24, 2010, at 11:33 AM, "Chris Douglas" <[EMAIL PROTECTED]> wrote:
> Calling it something other than Hadoop would avoid confusing users
> (and HBase could then release bug fixes, etc. on its own schedule),
> but from how it's been described: this is acknowledging the reality of
> the situation, not proposing something radical.
>
> HBase can be backed by the HBase FooFS and HDFS. If the former can be
> retired as a legacy platform that'd be ideal, but Hadoop will have to
> earn it. -C
>
> On Fri, Dec 24, 2010 at 11:06 AM, Stack <[EMAIL PROTECTED]> wrote:
>> On Fri, Dec 24, 2010 at 10:57 AM, Chris Douglas <[EMAIL PROTECTED]> wrote:
>>> Does anything go wrong if HBase were to release the 0.20-append branch
>>> as its own product?
>>>
>>
>> This is an interesting notion. We'd host it at hbase.apache.org
>> alongside our download? Would that be OK with others?
>> St.Ack
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Stack 2010-12-27, 17:12
On Fri, Dec 24, 2010 at 11:31 AM, Chris Douglas <[EMAIL PROTECTED]> wrote:
> Calling it something other than Hadoop would avoid confusing users
> (and HBase could then release bug fixes, etc. on its own schedule),
> but from how it's been described: this is acknowledging the reality of
> the situation, not proposing something radical.

Calling it other than Hadoop would only confuse the situation even more; "Trust all your data to fooFS!". It'd also reek of an HDFS 'fork' (HBase is not yet up for taking on such a burden).

I liked my original reading of your suggestion, Chris -- even if it was perhaps not what you intended -- where HBase would host hadoop-0.20-append. That's not on?

St.Ack

P.S. Arun, I agree.
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Chris Douglas 2010-12-27, 19:20
> On Fri, Dec 24, 2010 at 11:31 AM, Chris Douglas <[EMAIL PROTECTED]> wrote:
> Calling it other than Hadoop would only confuse the situation even
> more; "Trust all your data to fooFS!". It'd also reeks of HDFS 'fork'
> (HBase is not yet up for taking on such a burden).

Unless I'm missing something, it is a fork. It's a temporary, friendly fork, but it's what the HBase project has been using and supporting for months. It hasn't had a label assigned to it, but it's a product (a feature with a mostly-shared implementation across other forks, at any rate).

> I liked my original reading of your suggestion Chris -- even if it was
> perhaps not what you intended -- where HBase would host
> hadoop-0.20-append. Thats not on?

Your original reading was what I intended. The obstacle to releasing a variant of Hadoop from the HBase project is the name. I'd be surprised if TLPs were permitted to release under another project's name, even if the other endorsed it. If that assumption is not a real constraint, then I agree that there's no point in calling it something else. -C
-
Re: DISCUSSION: Cut a hadoop-0.20.0-append release from the tip of branch-0.20-append branch?Stack 2010-12-29, 06:32
Chris:

What you say seems sensible enough but it's not what I want (smile). My sense is that for HBase, unless the package we host at hbase.apache.org is called hadoop-0.20.0-append -- so it's clear it's an untainted bundle made from the tip of the append branch -- then we're only going to confuse; we'll be spending our time quelling queries about the "hbase version" of hadoop.

I'm going to pass on trying to offer an append release bundle off the branch-0.20-append branch. For the next (imminent) HBase release, we'll just keep on telling folks to build their own hadoop from the append branch or go get CDH3. (The HBase release after that will be about getting us up on hadoop 0.22.)

Thanks all,
St.Ack

On Mon, Dec 27, 2010 at 11:20 AM, Chris Douglas <[EMAIL PROTECTED]> wrote:
>> On Fri, Dec 24, 2010 at 11:31 AM, Chris Douglas <[EMAIL PROTECTED]> wrote:
>> Calling it other than Hadoop would only confuse the situation even
>> more; "Trust all your data to fooFS!". It'd also reeks of HDFS 'fork'
>> (HBase is not yet up for taking on such a burden).
>
> Unless I'm missing something, it is a fork. It's a temporary, friendly
> fork, but it's what the HBase project has been using and supporting
> for months. It hasn't had a label assigned to it, but it's a product
> (a feature with a mostly-shared implementation across other forks, at
> any rate).
>
>> I liked my original reading of your suggestion Chris -- even if it was
>> perhaps not what you intended -- where HBase would host
>> hadoop-0.20-append. Thats not on?
>
> Your original reading was what I intended. The obstacle to releasing a
> variant of Hadoop from the HBase project is the name. I'd be surprised
> if TLPs were permitted to release under another project's name, even
> if the other endorsed it. If that assumption is not a real constraint,
> then I agree that there's no point in calling it something else. -C