-
[DISCUSSION] Thinking about 20.204 and beyond
Eric Baldeschwieler 2011-06-18, 06:50

Hi Folks,

Along with starting a new release off the mainline (see previous mail), the Yahoo! team plans to continue producing sustaining releases off the Hadoop with security branch, such as 0.20.203. I'm writing this email to outline our plans, explain Yahoo's motivation for supporting this work, and request feedback and, hopefully, your endorsement.

This initiative stems from Yahoo's commitment to do its Hadoop work in Apache and discontinue the Yahoo Distribution of Hadoop (http://yhoo.it/i9Ww8W).

We hope to produce a new 0.20.204 release in Apache in the next few weeks. Owen O'Malley is planning to act as release master for this release. It will be based on work in the hadoop-with-security branch, just as 0.20.203 was, but will include bug fixes and enhancements beyond those in 0.20.203. This is one in a series of releases we hope to do in the next 6-9 months as Hadoop 0.23 (or whatever the community chooses to call it) goes through the various stages of stability testing and burn-in.

CONTENTS OF THE RELEASE:

Some highlights:

- RPM & .deb packaging to ease deployment (back ported from trunk)
  - I am excited to see Hadoop released with .deb & RPM packaging from Apache for the first time.
  - This will greatly ease deployment
- Disk fail in place (merged with trunk, except for some MR changes that conflict with MR-279; these will be reimplemented in MR-279)
  - This change has been motivated by operational problems we observed with our new 12 disk machines.
  - This work should greatly improve Hadoop availability by keeping nodes working when one of their disks fails
- Lots of additional fixes (I've included the change log below)

WHY THIS PROCESS:

Producing a stable release of Hadoop is a long, hard and expensive process. Historically Y! has produced all such releases. Other releases of Hadoop have either not been stable (Hadoop 0.19 and Hadoop 0.21) or have been based on a stable Apache release driven by Yahoo (CDH and Facebook).

Once we've paid the price of making a stable release, it makes a lot of sense to accept safe improvements as well as bug fixes. Doing so allows one to get customer-impacting improvements into production in days, rather than years, which is what would happen if one waited for changes to come in the next stable release off the Hadoop mainline. Given that it takes many months to stabilize trunk, there is no way to get new easy fixes into users' hands quickly via a new mainline release.

For the last few years Yahoo has done sustaining engineering in open source via GitHub. These patches have been contributed to Apache Hadoop mainline and backported to the sustaining branch on GitHub (for Yahoo 0.20, for example). We've then cut Yahoo releases from GitHub. Cloudera and Facebook have also taken these patches from GitHub and incorporated these improvements into their releases, so the community has benefitted from this process for years.

What we are planning to do now is simply move this process into Apache, so that Apache releases themselves are timely and relevant, not always a year or two behind what users need.

How do I propose making these decisions?

Deciding what is a safe patch is a judgement call. Apache process suggests that the release manager makes these calls (http://bit.ly/mJcBjc). For releases Y! champions, such as 0.20 (Arun & Owen), we are ready to do the sustaining engineering, make these calls and put our reputation behind the quality of the result. Other release masters are currently championing other Apache Hadoop releases (Nigel and Tom) and I think they should be free to do the same.

For Hadoop 20.204, I propose pushing what is currently in the hadoop-with-security branch. Part of the reason for this thread is to socialize this process, so that community members can champion stable patches for inclusion in 20.205.

In the future I propose that a branch's release master request suggestions for future releases on this list, but be free to use their judgement on what is accepted (pretty much what Nigel is doing on 0.22 today).

Conclusion:

The vote on 0.20.203 was acrimonious, but I believe that 0.20.203 was a useful step forward for Apache Hadoop. 0.20.204 will again be the best stable release of Apache Hadoop ever. I hope folks can support the effort. With your contribution 0.20.205 can be even better, fixing issues that plague your Hadoop clusters.

This email is part of a wider effort from the Yahoo team to co-plan our work with the community.

Thanks,
Eric14

eric14 a.k.a. Eric Baldeschwieler
VP Hadoop Software Development @Yahoo!

============================
From CHANGES.txt:

Release 0.20.204.0 - unreleased

  NEW FEATURES

    HADOOP-6255. Create RPM and Debian packages for common. Changes deployment layout to be consistent across the binary tgz, rpm, and deb. Adds setup scripts for easy one node cluster configuration and user creation. (Eric Yang via omalley)

  BUG FIXES

    MAPREDUCE-2495. exit() the TaskTracker when the distributed cache cleanup thread dies. (Robert Joseph Evans via cdouglas)

    HDFS-1878. TestHDFSServerPorts unit test failure - race condition in FSNamesystem.close() causes NullPointerException without serious consequence. (mattf)

    MAPREDUCE-2452. Moves the cancellation of delegation tokens to a separate thread. (ddas)

    MAPREDUCE-2555. Avoid sprious logging from completedtasks. (Thomas Graves via cdouglas)

    MAPREDUCE-2451. Log the details from health check script at the JobTracker. (Thomas Graves via cdouglas)

    MAPREDUCE-2535. Fix NPE in JobClient caused by retirement. (Robert Joseph Evans via cdouglas)

    MAPREDUCE-2456. Log the reduce taskID and associated TaskTrackers with failed fetch notifications in the JobTracker log. (Jeffrey Naisbitt via cdouglas)

    HDFS-2044. TestQueueProcessingStatistics failing automatic test due to timing issues. (mattf)

    HADOOP-7248. Update eclipse target to generate .classpath from ivy config. (Thomas Gr
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Roy T. Fielding 2011-06-18, 20:22

On Jun 17, 2011, at 11:50 PM, Eric Baldeschwieler wrote:
> Along with starting a new release off the mainline (see previous mail), the Yahoo! team plans to continue producing sustaining releases off the Hadoop with security branch, such as 0.20.203. I'm writing this email to outline our plans, explain Yahoo's motivation for supporting this work and request feedback and hopefully your endorsement. This initiative stems from Yahoo's commitment to do its hadoop work in Apache and discontinue the Yahoo Distribution of Hadoop (http://yhoo.it/i9Ww8W).

Eric,

Please make an effort to understand that we act at Apache as individuals. That includes you and your team. Yes, everyone greatly appreciates the fact that Yahoo! pays your salaries. Nevertheless, here you are called an individual named Eric.

Companies do not have motivations. You do. There is nothing wrong with each and every individual here being motivated, at least partly, by what their boss(es) consider the right path forward. But you don't have to sell it as something Yahoo! is doing, and you don't need to send marketing messaging to the development lists at Apache. Just be yourself and represent yourself.

Anyone on the PMC can be an RM. A package can be rolled at any time, and a release vote on that package can be called at any time. While it makes sense for the individual planning to RM a release to announce their plans, doing so does not in any way form the direction the product is headed, nor does it exclude anyone else from being RM of their own release process.

There is absolutely no reason that trunk cannot be packaged for release tomorrow as 0.23. There may be many reasons why it won't pass a release vote, but we probably aren't going to find them until somebody tries.

....Roy
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Eric Baldeschwieler 2011-06-18, 22:55

Hi Roy,

Thanks for the clarification. I will endeavor to keep marketing to a minimum.

I always try to keep my actions as a manager and project contributor consistent with our community norms. When one is speaking as the manager of a team of contributors, pronouns get complicated.

I've gotten lots of feedback that folks would like a better understanding of what I plan to ask team members to invest in, and lots of questions about why we don't do various things others would like to see done. My goal is simply to be transparent. That should allow all of us to make better decisions as a community.

Given the number of questions I have gotten on why "yahoo" chooses to do sustaining, I think the background is needed for an informed debate. I expect Yahoo team members will continue to make branches and call release votes. I hope that sharing my thinking, intentions and motivations up front reduces conflict.

---
E14 - typing on glass
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Steve Loughran 2011-06-21, 11:39

On 18/06/2011 21:22, Roy T. Fielding wrote:
> There is absolutely no reason that trunk cannot be packaged for release
> tomorrow as 0.23. There may be many reasons why it won't pass a release
> vote, but we probably aren't going to find them until somebody tries.

One limitation with releases has always been the size of cluster testing, where Yahoo!'s contributions have been invaluable. That said, we shouldn't make them an SPOF in the release process; we should all set up to do some more release testing.

Looking round my house I see that I have 1 Linux desktop, 4 Linux laptops (all Ubuntu 10.04), 2 OS X machines and an OLPC, plus three WebOS phones that can take a JVM, all connected via a Netgear N600 router (ignoring CentOS virtual machine images). If I can actually bring up a heterogeneous cluster here then it would be a contribution, and I'm sure others have equal messes of home and work systems that could be used to test bad system configurations. If successful we can add a Netgear 802.11 {a, b, g, n} router to the list of switches known to work once you get DNS right.

What would be good here is more documentation on how the beta testers can generate realistic loads to stress our clusters, using the stress test tools that are already in the codebase. I guess I may start on that if/when I start playing with the tools.

-Steve
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Roy T. Fielding 2011-06-21, 18:40

On Jun 21, 2011, at 4:39 AM, Steve Loughran wrote:
> On 18/06/2011 21:22, Roy T. Fielding wrote:
>> There is absolutely no reason that trunk cannot be packaged for release
>> tomorrow as 0.23. There may be many reasons why it won't pass a release
>> vote, but we probably aren't going to find them until somebody tries.
>
> One limitation with releases has always been the size of cluster testing, where Yahoo!'s contributions have been invaluable. That said, we shouldn't make them an SPOF in the release process; we should all set up to do some more release testing.

Yes, more testing is better, but if it can't be tested by the dev team in 72 hours then it doesn't belong in our release process.

Please note that one of the main advantages of open source development is that the bulk of testing/QA occurs *after* the release. That's why labels like alpha/beta/GA are best applied/updated after the version number has been cut and the software has been proven in real deployments.

If testing on 5000 nodes is important to our customers, then add a scale-tested metric to the download site so that the customers know which release package has been tested at what scale -- they will understand the difference between frequent releases and those fully tested at scale. Let them decide which version is best to use for their own needs.

....Roy
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Steve Loughran 2011-06-22, 09:30

On 21/06/2011 19:40, Roy T. Fielding wrote:
> On Jun 21, 2011, at 4:39 AM, Steve Loughran wrote:
>
>> On 18/06/2011 21:22, Roy T. Fielding wrote:
>>> There is absolutely no reason that trunk cannot be packaged for release
>>> tomorrow as 0.23. There may be many reasons why it won't pass a release
>>> vote, but we probably aren't going to find them until somebody tries.
>>
>> One limitation with releases has always been the size of cluster testing, where Yahoo!'s contributions have been invaluable. That said, we shouldn't make them an SPOF in the release process; we should all set up to do some more release testing.
>
> Yes, more testing is better, but if it can't be tested by the dev team
> in 72 hours then it doesn't belong in our release process.
>
> Please note that one of the main advantages of open source development
> is that the bulk of testing/QA occurs *after* the release. That's why
> labels like alpha/beta/GA are best applied/updated after the version number
> has been cut and the software has been proven in real deployments.
> If testing on 5000 nodes is important to our customers, then add a
> scale-tested metric to the download site so that the customers know
> which release package has been tested at what scale -- they will
> understand the difference between frequent releases and those fully tested
> at scale. Let them decide which version is best to use for their own needs.

I agree with in-field testing; the big issue there is that people who do have 1+PB of data are nervous about Hadoop upgrades and JVM upgrades. I haven't even heard of anyone who owns up to moving to ext4 fs underneath. It's the cost of loss of data that raises concerns, not the loss of time in testing and rolling back [1]. And very few people have 500+ node clusters sitting around idle.

I like the point about mixed endian; SPARC may be dead but there are other architectures out there, and ARM looks up and coming. Then there are bits of the config space, like a block replication factor of 2, that can be tested at small scale.

-steve

[1] Steve reverted his work desktop from RHEL6 to Ubuntu 10.04 last week and, after discussion with his 9-year-old son, is going to switch back to Firefox 3.6 as it does better Flash games.
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Allen Wittenauer 2011-06-22, 16:27

On Jun 22, 2011, at 2:30 AM, Steve Loughran wrote:
> I haven't even heard of anyone who owns up to moving to ext4 fs underneath.

Yes you do.

:D
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Steve Loughran 2011-06-23, 12:47

On 22/06/2011 17:27, Allen Wittenauer wrote:
> On Jun 22, 2011, at 2:30 AM, Steve Loughran wrote:
>> I haven't even heard of anyone who owns up to moving to ext4 fs underneath.
>
> Yes you do.
>
> :D

Did it work? And RHEL6.0?
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Allen Wittenauer 2011-06-23, 18:23

On Jun 23, 2011, at 5:47 AM, Steve Loughran wrote:
> On 22/06/2011 17:27, Allen Wittenauer wrote:
>> On Jun 22, 2011, at 2:30 AM, Steve Loughran wrote:
>>> I haven't even heard of anyone who owns up to moving to ext4 fs underneath.
>>
>> Yes you do.
>>
>> :D
>
> Did it work?

We've been using ext4 for HDFS and MR spill space (the rest are ext3) on CentOS 5.5 in some form or another for almost a year now. No issues to report, other than it isn't ZFS (so we lost some functionality that we greatly miss).

> and RHEL6.0?

At this point, we're waiting for 6.1 or 6.2 before changing our Linux version. RH (historically) has too much shift between 0->1->2 for it to be considered stable until the .2 release. But we might jump on .1 anyway.
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Eric Baldeschwieler 2011-06-23, 06:21

Yup. Frequent releases, some focused on getting to stable on older code lines, some pushing new code out for people to try.

On Jun 21, 2011, at 11:40 AM, Roy T. Fielding wrote:
> On Jun 21, 2011, at 4:39 AM, Steve Loughran wrote:
>
>> On 18/06/2011 21:22, Roy T. Fielding wrote:
>>> There is absolutely no reason that trunk cannot be packaged for release
>>> tomorrow as 0.23. There may be many reasons why it won't pass a release
>>> vote, but we probably aren't going to find them until somebody tries.
>>
>> One limitation with releases has always been the size of cluster testing, where Yahoo!'s contributions have been invaluable. That said, we shouldn't make them an SPOF in the release process; we should all set up to do some more release testing.
>
> Yes, more testing is better, but if it can't be tested by the dev team
> in 72 hours then it doesn't belong in our release process.
>
> Please note that one of the main advantages of open source development
> is that the bulk of testing/QA occurs *after* the release. That's why
> labels like alpha/beta/GA are best applied/updated after the version number
> has been cut and the software has been proven in real deployments.
> If testing on 5000 nodes is important to our customers, then add a
> scale-tested metric to the download site so that the customers know
> which release package has been tested at what scale -- they will
> understand the difference between frequent releases and those fully tested
> at scale. Let them decide which version is best to use for their own needs.
>
> ....Roy
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Allen Wittenauer 2011-06-21, 18:43

On Jun 21, 2011, at 4:39 AM, Steve Loughran wrote:
> If I can actually bring up a heterogeneous cluster here

I believe there was a post in one of the mailing lists in the past 6 months where someone tried a mixed endian grid. It blew up big time.
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Ted Dunning 2011-06-21, 19:04

For the pain in doing this, it is probably better to just drop $10 and bring up a nice EC2 cluster with 10 m1.large instances using spot pricing for 5 hours.

On Tue, Jun 21, 2011 at 11:43 AM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> On Jun 21, 2011, at 4:39 AM, Steve Loughran wrote:
> > If I can actually bring up a heterogeneous cluster here
>
> I believe there was a post in one of the mailing lists in the past 6
> months where someone tried a mixed endian grid. It blew up big time.
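Ted's figures imply a per-instance hourly price, which a few lines of Python make explicit. The function names are made up for this sketch, and the resulting rate is only what the $10 estimate implies, not a quoted Amazon spot price.

```python
def implied_hourly_rate(total_cost: float, instances: int, hours: float) -> float:
    """Price per instance-hour implied by a total budget."""
    if instances <= 0 or hours <= 0:
        raise ValueError("instances and hours must be positive")
    return total_cost / (instances * hours)


def cluster_cost(hourly_rate: float, instances: int, hours: float) -> float:
    """Total cost of running `instances` machines for `hours` at a flat rate."""
    return hourly_rate * instances * hours


# Figures from the message above: $10, 10 instances, 5 hours.
rate = implied_hourly_rate(10.0, 10, 5)
print(f"implied rate: ${rate:.2f} per instance-hour")

# The same rate applied to a longer, hypothetical 24-hour soak test.
print(f"24-hour run: ${cluster_cost(rate, 10, 24):.2f}")
```

Spot prices fluctuate, so a real estimate would need the current price for the chosen instance type and region rather than this back-calculated rate.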
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Allen Wittenauer 2011-06-21, 19:09

On Jun 21, 2011, at 12:04 PM, Ted Dunning wrote:
> For the pain in doing this, it is probably better to just drop $10 and bring
> up a nice EC2 cluster with 10 m1.large instances using spot pricing for 5
> hours.

Testing on non-Intel, non-Linux is something we need to do more of. Amazon won't help us there, AFAIK.
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Marcos Ortiz 2011-06-21, 19:43

For example, on Solaris and BSD systems, right?

Regards

On 06/21/2011 02:39 PM, Allen Wittenauer wrote:
> On Jun 21, 2011, at 12:04 PM, Ted Dunning wrote:
>
>> For the pain in doing this, it is probably better to just drop $10 and bring
>> up a nice EC2 cluster with 10 m1.large instances using spot pricing for 5
>> hours.
>
> Testing on non-Intel, non-Linux is something we need to do more of. Amazon won't help us there, AFAIK.

--
Marcos Luís Ortíz Valmaseda
Software Engineer (UCI)
http://marcosluis2186.posterous.com
http://twitter.com/marcosluis2186
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Ted Dunning 2011-06-21, 19:11

Amazon can help with the non-Linux, but not with the non-Intel.

On Tue, Jun 21, 2011 at 12:09 PM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> On Jun 21, 2011, at 12:04 PM, Ted Dunning wrote:
>
> > For the pain in doing this, it is probably better to just drop $10 and bring
> > up a nice EC2 cluster with 10 m1.large instances using spot pricing for 5
> > hours.
>
> Testing on non-Intel, non-Linux is something we need to do more of.
> Amazon won't help us there, AFAIK.
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Steve Loughran 2011-06-22, 09:16

On 21/06/2011 20:04, Ted Dunning wrote:
> For the pain in doing this, it is probably better to just drop $10 and bring
> up a nice EC2 cluster with 10 m1.large instances using spot pricing for 5
> hours.

This lacks the mess of networking that is common in the outside world.
-
RE: [DISCUSSION] Thinking about 20.204 and beyond
Rottinghuis, Joep 2011-06-18, 22:47

Thanks for clarifying your position and being open with your plans going forward.

Joep
-
Re: [DISCUSSION] Thinking about 20.204 and beyond
Scott Carey 2011-06-21, 21:58

On 6/17/11 11:50 PM, "Eric Baldeschwieler" <[EMAIL PROTECTED]> wrote:
> Producing a stable release of Hadoop is a long, hard and expensive
> process. Historically Y! has produced all such releases. Other releases
> of Hadoop have either not been stable (Hadoop 0.19 and Hadoop 0.21)

You lost me there. Enough Y! chest pounding. Your contributions are large and important, but not every release not tested by Y! is destined for failure.

0.19.x eventually was stable and an improvement over 0.18.x in several critical ways (I used it in production for a year and relied on several new features). 0.19.0 was broken, append needed to be disabled, etc. But by the end, after testing in the community (by research institutions and smaller corporations) and some bug-fix releases, it became stable.