Hadoop >> mail # general >> [ANNOUNCEMENT] Yahoo focusing on Apache Hadoop, discontinuing "The Yahoo Distribution of Hadoop"

Eric Baldeschwieler 2011-02-01, 03:27
Re: [ANNOUNCEMENT] Yahoo focusing on Apache Hadoop, discontinuing "The Yahoo Distribution of Hadoop"
Excellent news! Will you also make Howl, Oozie, and Yarn Apache projects as

On Mon, Jan 31, 2011 at 7:27 PM, Eric Baldeschwieler

> Hi Folks,
> I'm pleased to announce that after some reflection, Yahoo! has decided to
> discontinue "The Yahoo Distribution of Hadoop" and focus on Apache
> Hadoop.  We plan to remove all references to a Yahoo distribution from our
> website (developer.yahoo.com/hadoop), close our github repo (
> yahoo.github.com/hadoop-common) and focus on working more closely with the
> Apache community.  Our intent is to return to helping Apache produce binary
> releases of Apache Hadoop that are so bulletproof that Yahoo and other
> production Hadoop users can run them unpatched on their clusters.
> Until Hadoop 0.20, Yahoo committers worked as release masters to produce
> binary Apache Hadoop releases that the entire community used on their
> clusters.  As the community grew, we have experimented with using the
> "Yahoo! Distribution of Hadoop" as the vehicle to share our work.
>  Unfortunately, Apache is no longer the obvious place to go for Hadoop
> releases.  The Yahoo! team wants to return to a world where anyone can
> download and directly use releases of Hadoop from Apache.  We want to
> contribute to the stabilization and testing of those releases.  We also want
> to share our regular program of sustaining engineering that backports minor
> feature enhancements into new dot releases on a regular basis, so that the
> world sees regular improvements coming from Apache every few months, not
> years.
> Recently the Apache Hadoop community has been very turbulent.  Over the
> last few months we have been developing Hadoop enhancements in our internal
> git repository while doing a complete review of our options. Our commitment
> to open sourcing our work was never in doubt (see http://yhoo.it/e8p3Dd),
> but the future of the "Yahoo distribution of Hadoop" was far from clear.
>  We've concluded that focusing on Apache Hadoop is the way forward.  We
> believe that more focus on communicating our goals to the Apache Hadoop
> community, and more willingness to compromise on how we get to those goals,
> will help us get back to making Hadoop even better.
> Unfortunately, we now have to sort out how to contribute several
> person-years' worth of work to Apache to let us unwind the Yahoo! git
> repositories.  We currently run two lines of Hadoop development, our
> sustaining program (hadoop-0.20-sustaining) and hadoop-future.
>  Hadoop-0.20-sustaining is the stable version of Hadoop we currently run on
> Yahoo's 40,000 nodes.  It contains a series of fixes and enhancements that
> are all backwards compatible with our "Hadoop 0.20 with security".  It is
> our most stable and high performance release of Hadoop ever.  We've expended
> a lot of energy finding and fixing bugs in it this year. We have initiated
> the process of contributing this work to Apache in the branch:
> hadoop/common/branches/branch-0.20-security.  We've proposed calling this
> the 20.100 release.  Once folks have had a chance to try this out and we've
> had a chance to respond to their feedback, we plan to create 20.100 release
> candidates and ask the community to vote on making them Apache releases.
> Hadoop-future is our new feature branch.  We are working on a set of new
> features for Hadoop to improve its availability, scalability and
> interoperability to make Hadoop more usable in mission critical deployments.
> You're going to see another burst of email activity from us as we work to
> get hadoop-future patches socialized, reviewed and checked in.  These bulk
> checkins are exceptional.  They are the result of us striving to be more
> transparent.  Once we've merged our hadoop-future and hadoop-0.20-sustaining
> work back into Apache, folks can expect us to return to our regular
> development cadence.  Looking forward, we plan to socialize our roadmaps
> regularly, actively synchronize our work with other active Hadoop
Alan Gates 2011-02-01, 16:06
Andrew Purtell 2011-02-01, 16:29
Todd Papaioannou 2011-02-01, 22:02
Ian Holsman 2011-02-01, 14:04
Owen OMalley 2011-02-11, 22:56
Arun C Murthy 2011-02-14, 21:34
Arun C Murthy 2011-04-07, 22:52
Todd Lipcon 2011-04-07, 23:22
Arun C Murthy 2011-04-08, 17:34
Todd Lipcon 2011-04-08, 18:08
Eric Baldeschwieler 2011-04-08, 21:20
Arun C Murthy 2011-04-08, 21:40