Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project
Or resurrect MR(v1) in Apache Hadoop as Apache YARN becomes a TLP, and let
the new YARN TLP decide if they want to use the Hadoop MR artifacts and/or
contribute patches that harmonize the implementation with theirs, or pursue
an alternate MR implementation within their larger framework.

I'd imagine such an MR(v1) in Hadoop, if this happened, would concentrate on
performance improvements, maybe such things as alternate shuffle plugins.
Perhaps an HA JobTracker for parity with HDFS. But we could expect a clear
separation where next-generation framework work would be continued in and
centered upon YARN, while Hadoop remains... well, Hadoop.

On Friday, August 31, 2012, Robert Evans wrote:

> The problem there is that YARN depends on Common, and MapReduce depends on
> YARN, so we would either have a circular dependency or we would have to
> split off MapReduce too.
> --Bobby
> On 8/31/12 11:54 AM, "Eli Collins" <[EMAIL PROTECTED]> wrote:
> >How about a proposal to just spin YARN off as a TLP?  Rationale:
> >
> >1. YARN started as a separate project and has a more independent
> >community than Common/HDFS/MR (per below these communities do not
> >divide at sub-project boundaries) that appears to want to be even more
> >independent.
> >
> >2. YARN is technically much easier to separate from the rest of the
> >code base (than separating Common and HDFS for example). Separating it
> >out will also help accelerate other efforts like MR2 support for
> >Apache Mesos.
> >
> >3. It sidesteps a number of thorny issues (how to handle branch-1,
> >how to handle what Hadoop is wrt enforcing trademark, how to remove
> >people from the Hadoop PMC, etc.) that haven't been addressed in any of
> >these proposals.
> >
> >4. It's a proof point - if you can't make the case for YARN then
> >there's no way we're going to make a case for splitting the other
> >projects (this thread).
> >
> >I.e., this doesn't have to be an all-or-nothing proposition for all
> >sub-projects, since the communities don't fall on sub-project
> >boundaries.
> >
> >Thanks,
> >Eli
> >
> >On Tue, Aug 28, 2012 at 7:33 PM, Mattmann, Chris A (388J)
> ><[EMAIL PROTECTED]> wrote:
> >> [decided to minimize traffic and to simply put this in one thread]
> >>
> >> Hi Guys,
> >>
> >> See the recent discussion on these threads:
> >>
> >> YARN as its own Hadoop "sub project": http://s.apache.org/WW1
> >> Maintain a single committer list for the Hadoop project: http://s.apache.org/Owx
> >>
> >> ...and just pay attention to the Hadoop project over the last 3-4 years. It's operating
> >> as a single project, that's masking separate communities that themselves are really
> >> separate ASF projects.
> >>
> >> At the ASF, this has been a problem area called "umbrella" projects and over the years,
> >> all I've seen from them is wasted bandwidth, artificial barriers and the inventions of
> >> new ways to perform process mongering and to reduce the fun in developing software
> >> at this fantastic foundation.
> >>
> >> I've talked about umbrella projects enough. We've diverted conversation enough.
> >> Enough people have tried to act like there is some technical mumbo jumbo that is
> >> preventing the eventual act of higher power that I myself hope comes should these
> >> discussions prove unfruitful through normal means.
> >>
> >> *these. are. separate. projects.*
> >>
> >> *there.are.not.blocker.issues.from.spinning.out.these.projects.as.their.own.communities*
> >>
> >> In this email: http://s.apache.org/rSm
> >>
> >> And in the 2 subsequent follow ons in that thread, I've outlined a process that I'll copy
> >> through below for splitting these projects into their own TLPs:
> >>
> >> -----snip
> >> Process:
> >>
> >> 0. [DISCUSS] thread for <TLP name> in which you talk about #1 and #2 below, potentially draft resolution too.
> >>
> >> 1. Decide on an initial set of *PMC* members. I urge each new TLP to
> >>adopt PMC==C. See reasons I've

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)