Hadoop, mail # general - [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project

Re: [DISCUSS] Spin out MR, HDFS and YARN as their own TLPs and disband Hadoop umbrella project
Mattmann, Chris A 2012-08-31, 15:09
Hi Bobby, and Andrew,

Sorry, I think both of you are still missing my point (maybe I'm wrong).
And sorry that I've failed to explain it in such a way that you guys understand;
that's as much my issue as anyone else's.

My point is this: technical issues, such as how to pull apart components
and modules, are difficult. My svn copy suggestion and, moreover, my overall
suggestion to figure out how to split up the Hadoop umbrella project had less
to do with technically pulling apart any of its software product components
than with suggesting a split in the membership of the project management
committee of the Apache Hadoop project.

The svn copy I suggested was merely to provide said new committees with code to work from (the same code base
they have now in fact). Put simply: I think you guys know a whole lot better
about how to deliver your software product to the community than I do.
So I'm not even trying to say that I know what the ins and outs of splitting MR,
YARN and HDFS entail, nor am I even trying to say "hey you HAVE to do
that part". That's the technical part.

I am saying that the current members of the Apache Software Foundation's Hadoop
Project Management Committee exhibit the characteristics (not just during
discrete events; it's been happening for a long time) of folks who in reality
shouldn't belong to the same project management committee. Note: this is
NOT a bad thing. There are probably plenty of (sub-)sets of groups at Apache
and elsewhere that folks wouldn't fit into. I've enumerated some of
those characteristics that you can sometimes see spill over
(meta discussions about moving things around, or drawing arbitrary
lines around pieces of code that really have nothing to do with technical
stuff and more to do with insulation and control), but there are also other
concerns, such as the frameworks put into place (exclusivity among others),
that are themselves pretty strong indicators that this is an umbrella project.
There are social memes *around* code that certainly
have an impact on the code, but are not the code itself.

*That* is what I am talking about. If the code splits or whatever make sense
as part of the internal navel gazing I'm suggesting regarding the *committee*
of this project, then so be it. However, I have no direct say in any of that
(nor would I expect to without having the merit in the code to have a say).

Hope that helps explain where I was coming from better.


On Aug 31, 2012, at 7:34 AM, Robert Evans wrote:

> Andrew,
> I agree with you that the DLL/CLASSPATH issue is one huge concern that
> needs to be addressed before we can really move forward with a valid
> long-term split.  There is hope on the horizon for that, though, with some of
> the OSGi work that Tom White has been doing.
> Chris,
> I completely agree with Andrew here.  There are very *REAL* technical
> issues that need to be addressed before a *CLEAN* split can happen.  We
> can make a messy one, but the ramifications are far from trivial.  If we
> simply go in blindly it will at a minimum take months to stabilize and get
> back to where we are now.  You may be OK with that, but many of us are
> not.  Simply dismissing others' concerns as invalid is not good for the
> community.  Many of us, as individuals, have a huge vested interest in
> having a stable version of Hadoop with new features in it regularly
> released.  That is why we are part of this community.  It frankly baffles
> me that "community over code" can be used to dismiss concerns about an
> issue that many of us see as something that will hurt the community.  I am
> +1 for the split, and I am +1 for doing it soon, but I am -1 on doing it
> without at least having a plan as to how we will tease apart the different
> pieces of Hadoop.
> --Bobby
> On 8/31/12 2:55 AM, "Andrew Purtell" <[EMAIL PROTECTED]> wrote:
>> The end user community might disappear, and you are ok with this? I'm
>> simply astonished. Who are these people showing up to help, document, be
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
WWW:   http://sunset.usc.edu/~mattmann/
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA