Hadoop, mail # general - [VOTE] Maintain a single committer list for the Hadoop project

Re: [VOTE] Maintain a single committer list for the Hadoop project
Mattmann, Chris A 2012-08-29, 01:37
Hi Eli,

On Aug 28, 2012, at 4:57 PM, Eli Collins wrote:

>> [...snip...]
>> Thoughts?
> I'd start a separate discussion thread or vote about moving some or
> all of the sub-projects to TLPs. IMO we should resolve this issue
> independently - there's no reason to block this decision on a possible
> future direction for the project.

I think it's a mischaracterization to suggest that Arun's proposal is a future
direction for the project. The project is already there and has been for a
while, all the time trying to identify itself as one project when in reality
it has been many. That's the cause of issues like this and all the email bandwidth.

> For example if YARN spins out as a
> TLP this issue still remains for the rest of the sub-projects, so I
> don't want to stall progress on this on the larger more complex
> discussion of whether all projects become TLPs.

Incrementally spinning out the projects is fine. Concretely, if each project (HDFS, MR, and YARN) did:

svn copy -m "MR TLP." https://svn.apache.org/repos/asf/hadoop/ https://svn.apache.org/repos/asf/<insert cool MR name>
svn copy -m "YARN TLP." https://svn.apache.org/repos/asf/hadoop/ https://svn.apache.org/repos/asf/<insert cool YARN name>
svn copy -m "HDFS TLP." https://svn.apache.org/repos/asf/hadoop/ https://svn.apache.org/repos/asf/<insert cool HDFS name>

That would probably be a fine starting point for each project, so long as the goal is to turn
each one of those destination paths into distinct entities, and to remove code duplication.
IOW, I think it's perfectly agreeable to do the above, so long as there is the intention to get
to a point that's independent -- don't use independence as the required starting point b/c
I'm not sure you guys will ever fully get there.

> And if a sub-project
> spins out as a TLP that's a great opportunity to figure out the right
> set of committers.

Agreed, you can do that now, not needed in this VOTE - which seems to be
trying to deal with something implicitly again when there is an explicit means
to deal with it.

I would suggest a nominal process for creating TLP:

0. [DISCUSS] thread for <TLP name> in which you talk about #1 and #2 below, potentially draft resolution too.

1. Decide on an initial set of *PMC* members. I urge each new TLP not to draw distinctions
between committers and PMC members -- in each and every Apache project over the years in which I've seen this done
all it does is create unnecessary drama, and all it does is put an extra chip on folks'
shoulders. Sure, VOTE them in as a committer -- they can modify the code but not have a
VOTE on adding new committers or on the bits they release. Huh? Make them wait another year, six months, before they get
that bit. Huh?

That just doesn't make sense to me. PMC==C makes people feel equal amongst their peers (which they are) --
and peers at Apache are really the people that are doing the work. To draw other distinctive
lines is artificial IMHO. This is a foundation built on trust and most of these communities that themselves
are distinct already have that trust -- it's just being walled up right now. Furthermore all the talk I've seen
in the past within Hadoop about being "worried" about having to deal with people's work in case they screw
up as a PMC member or committer, or having to track down bugs, etc., is exacerbated by the work that it
takes for you guys to banter back and forth in emails discussing committer lists, artificial project boundaries,
and in reviewing committer A's work, committer B's work (neither of whom can VOTE on the bits they release),
etc., etc. Technical issues are being used to justify community segregation.

2. Decide on a chair. Try not to VOTE for this explicitly; see if it can be discussed and consensus
can be reached (just a thought experiment). VOTE if necessary.

3. [VOTE] thread for <TLP name>

4. Paste resolution from #0 to board@.

5. Infrastructure set up.

6. TLPs proceed, collaborate, operate as distinct communities, and try to solve the code duplication/dependency
issues from there.


Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
WWW:   http://sunset.usc.edu/~mattmann/
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA