Hadoop, mail # general - [RESULT] Release plan for Hadoop 2.0.5


Re: [RESULT] Release plan for Hadoop 2.0.5
Roman Shaposhnik 2013-05-15, 16:50
On Wed, May 15, 2013 at 1:20 AM, Matt Foley <[EMAIL PROTECTED]> wrote:
> Additional features that everybody wants in Hadoop-2 still remain to be
> added.  We KNOW these will result in non-backward-compatible API changes.
>  So by trying to do a stability line of code now, we are terminating the
> effort to achieve backward-compatibility in Hadoop-2 APIs.
>
> Did this community really intend to vote for that?  Because that's what
> we've now got.

This was one of the crucial points for me as well (if you rewind the thread
you can see that my original vote was -1 precisely because I did NOT
want to have incompatible changes within the same "namespace" of 2.0).

After Arun proposed a different namespace of 2.1.x I changed my
vote to +1.

Now, here's the rationale: speaking from the experience of building
a community-driven distribution on top of Hadoop releases, I can
attest to how badly we (and all the downstream projects) need a
stable 2.x baseline. Perhaps a less featureful one, but one
that would parallel the stability of the 1.x code line.

The downstream projects are struggling mightily with the fact that
it is never quite safe to assume 2.0.x to be stable (to be honest
we labeled it alpha for a reason). We have to have *something*.

Of course, as you point out, the fact that a more featureful
2.1.x line now might become incompatible with a stable base
of 2.0.x is something to worry about. I've struggled with it
myself and finally accepted it as a much lesser evil.

Finally, imagine that we're successful with 2.0.x stabilization and all
of the downstream now has it as a default profile(*). I can guarantee
you that it would generate tons of additional feedback that would
be quite useful to the future stabilization of 2.1.x. Right now that
feedback is being lost.

Thanks,
Roman.

P.S. In fact we need it so badly that Cos and I are going to have a
panel discussion at HUG (@Yahoo campus) tonight on that very subject.
Everybody who feels passionate about this and would like to share
ideas/observations is extremely welcome to participate on the panel.

(*) And speaking of the default profiles, here's my favorite way of
demonstrating that downstream is hurting:
  $ cd ~/src/random-hadoop-downstream-project
  $ mvn help:all-profiles
Listing Profiles for Project: XXXX
  Profile Id: hadoop_0.20.203 (Active: true , Source: pom)
  Profile Id: hadoop_1.0 (Active: false , Source: pom)
  Profile Id: hadoop_non_secure (Active: false , Source: pom)
  Profile Id: hadoop_facebook (Active: false , Source: pom)
  Profile Id: hadoop_0.23 (Active: false , Source: pom)
  Profile Id: hadoop_yarn (Active: false , Source: pom)
  Profile Id: hadoop_2.0.0 (Active: false , Source: pom)
  Profile Id: hadoop_2.0.1 (Active: false , Source: pom)
  Profile Id: hadoop_2.0.2 (Active: false , Source: pom)
  Profile Id: hadoop_2.0.3 (Active: false , Source: pom)
  Profile Id: hadoop_trunk (Active: false , Source: pom)
  Profile Id: hadoop_cdh4.1.2 (Active: false , Source: pom)

And good luck finding Maven artifacts for it!
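
For anyone who wants to put a number on that listing, here is a minimal
sketch that counts the profiles in `mvn help:all-profiles` output. It
assumes the output has exactly the line format pasted above; the function
name parse_profiles and the SAMPLE constant are hypothetical, made up
for this illustration and not part of Maven or any Hadoop project.

```python
import re

# Matches one "Profile Id: ..." line in the format shown in the listing above.
# Assumption: other Maven versions may format this line differently.
PROFILE_LINE = re.compile(
    r"Profile Id:\s*(?P<id>\S+)\s*"
    r"\(Active:\s*(?P<active>true|false)\s*,\s*Source:\s*(?P<source>[^)]+)\)"
)


def parse_profiles(listing):
    """Return a list of (profile_id, is_active) tuples found in the text."""
    profiles = []
    for line in listing.splitlines():
        match = PROFILE_LINE.search(line)
        if match:
            profiles.append((match.group("id"), match.group("active") == "true"))
    return profiles


# The listing from the message above, copied verbatim.
SAMPLE = """\
Listing Profiles for Project: XXXX
  Profile Id: hadoop_0.20.203 (Active: true , Source: pom)
  Profile Id: hadoop_1.0 (Active: false , Source: pom)
  Profile Id: hadoop_non_secure (Active: false , Source: pom)
  Profile Id: hadoop_facebook (Active: false , Source: pom)
  Profile Id: hadoop_0.23 (Active: false , Source: pom)
  Profile Id: hadoop_yarn (Active: false , Source: pom)
  Profile Id: hadoop_2.0.0 (Active: false , Source: pom)
  Profile Id: hadoop_2.0.1 (Active: false , Source: pom)
  Profile Id: hadoop_2.0.2 (Active: false , Source: pom)
  Profile Id: hadoop_2.0.3 (Active: false , Source: pom)
  Profile Id: hadoop_trunk (Active: false , Source: pom)
  Profile Id: hadoop_cdh4.1.2 (Active: false , Source: pom)
"""

if __name__ == "__main__":
    profiles = parse_profiles(SAMPLE)
    active = [name for name, is_active in profiles if is_active]
    print("profiles defined:", len(profiles))
    print("active by default:", ", ".join(active))
```

In a real checkout you would feed it the captured output of the mvn
command instead of SAMPLE.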