MapReduce, mail # dev - Release numbering for branch-2 releases

Re: Release numbering for branch-2 releases
Arun C Murthy 2013-01-31, 01:10
The discussions in HADOOP-9151 were related to wire-compatibility. I think we all agree that breaking API compatibility is not allowed without first deprecating the affected APIs in a prior major release - this is something we have followed since hadoop-0.1.

I agree we need to spell out what changes we can and cannot do *after* we go GA, for e.g.:
# Clearly incompatible *API* changes are *not* allowed in hadoop-2 post-GA.
# Do we allow incompatible changes on Client-Server protocols? I would say *no*.
# Do we allow incompatible changes on internal-server protocols (for e.g. NN-DN or NN-NN in HA setup or RM-NM in YARN) to ensure we support rolling-upgrades? I would like to not allow this, but I do not know how feasible this is. An option is to allow these changes between minor releases i.e. between hadoop-2.10 and hadoop-2.11.
# Do we allow changes which force an HDFS metadata upgrade in a minor upgrade i.e. hadoop-2.20 to hadoop-2.21?
# Clearly *no* incompatible changes (API/client-server/server-server) are allowed in a patch release i.e. hadoop-2.20.0 and hadoop-2.20.1 have to be compatible in all respects.
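
To make the list above concrete, here is a rough sketch of how such a policy could be written down as a table keyed by the kind of version step. Everything in it is illustrative: the names `Change`, `bump`, `ALLOWED` and `is_allowed` are hypothetical, and the table encodes only one possible outcome of this discussion. In particular, it assumes server-server changes end up allowed between minor releases (floated above only as an option), treats the still-open metadata-upgrade question as "not allowed", and assumes a major release may carry any change once deprecation has been done.

```python
from enum import Enum


class Change(Enum):
    """Kinds of incompatible change named in the list above."""
    API = "api"
    CLIENT_SERVER = "client-server protocol"
    SERVER_SERVER = "server-server protocol"
    METADATA_UPGRADE = "HDFS metadata upgrade"


def bump(old, new):
    """Classify the step between two dotted versions, e.g. '2.20.0' -> '2.20.1'."""
    o = [int(p) for p in old.split(".")]
    n = [int(p) for p in new.split(".")]
    if o[0] != n[0]:
        return "major"
    if o[1] != n[1]:
        return "minor"
    return "patch"


# Hypothetical policy table; NOT an agreed Hadoop policy.
ALLOWED = {
    "major": set(Change),             # assumption: anything, after prior deprecation
    "minor": {Change.SERVER_SERVER},  # assumption: the "option" floated above
    "patch": set(),                   # nothing incompatible in a patch release
}


def is_allowed(change, old, new):
    """Return True if this kind of incompatible change fits the sketched policy."""
    return change in ALLOWED[bump(old, new)]
```

For instance, `is_allowed(Change.API, "2.20.0", "2.20.1")` is false under this table, while `is_allowed(Change.SERVER_SERVER, "2.10", "2.11")` is true only because of the assumption made for minor releases.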

What else am I missing?

I'll make sure we update our Roadmap wiki and other docs post this discussion.


On Jan 30, 2013, at 4:21 PM, Eli Collins wrote:

> Thanks for bringing this up Arun.  One of the issues is that we
> haven't been clear about what type of compatibility breakages are
> allowed, and which are not.  For example, renaming FileSystem#open is
> incompatible, and not OK, regardless of the alpha/beta tag.  Breaking
> a server/server API is OK pre-GA but probably not post-GA, at least not
> in a point release, unless required for a security fix, etc.
> Configuration, data format, environment variable changes, etc. can all
> be similarly incompatible. The issue we had in HADOOP-9151 was someone
> claimed it is not an incompatible change because it doesn't break API
> compatibility even though it breaks wire compatibility. So let's be
> clear about the types of incompatibility we are or are not permitting.
> For example, will it be OK to merge a change before 2.2.0-beta that
> requires an HDFS metadata upgrade? Or breaks client server wire
> compatibility?  I've been assuming that changing an API annotated
> Public/Stable still requires multiple major releases (one to deprecate
> and one to remove), does the alpha label change that? To some people
> the "alpha", "beta" label implies instability in terms of
> quality/features, while to others it means unstable APIs (and to some
> both) so it would be good to spell that out. In short, agree that we
> really need to figure out what changes are permitted in what releases,
> and we should update the docs accordingly (there's a start here:
> http://wiki.apache.org/hadoop/Roadmap).
> Note that the 2.0.0 alpha release vote thread was clear that we
> thought we were all in agreement that we'd like to keep client/server
> compatible post 2.0 - and there was no push back. We pulled a number
> of jiras into the 2.0 release explicitly so that we could preserve
> client/server compatibility going forward.  Here's the relevant part
> of the thread as a refresher: http://s.apache.org/gQ
> "2) HADOOP-8285 and HADOOP-8366 changed the wire format for the RPC
> envelope in branch-2, but didn't make it into this rc. So, that would
> mean that future alphas would not be protocol-compatible with this
> alpha. Per a discussion a few weeks ago, I think we all were in
> agreement that, if possible, we'd like all 2.x to be compatible for
> client-server communication, at least (even if we don't support
> cross-version for the intra-cluster protocols)"
> Thanks,
> Eli
> On Tue, Jan 29, 2013 at 12:56 PM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
>> Folks,
>> There have been some discussions about incompatible changes in the hadoop-2.x.x-alpha releases on HADOOP-9070, HADOOP-9151, HADOOP-9192 and a few other jiras. Frankly, I'm surprised by some of them, since the 'alpha' moniker was precisely meant to let us harden APIs by changing them if necessary, as borne out by the fact that every single release in the hadoop-2 chain has had incompatible changes. This happened because we were releasing early, moving fast and breaking things. Furthermore, we'll have more in future as we move towards stability of hadoop-2, similar to HDFS-4362, HDFS-4364 et al in HDFS and YARN-142 (api changes) for YARN.

Arun C. Murthy
Hortonworks Inc.
