

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
Thanks for the note, Ted. That vote is for 2.2.0, not -beta.
On Oct 14, 2013 7:30 PM, "Ted Yu" <[EMAIL PROTECTED]> wrote:

> w.r.t. hadoop-2 release, see this thread:
>
> http://search-hadoop.com/m/YSTny19y1Ha1/hadoop+2.2.0
>
> Looks like 2.2.0-beta would pass votes.
>
> Cheers
>
>
> On Mon, Oct 14, 2013 at 7:24 PM, Mike Drob <[EMAIL PROTECTED]> wrote:
>
> > Responses Inline.
> >
> > - Mike
> >
> > On Mon, Oct 14, 2013 at 12:55 PM, Sean Busbey <[EMAIL PROTECTED]> wrote:
> >
> > > Hey All,
> > >
> > > I'd like to restart the conversation from end July / start August about
> > > Hadoop 2 support on the 1.4 branch.
> > >
> > > Specifically, I'd like to get some requirements ironed out so I can file
> > > one or more jiras. I'd also like to get a plan for application.
> > >
> > > =requirements
> > >
> > > Here's the requirements I have from the last thread:
> > >
> > > 1)  Maintain existing 1.4 compatibility
> > >
> > > The only thing I see listed in the pom is Apache release 0.20.203.0.
> > > (1.4.4 tag)[1]
> > >
> > > I don't see anything in the README[2] nor the user manual[3] on other
> > > versions being supported.
> > >
> > Yep.
> >
> >
> > > 2) Gain Hadoop 2 support
> > >
> > > At the moment, I'm presuming this means Apache release 2.0.4-alpha since
> > > that's what 1.5.0 builds against for Hadoop 2.
> > >
> > I haven't been following the Hadoop 2 release schedule that closely, but I
> > think the latest is a 2.1.0-beta? Pretty sure it was released after we
> > finished Accumulo 1.5, so there's no reason not to support it in my mind.
> > Depending on an "alpha" of something strikes me as either unstable or lazy,
> > although I fully understand that it may be neither.
> >
> >
> > > 3) Test for correctness on given versions, with >= 5 node cluster
> > >
> > > * Unit Tests
> > > * Functional Tests
> > > * 24hr continuous + verification
> > > * 24hr continuous + verification + agitation
> > > * 24hr random walk
> > > * 24hr random walk + agitation
> > >
> > > Keith mentioned running these against a CDH4 cluster, but I presume that
> > > since Apache Releases are our stated compatibilities it would actually be
> > > against whatever versions we list. Based on #1 and #2 above, I would expect
> > > that to be Apache Hadoop 0.20.203.0 and Apache Hadoop 2.0.4-alpha.
> > >
> > Hadoop 2 introduces some neat new things like NN HA, which I think it
> > might be worthwhile to test with. At that level it might be more of a
> > verification of the Hadoop code, but I'd like to be comfortable that our
> > DFS Clients switch correctly. This is in addition to the standard release
> > suite that we run. [1]
> >
> > [1]: http://accumulo.apache.org/governance/releasing.html#testing
> >
> >
> > > 4) Binary packaging
> > > 4a) Either source produces a single binary for all accepted versions
> > >
> > > or
> > >
> > > 4b) Instructions for building from source for each version and somehow
> > > flag what (if any) convenience binaries are made for the release.
> > >
> > >
> > Having run the binary packaging for 1.4.4, I can tell you that it is not in
> > great shape. Christopher cleaned up a lot of the issues in the 1.5 line, so
> > I didn't bother spending a ton of time on them here, but I think RPM and
> > DEB are both broken. It would be nice to be able to specify a Hadoop 2
> > version for compilation, similar to what happens in the newer code base,
> > which could be back ported, I suppose. 4b seems easier.
> >
> > > =application
> > >
> > > There will be many back-ported patches. Not much active development happens
> > > on 1.4.x now, but I presume this should still all go onto a feature branch?
> > >
> > > Is the community preference that eventually all the changes become a single
> > > commit (or one-per-subtask if there are multiple jiras) on the active 1.4
> > > development branch, or that the original patches remain broken out?
> > >
> > Not sure what you mean by this.
> >
> >
> > > For what it's worth, I'd recommend keeping them broken out. (And that's