

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
w.r.t. hadoop-2 release, see this thread:

http://search-hadoop.com/m/YSTny19y1Ha1/hadoop+2.2.0

Looks like 2.2.0-beta would pass votes.

Cheers
On Mon, Oct 14, 2013 at 7:24 PM, Mike Drob <[EMAIL PROTECTED]> wrote:

> Responses Inline.
>
> - Mike
>
> On Mon, Oct 14, 2013 at 12:55 PM, Sean Busbey <[EMAIL PROTECTED]> wrote:
>
> > Hey All,
> >
> > I'd like to restart the conversation from end July / start August about
> > Hadoop 2 support on the 1.4 branch.
> >
> > Specifically, I'd like to get some requirements ironed out so I can file
> > one or more jiras. I'd also like to get a plan for application.
> >
> > =requirements
> >
> > Here are the requirements I have from the last thread:
> >
> > 1)  Maintain existing 1.4 compatibility
> >
> > The only thing I see listed in the pom is Apache release 0.20.203.0.
> > (1.4.4 tag)[1]
> >
> > I don't see anything in the README[2] nor the user manual[3] on other
> > versions being supported.
> >
> Yep.
>
>
> > 2) Gain Hadoop 2 support
> >
> > At the moment, I'm presuming this means Apache release 2.0.4-alpha since
> > that's what 1.5.0 builds against for Hadoop 2.
> >
> I haven't been following the Hadoop 2 release schedule that closely, but I
> think the latest is a 2.1.0-beta? Pretty sure it was released after we
> finished Accumulo 1.5, so there's no reason not to support it in my mind.
> Depending on an "alpha" of something strikes me as either unstable or lazy,
> although I fully understand that it may be neither.
>
>
> > 3) Test for correctness on given versions, with >= 5 node cluster
> >
> > * Unit Tests
> > * Functional Tests
> > * 24hr continuous + verification
> > * 24hr continuous + verification + agitation
> > * 24hr random walk
> > * 24hr random walk + agitation
> >
> > Keith mentioned running these against a CDH4 cluster, but I presume that
> > since Apache Releases are our stated compatibilities it would actually be
> > against whatever versions we list. Based on #1 and #2 above, I would expect
> > that to be Apache Hadoop 0.20.203.0 and Apache Hadoop 2.0.4-alpha.
> >
> Hadoop 2 introduces some neat new things like NN HA, which I think
> might be worthwhile to test with. At that level it might be more of a
> verification of the Hadoop code, but I'd like to be comfortable that our
> DFS Clients switch correctly. This is in addition to the standard release
> suite that we run. [1]
>
> [1]: http://accumulo.apache.org/governance/releasing.html#testing
>
>
> > 4) Binary packaging
> > 4a) Either source produces a single binary for all accepted versions
> >
> > or
> >
> > 4b) Instructions for building from source for each version and somehow
> > flag what (if any) convenience binaries are made for the release.
> >
> >
> Having run the binary packaging for 1.4.4, I can tell you that it is not in
> great shape. Christopher cleaned up a lot of the issues in the 1.5 line, so
> I didn't bother spending a ton of time on them here, but I think RPM and
> DEB are both broken. It would be nice to be able to specify a Hadoop 2
> version for compilation, similar to what happens in the newer code base,
> which could be back ported, I suppose. 4b seems easier.
>
> > =application
> >
> > There will be many back-ported patches. Not much active development happens
> > on 1.4.x now, but I presume this should still all go onto a feature branch?
> >
> > Is the community preference that eventually all the changes become a single
> > commit (or one-per-subtask if there are multiple jiras) on the active 1.4
> > development branch, or that the original patches remain broken out?
> >
> Not sure what you mean by this.
>
>
> > For what it's worth, I'd recommend keeping them broken out. (And that's how
> > the initial development against CDH4 has been done.)
> >
> >
> > [1] http://bit.ly/1fxucMe
> > [2] http://bit.ly/192zUAJ
> > [3] http://accumulo.apache.org/1.4/user_manual/Administration.html#Dependencies
> >
> > --
> > Sean
> >
>