Hadoop, mail # general - [VOTE] Abandon mrunit MapReduce contrib


Re: [VOTE] Abandon mrunit MapReduce contrib
Aaron Kimball 2011-02-11, 22:10
The main reason I am interested in removing MRUnit from Hadoop is that I
believe that MRUnit deserves its own release cycle. I think this is in the
best interest of its users.

MRUnit is valuable to users of several different versions of Hadoop. But
MRUnit has only ever been committed to version 0.21 and above -- even though
in practice, the majority (dare I say--all) of its users are running on
0.20. The only place today to get a version of MRUnit compatible with 0.20
has been through a Cloudera release, which backported the entire MRUnit
patchset.

My thoughts on MRUnit in 0.20.100 resonate with Eric's. There will be
further fixes to MRUnit and its lightweight codebase can be released far
more rapidly than whenever the next 0.20.1xx release of Hadoop would occur.
Given that MRUnit has already been in the repository since April 2009 (see
https://issues.apache.org/jira/browse/HADOOP-5518) and has yet to see an
Apache 0.20-based release, I do not think it is in the best interest of the
library's userbase to couple MRUnit's release cycle to that of Hadoop
itself.

Perhaps more importantly, access to new features in MRUnit should not
require upgrading one's entire Hadoop deployment; this is a client library
that depends only on Hadoop's public APIs.

My primary concern is to move MRUnit to a place where the community can
derive the most benefit from it. The Apache Incubator could fulfill this
role; given the presence of individuals willing to mentor this project, I
believe this would be a successful way to release MRUnit more quickly and
continue to work to grow the MRUnit community.

Regards,
- Aaron

On Fri, Feb 11, 2011 at 11:57 AM, Mattmann, Chris A (388J) <
[EMAIL PROTECTED]> wrote:

> Awesome Patrick, we'd probably need one more active mentor. Any takers?
>
> After we get that, then we cook up a proposal on the Incubator wiki here
> [1], and follow the process here [2] to get started...
>
> Cheers,
> Chris
>
> [1] http://wiki.apache.org/incubator/MRUnitProposal
> [2] http://incubator.apache.org/guides/proposal.html
>
> On Feb 11, 2011, at 11:52 AM, Patrick Hunt wrote:
>
> > On Fri, Feb 11, 2011 at 9:44 AM, Mattmann, Chris A (388J)
> > <[EMAIL PROTECTED]> wrote:
> >> Guys, BTW, if you need help or a mentor in Apache Incubator-ville for
> >> MRUnit, I would be happy to help.
> >
> > I was going to suggest the same thing (mrunit to incubator). I would
> > also be happy to be a mentor.
> >
> > Patrick
> >
> >>
> >> On Feb 11, 2011, at 9:04 AM, Eric Sammer wrote:
> >>
> >>> On Fri, Feb 11, 2011 at 11:48 AM, Owen O'Malley <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> On Feb 11, 2011, at 8:02 AM, Eric Sammer wrote:
> >>>>
> >>>>> - allow mrunit to have its own release cycle. This is, I think, the most
> >>>>> important.
> >>>>
> >>>> If you submit your work to Apache we can evaluate it for inclusion in the
> >>>> 0.20.100 branch to get your changes released in a timely manner.
> >>>
> >>>
> >>> I'm thinking in general (beyond the next immediate release). Independent of
> >>> where mrunit goes, I think it should leave the contrib tree to facilitate
> >>> light weight releases (the dependency on Hadoop proper is a public facing
> >>> API - a pure client). I think most projects could benefit from this with the
> >>> exception of things that are tightly coupled to Hadoop releases or touch
> >>> non-public APIs.
> >>>
> >>>
> >>>>> I would actually prefer to move it to Extras or Incubator and leave this
> >>>>> within the ASF.
> >>>>>
> >>>>
> >>>> Extras is **NOT** inside of the ASF. Extras is a source hosting system for
> >>>> non-Apache projects that are related to Apache projects.
> >>>
> >>>
> >>> Got it. Thanks for correcting me. I only mentioned it because someone
> >>> suggested it to me initially.
> >>>
> >>>
> >>>> Right now, I picked github because of the ability to easily
> >>>> collaborate with others (and to use git).
> >>>>
> >>>
> >>> I agree that it is unfortunate that Apache doesn't yet support