MapReduce, mail # dev - Hadoop Tools Layout (was Re: DistCpV2 in 0.23)


Re: Hadoop Tools Layout (was Re: DistCpV2 in 0.23)
Alejandro Abdelnur 2011-09-07, 18:18
Agreed, we should not have a dumping ground. IMO, what would go into
hadoop-tools (i.e. distcp, streaming, and someone could argue for FsShell as
well) are effectively Hadoop CLI utilities. Having them in a separate module
rather than in the core modules (common, hdfs, mapreduce) does not mean
that they are secondary things; it is just modularization. It will also help
get those tools to use the public interfaces of the core modules, and when we
finally have a clean hadoop-client layer, those tools should depend only on
that.

Finally, the fact that the tools would end up under trunk/hadoop-tools does
not prevent the HDFS and MAPREDUCE packaging from bundling the same or
different tools.

+1 for hadoop-tools/ (not binding)

Thanks.
On Wed, Sep 7, 2011 at 10:50 AM, Eric Yang <[EMAIL PROTECTED]> wrote:

> MapReduce and HDFS are distinct functions of Hadoop.  They are loosely
> coupled.  A tools aggregator module would not have as clearly distinct
> a function as the other Hadoop modules, because a tool could depend on
> both HDFS and MapReduce.  If something broke in the tools module, it
> would be unclear which subproject is responsible for maintaining it.
> Therefore, it is safer to send tools to the Incubator or Apache Extras
> than to deposit the utility tools in a tools subcategory.  There are
> many short-lived projects that attempt to associate themselves with
> Hadoop but are not maintained.  It would be better to spin off those
> utility projects than to use Hadoop as a dumping ground.
>
> In the previous discussion about removing contrib, most people were in
> favor of doing so, and only a few contrib owners were reluctant to
> remove it.  Even fewer people have participated in restoring the
> functionality of broken contrib projects.  History speaks for itself.
> -1 (non-binding) for hadoop-tools.
>
> regards,
> Eric
>
> On Tue, Sep 6, 2011 at 6:55 PM, Alejandro Abdelnur <[EMAIL PROTECTED]>
> wrote:
> > Eric,
> >
> > Personally I'm fine either way.
> >
> > Still, I fail to see why generic versus categorized tools would increase
> > or reduce the risk of dead code, or how they would make packaging and
> > deployment harder or easier.
> >
> > Would you please explain this?
> >
> > Thanks.
> >
> > Alejandro
> >
> > On Tue, Sep 6, 2011 at 6:38 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> >
> >> Option #2, proposed by Amareshwari, seems like the better proposal.  We
> >> don't want to repeat the history of contrib with hadoop-tools.  Having a
> >> generic module like hadoop-tools increases the risk of accumulating dead
> >> code.  It would be better to categorize the HDFS- or MapReduce-specific
> >> tools in their respective subcategories.  That is also easier to manage
> >> from a packaging/deployment perspective.
> >>
> >> regards,
> >> Eric
> >>
> >> On Sep 6, 2011, at 4:32 PM, Eli Collins wrote:
> >>
> >> > On Tue, Sep 6, 2011 at 10:11 AM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> >> >>
> >> >> On Sep 6, 2011, at 9:30 AM, Vinod Kumar Vavilapalli wrote:
> >> >>> We still need to answer Amareshwari's question (2) she asked some
> >> >>> time back about the automated code compilation and test execution
> >> >>> of the tools module.
> >> >>
> >> >>
> >> >>
> >> >>>>> My #1 question is if tools is basically contrib reborn.  If not,
> >> >>>>> what makes it different?
> >> >>
> >> >>
> >> >>        I'm still waiting for this answer as well.
> >> >>
> >> >>        Until then, I would be pretty much against a tools module.
> >> >>  Changing the name of the dumping ground doesn't make it any less of a
> >> >> dumping ground.
> >> >
> >> > IMO if the tools module only gets stuff like distcp that's maintained,
> >> > then it's not contrib; if it contains all the stuff from the current
> >> > MR contrib, then tools is just a re-labeling of contrib. Given that
> >> > this proposal only covers moving distcp to tools, it doesn't sound like
> >> > contrib to me.
> >> >
> >> > Thanks,
> >> > Eli
> >>
> >>
> >
>