MapReduce >> mail # dev >> Hadoop Tools Layout (was Re: DistCpV2 in 0.23)


Amareshwari Sri Ramadasu 2011-08-29, 08:43
Allen Wittenauer 2011-08-29, 18:40
Amareshwari Sri Ramadasu 2011-08-30, 08:01
Vinod Kumar Vavilapalli 2011-08-30, 12:43
Mithun Radhakrishnan 2011-09-06, 05:28
Amareshwari Sri Ramadasu 2011-09-06, 07:13
Arun C Murthy 2011-09-06, 07:19
Vinod Kumar Vavilapalli 2011-09-06, 16:30
Allen Wittenauer 2011-09-06, 17:11
Eli Collins 2011-09-06, 23:32
Allen Wittenauer 2011-09-06, 23:46
Eric Yang 2011-09-07, 01:38
Alejandro Abdelnur 2011-09-07, 01:55
Vinod Kumar Vavilapalli 2011-09-07, 13:32
Eric Yang 2011-09-07, 17:50
Alejandro Abdelnur 2011-09-07, 18:18
Mahadev Konar 2011-09-07, 18:27
Milind.Bhandarkar@... 2011-09-07, 18:32
Re: Hadoop Tools Layout (was Re: DistCpV2 in 0.23)
Makes sense

On Wed, Sep 7, 2011 at 11:32 AM, <[EMAIL PROTECTED]> wrote:

> +1 for separate hadoop-tools module. However, if a tool is broken at
> release time, and no one comes forward to fix it, it should be removed.
> (i.e. Unlike contrib modules, where build and test failures were
> tolerated.)
>
> - milind
>
> On 9/7/11 11:27 AM, "Mahadev Konar" <[EMAIL PROTECTED]> wrote:
>
> >I like the idea of having tools as a separate module and I don't think
> >that it will be a dumping ground unless we choose to make it one.
> >
> >+1 for hadoop tools module under trunk.
> >
> >thanks
> >mahadev
> >
> >On Wed, Sep 7, 2011 at 11:18 AM, Alejandro Abdelnur <[EMAIL PROTECTED]>
> >wrote:
> >> Agreed, we should not have a dumping ground. IMO, what would go into
> >> hadoop-tools (i.e. distcp, streaming, and someone could argue for
> >> FsShell as well) are effectively hadoop CLI utilities. Having them in
> >> a separate module rather than in the core modules (common, hdfs,
> >> mapreduce) does not mean that they are secondary things, just
> >> modularization. Also, it will help to get those tools to use the
> >> public interfaces of the core modules, and when we finally have a
> >> clean hadoop-client layer, those tools should only depend on that.
> >>
> >> Finally, the fact that tools would end up under trunk/hadoop-tools
> >> does not prevent the HDFS and MAPREDUCE packaging from bundling the
> >> same or different tools.
> >>
> >> +1 for hadoop-tools/ (not binding)
> >>
> >> Thanks.
> >>
> >>
> >> On Wed, Sep 7, 2011 at 10:50 AM, Eric Yang <[EMAIL PROTECTED]> wrote:
> >>
> >>> MapReduce and HDFS are distinct functions of Hadoop.  They are
> >>> loosely coupled.  If we have a tools aggregator module, it will not
> >>> have as clear and distinct a function as the other Hadoop modules.
> >>> Hence, it is possible for a tool to depend on both HDFS and
> >>> MapReduce.  If something breaks in the tools module, it is unclear
> >>> which subproject is responsible for maintaining the tools.
> >>> Therefore, it is safer to send tools to the incubator or Apache
> >>> Extras rather than deposit the utility tools in a tools
> >>> subcategory.  There are many short-lived projects that attempt to
> >>> associate themselves with Hadoop but are not maintained.  It would
> >>> be better to spin off those utility projects than to use Hadoop as
> >>> a dumping ground.
> >>>
> >>> In the previous discussion about removing contrib, most people were
> >>> in favor of doing so, and only a few contrib owners were reluctant
> >>> to remove contrib.  Fewer people have participated in restoring the
> >>> functionality of broken contrib projects.  History speaks for itself.
> >>> -1 (non-binding) for hadoop-tools.
> >>>
> >>> regards,
> >>> Eric
> >>>
> >>> On Tue, Sep 6, 2011 at 6:55 PM, Alejandro Abdelnur <[EMAIL PROTECTED]>
> >>> wrote:
> >>> > Eric,
> >>> >
> >>> > Personally I'm fine either way.
> >>> >
> >>> > Still, I fail to see why generic or categorized tools increase or
> >>> > reduce the risk of dead code, and how they make packaging and
> >>> > deployment more difficult or easier.
> >>> >
> >>> > Would you please explain this?
> >>> >
> >>> > Thanks.
> >>> >
> >>> > Alejandro
> >>> >
> >>> > On Tue, Sep 6, 2011 at 6:38 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> >>> >
> >>> >> Option #2, proposed by Amareshwari, seems like a better proposal.
> >>> >> We don't want to repeat the history of contrib with hadoop-tools.
> >>> >> Having a generic module like hadoop-tools increases the risk of
> >>> >> accumulating dead code.  It would be better to categorize the
> >>> >> hdfs- or mapreduce-specific tools in their respective
> >>> >> subcategories.  It is also easier to manage from a
> >>> >> package/deployment perspective.
> >>> >>
> >>> >> regards,
> >>> >> Eric
> >>> >>
> >>> >> On Sep 6, 2011, at 4:32 PM, Eli Collins wrote:
> >>> >>
> >>> >> > On Tue, Sep 6, 2011 at 10:11 AM, Allen Wittenauer <[EMAIL PROTECTED]>
> >>> wrote:
> >>>
Rottinghuis, Joep 2011-09-08, 03:43
Amareshwari Sri Ramadasu 2011-09-08, 04:33
Rottinghuis, Joep 2011-09-09, 05:25
Vinod Kumar Vavilapalli 2011-09-12, 13:47
Alejandro Abdelnur 2011-10-18, 19:41