Re: DistCpV2 in 0.23
I'd suggest putting hadoop-tools either at trunk/ level or having a tools
aggregator module for hdfs and another for common.

I personally would prefer it at trunk/.

Thanks.

Alejandro

On Thu, Aug 25, 2011 at 9:06 PM, Amareshwari Sri Ramadasu <
[EMAIL PROTECTED]> wrote:

> Agree. It should be a separate maven module (and the patch puts it in a
> separate maven module now). A top level for hadoop tools is nice to have, but
> it becomes hard to maintain until the patch automation runs the tests under
> tools. Currently we often see changes in HDFS affecting RAID tests
> in MapReduce. So, I'm fine with putting the tools under hadoop-mapreduce.
>
> I propose we can have something like the following:
>
> trunk/
>  - hadoop-mapreduce
>      - hadoop-mr-client
>      - hadoop-yarn
>      - hadoop-tools
>          - hadoop-streaming
>          - hadoop-archives
>          - hadoop-distcp
>
> Thoughts?
>
> @Eli and @JD, we did not replace the old legacy distcp because this is really
> a complete rewrite, and we did not want to remove it until users are familiar
> with the new one.
>
> On 8/26/11 12:51 AM, "Todd Lipcon" <[EMAIL PROTECTED]> wrote:
>
> Maybe a separate toplevel for hadoop-tools? Stuff like RAID could go
> in there as well - ie tools that are downstream of MR and/or HDFS.
>
> On Thu, Aug 25, 2011 at 12:09 PM, Mahadev Konar <[EMAIL PROTECTED]>
> wrote:
> > +1 for a separate module in hadoop-mapreduce-project. I think
> > hadoop-mapreduce-client might not be the right place for it. We might have
> > to pick a new maven module under hadoop-mapreduce-project that could
> > host streaming/distcp/hadoop archives.
> >
> > thanks
> > mahadev
> >
> > On Thu, Aug 25, 2011 at 11:04 AM, Alejandro Abdelnur <[EMAIL PROTECTED]>
> wrote:
> >> Agree, it should be a separate maven module.
> >>
> >> And it should be under hadoop-mapreduce-client, right?
> >>
> >> And now that we are on the topic, the same should go for streaming, no?
> >>
> >> Thanks.
> >>
> >> Alejandro
> >>
> >> On Thu, Aug 25, 2011 at 10:58 AM, Todd Lipcon <[EMAIL PROTECTED]>
> wrote:
> >>
> >>> On Thu, Aug 25, 2011 at 10:36 AM, Eli Collins <[EMAIL PROTECTED]>
> wrote:
> >>> > Nice work!   I definitely think this should go in 23 and 20x.
> >>> >
> >>> > Agree with JD that it should be in the core code, not contrib.  If
> >>> > it's going to be maintained then we should put it in the core code.
> >>>
> >>> Now that we're all mavenized, though, a separate maven module and
> >>> artifact does make sense IMO - ie "hadoop jar
> >>> hadoop-distcp-0.23.0-SNAPSHOT" rather than "hadoop distcp"
> >>>
> >>> -Todd
> >>> --
> >>> Todd Lipcon
> >>> Software Engineer, Cloudera
> >>>
> >>
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
>