Re: DistCpV2 in 0.23
+1 to Alejandro's

I prefer to keep the hadoop-tools at trunk level.

-Giri

On Thu, Aug 25, 2011 at 9:15 PM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
> I'd suggest putting hadoop-tools either at the trunk/ level or having one tools
> aggregator module for hdfs and another for common.
>
> I personally would prefer it at trunk/.
>
> Thanks.
>
> Alejandro
>
> On Thu, Aug 25, 2011 at 9:06 PM, Amareshwari Sri Ramadasu <
> [EMAIL PROTECTED]> wrote:
>
>> Agree. It should be a separate maven module (and the patch puts it in a separate
>> maven module now). A top level for hadoop tools is nice to have, but it
>> becomes hard to maintain until patch automation runs the tests under
>> tools. Currently we often see changes in HDFS affecting RAID tests
>> in MapReduce. So, I'm fine putting the tools under hadoop-mapreduce.
>>
>> I propose we can have something like the following:
>>
>> trunk/
>>  - hadoop-mapreduce
>>      - hadoop-mr-client
>>      - hadoop-yarn
>>      - hadoop-tools
>>          - hadoop-streaming
>>          - hadoop-archives
>>          - hadoop-distcp
>>
>> Thoughts?
>>
>> @Eli and @JD, we did not replace the old legacy distcp because this is really a
>> complete rewrite, and we did not want to remove it until users are familiar
>> with the new one.
>>
>> On 8/26/11 12:51 AM, "Todd Lipcon" <[EMAIL PROTECTED]> wrote:
>>
>> Maybe a separate toplevel for hadoop-tools? Stuff like RAID could go
>> in there as well - ie tools that are downstream of MR and/or HDFS.
>>
>> On Thu, Aug 25, 2011 at 12:09 PM, Mahadev Konar <[EMAIL PROTECTED]>
>> wrote:
>> > +1 for a separate module in hadoop-mapreduce-project. I think
>> > hadoop-mapreduce-client might not be the right place for it. We might have
>> > to pick a new maven module under hadoop-mapreduce-project that could
>> > host streaming/distcp/hadoop archives.
>> >
>> > thanks
>> > mahadev
>> >
>> > On Thu, Aug 25, 2011 at 11:04 AM, Alejandro Abdelnur <[EMAIL PROTECTED]>
>> wrote:
>> >> Agree, it should be a separate maven module.
>> >>
>> >> And it should be under hadoop-mapreduce-client, right?
>> >>
>> >> And now that we are on the topic, the same should go for streaming, no?
>> >>
>> >> Thanks.
>> >>
>> >> Alejandro
>> >>
>> >> On Thu, Aug 25, 2011 at 10:58 AM, Todd Lipcon <[EMAIL PROTECTED]>
>> wrote:
>> >>
>> >>> On Thu, Aug 25, 2011 at 10:36 AM, Eli Collins <[EMAIL PROTECTED]>
>> wrote:
>> >>> > Nice work!   I definitely think this should go in 23 and 20x.
>> >>> >
>> >>> > Agree with JD that it should be in the core code, not contrib.  If
>> >>> > it's going to be maintained then we should put it in the core code.
>> >>>
>> >>> Now that we're all mavenized, though, a separate maven module and
>> >>> artifact does make sense IMO - ie "hadoop jar
>> >>> hadoop-distcp-0.23.0-SNAPSHOT" rather than "hadoop distcp"
>> >>>
>> >>> -Todd
>> >>> --
>> >>> Todd Lipcon
>> >>> Software Engineer, Cloudera
>> >>>
>> >>
>> >
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>>
>

--
-Giri