Subroto and Shumin
Try adding a trailing slash to the s3n source:
- hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData"
/test/srcData
+ hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData/"
/test/srcData
Without the slash, each listing of "srcData" returns "srcData" itself
again, leading to the infinite recursion you experienced.
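
If the copy is scripted, the trailing slash can be enforced before the source is handed to hadoop. The following is only a minimal sketch, assuming a POSIX shell; the function name with_trailing_slash, the ACCESS_KEY/SECRET_KEY placeholders and the bucket name example-bucket are hypothetical and not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): print the given URI with
# exactly one trailing slash appended if it does not already end in one.
with_trailing_slash() {
    case "$1" in
        */) printf '%s\n' "$1" ;;   # already ends with a slash: unchanged
        *)  printf '%s/\n' "$1" ;;  # otherwise append one
    esac
}

# Placeholder source URI; substitute your own credentials and bucket.
SRC='s3n://ACCESS_KEY:SECRET_KEY@example-bucket/srcData'

# Show the normalized source that would be passed to "hadoop fs -cp".
with_trailing_slash "$SRC"
```

The normalized value can then be used as the source, for example: hadoop fs -cp "$(with_trailing_slash "$SRC")" /test/srcData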
George
> I used to have a similar problem. It looks like there is a recursive folder
> creation bug. How about removing srcData from the <dst>? For example,
> use the following command:
>
> hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>
> Or with distcp:
>
> hadoop distcp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>
> HTH.
> Shumin
>
> On Wed, Mar 6, 2013 at 5:44 AM, Subroto <[EMAIL PROTECTED]> wrote:
>
> Hi Mike,
>
> I have tried distcp as well, and it ended with an exception:
> 13/03/06 05:41:13 INFO tools.DistCp:
> srcPaths=[s3n://acessKey:[EMAIL PROTECTED]et/srcData]
> 13/03/06 05:41:13 INFO tools.DistCp: destPath=/test/srcData
> 13/03/06 05:41:18 INFO tools.DistCp: /test/srcData does not exist.
> org.apache.hadoop.tools.DistCp$DuplicationException: Invalid
> input, there are duplicated files in the sources:
> s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed,
> s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed
> at org.apache.hadoop.tools.DistCp.checkDuplication(DistCp.java:1368)
> at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1176)
> at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
> at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
> One more interesting thing to note is that the same command works
> fine with Hadoop 2.0.
>
> Cheers,
> Subroto Sanyal
>
> On Mar 6, 2013, at 11:12 AM, Michel Segel wrote:
>
>> Have you tried using distcp?
>>
>> Mike Segel
>>
>> On Mar 5, 2013, at 8:37 AM, Subroto <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> It's not because there are too many recursive folders in the S3
>>> bucket; in fact, there are no recursive folders in the source.
>>> If I list the S3 bucket with native S3 tools, I can find a file
>>> srcData with size 0 in the folder srcData.
>>> The copy command keeps creating the folder
>>> /test/srcData/srcData/srcData (it keeps appending srcData).
>>>
>>> Cheers,
>>> Subroto Sanyal
>>>
>>> On Mar 5, 2013, at 3:32 PM, 卖报的小行家 wrote:
>>>
>>>> Hi Subroto,
>>>>
>>>> I haven't used the s3n filesystem, but from the output "cp:
>>>> java.io.IOException: mkdirs: Pathname too long. Limit 8000
>>>> characters, 1000 levels.", I think the problem is with the
>>>> path. Is the path longer than 8000 characters, or does it have
>>>> more than 1000 levels?
>>>> You only have 998 folders. Maybe the last one is longer than 8000
>>>> characters. Why not check the last one's length?
>>>>
>>>> BRs//Julian
>>>>
>>>> ------------------ Original ------------------
>>>> From: "Subroto" <[EMAIL PROTECTED]>
>>>> Date: Tue, Mar 5, 2013 10:22 PM
>>>> To: "user" <[EMAIL PROTECTED]>
>>>> Subject: S3N copy creating recursive folders
>>>>
>>>> Hi,
>>>>
>>>> I am using Hadoop 1.0.3 and trying to execute:
>>>> hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData"
>>>> /test/srcData
>>>>
>>>> This ends up with:
>>>> cp: java.io.IOException: mkdirs: Pathname too long. Limit 8000