

Re: S3N copy creating recursive folders
Subroto and Shumin,

Try adding a slash to the s3n source:

- hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/srcData
+ hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData/" /test/srcData

Without the trailing slash, the listing keeps returning "srcData" itself each
time the path is listed, leading to the infinite recursion you experienced.
George
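
For reference, a minimal sketch of that check and the corrected copy. The
accessKey, secretKey and bucket names below are hypothetical placeholders for
the values redacted in this thread, and assume the bucket is reachable with
the same credentials:

    # List the source with a trailing slash to see what S3N actually returns
    # for it (placeholder values, not the real credentials).
    hadoop fs -ls s3n://accessKey:secretKey@bucket/srcData/

    # Copy with the trailing slash on the s3n source, as suggested above.
    hadoop fs -cp s3n://accessKey:secretKey@bucket/srcData/ /test/srcData
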
> I used to have a similar problem. Looks like there is a recursive folder
> creation bug. How about you try removing the srcData from the <dst>, for
> example use the following command:
>
> hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>
> Or with distcp:
>
> hadoop distcp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>
> HTH.
> Shumin
>
> On Wed, Mar 6, 2013 at 5:44 AM, Subroto <[EMAIL PROTECTED]> wrote:
>
>     Hi Mike,
>
>     I have tried distcp as well and it ended up with an exception:
>     13/03/06 05:41:13 INFO tools.DistCp:
>     srcPaths=[s3n://acessKey:[EMAIL PROTECTED]et/srcData]
>     13/03/06 05:41:13 INFO tools.DistCp: destPath=/test/srcData
>     13/03/06 05:41:18 INFO tools.DistCp: /test/srcData does not exist.
>     org.apache.hadoop.tools.DistCp$DuplicationException: Invalid
>     input, there are duplicated files in the sources:
>     s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed,
>     s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed
>     at org.apache.hadoop.tools.DistCp.checkDuplication(DistCp.java:1368)
>     at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1176)
>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
>     One more interesting thing to note is that the same thing works
>     nicely with Hadoop 2.0.
>
>     Cheers,
>     Subroto Sanyal
>
>     On Mar 6, 2013, at 11:12 AM, Michel Segel wrote:
>
>>     Have you tried using distcp?
>>
>>     Sent from a remote device. Please excuse any typos...
>>
>>     Mike Segel
>>
>>     On Mar 5, 2013, at 8:37 AM, Subroto <[EMAIL PROTECTED]> wrote:
>>
>>>     Hi,
>>>
>>>     It's not because there are too many recursive folders in the S3
>>>     bucket; in fact there is no recursive folder in the source.
>>>     If I list the S3 bucket with native S3 tools I can find a file
>>>     srcData with size 0 in the folder srcData.
>>>     The copy command keeps on creating the folder
>>>     /test/srcData/srcData/srcData (it keeps appending srcData).
>>>
>>>     Cheers,
>>>     Subroto Sanyal
>>>
>>>     On Mar 5, 2013, at 3:32 PM, 卖报的小行家 wrote:
>>>
>>>>     Hi Subroto,
>>>>
>>>>     I didn't use the s3n filesystem, but from the output "cp:
>>>>     java.io.IOException: mkdirs: Pathname too long. Limit 8000
>>>>     characters, 1000 levels.", I think the problem is with the path.
>>>>     Is the path longer than 8000 characters, or is it deeper than
>>>>     1000 levels?
>>>>     You only have 998 folders. Maybe the last one takes it over 8000
>>>>     characters. Why not count the last one's length?
>>>>
>>>>     BRs//Julian
>>>>
>>>>     ------------------ Original ------------------
>>>>     From: "Subroto" <[EMAIL PROTECTED]>
>>>>     Date: Tue, Mar 5, 2013 10:22 PM
>>>>     To: "user" <[EMAIL PROTECTED]>
>>>>     Subject: S3N copy creating recursive folders
>>>>
>>>>     Hi,
>>>>
>>>>     I am using Hadoop 1.0.3 and trying to execute:
>>>>     hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/srcData
>>>>
>>>>     This ends up with:
>>>>     cp: java.io.IOException: mkdirs: Pathname too long. Limit 8000
>>>>     characters, 1000 levels.
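
For reference, a rough sketch of how the runaway copy trips that limit,
together with the alternative Shumin suggested earlier in the thread. The
accessKey, secretKey and bucket names below are hypothetical placeholders for
the values redacted above:

    # Each pass appends another "srcData" level to the destination, so the
    # path eventually exceeds the 1000-level / 8000-character mkdirs limit.
    # Quick check of a path's depth and length:
    echo "/test/srcData/srcData/srcData" | awk -F/ '{print NF-1 " levels, " length($0) " characters"}'

    # Shumin's alternative: drop srcData from the destination.
    hadoop fs -cp s3n://accessKey:secretKey@bucket/srcData /test/
    # distcp form (per Subroto's follow-up, this still failed with a
    # DuplicationException on Hadoop 1.0.3 but worked on Hadoop 2.0):
    hadoop distcp s3n://accessKey:secretKey@bucket/srcData /test/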