Re: S3N copy creating recursive folders
George Datskos 2013-03-07, 07:51
Subroto and Shumin,

Try adding a slash to the s3n source:

- hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/srcData
+ hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData/" /test/srcData

Without the slash, it will keep listing "srcData" each time it is listed, leading to the infinite recursion you experienced.

George

> I used to have a similar problem. Looks like there is a recursive folder
> creation bug. How about you try removing the srcData from the <dst>, for
> example use the following command:
>
> hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>
> Or with distcp:
>
> hadoop distcp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>
> HTH.
> Shumin
>
> On Wed, Mar 6, 2013 at 5:44 AM, Subroto <[EMAIL PROTECTED]> wrote:
>
> Hi Mike,
>
> I have tried distcp as well and it ended up with an exception:
> 13/03/06 05:41:13 INFO tools.DistCp: srcPaths=[s3n://acessKey:[EMAIL PROTECTED]et/srcData]
> 13/03/06 05:41:13 INFO tools.DistCp: destPath=/test/srcData
> 13/03/06 05:41:18 INFO tools.DistCp: /test/srcData does not exist.
> org.apache.hadoop.tools.DistCp$DuplicationException: Invalid
> input, there are duplicated files in the sources:
> s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed,
> s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed
>     at org.apache.hadoop.tools.DistCp.checkDuplication(DistCp.java:1368)
>     at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1176)
>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
> One more interesting thing to notice is that the same thing works
> nicely with hadoop 2.0
>
> Cheers,
> Subroto Sanyal
>
> On Mar 6, 2013, at 11:12 AM, Michel Segel wrote:
>
>> Have you tried using distcp?
>>
>> Sent from a remote device. Please excuse any typos...
>>
>> Mike Segel
>>
>> On Mar 5, 2013, at 8:37 AM, Subroto <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> It's not because there are too many recursive folders in the S3
>>> bucket; in fact there is no recursive folder in the source.
>>> If I list the S3 bucket with Native S3 tools I can find a file
>>> srcData with size 0 in the folder srcData.
>>> The copy command keeps on creating
>>> folder /test/srcData/srcData/srcData (keeps on appending srcData).
>>>
>>> Cheers,
>>> Subroto Sanyal
>>>
>>> On Mar 5, 2013, at 3:32 PM, 卖报的小行家 wrote:
>>>
>>>> Hi Subroto,
>>>>
>>>> I didn't use the s3n filesystem. But from the output "cp:
>>>> java.io.IOException: mkdirs: Pathname too long. Limit 8000
>>>> characters, 1000 levels.", I think this is because of a problem
>>>> with the path. Is the path longer than 8000 characters, or is the
>>>> level more than 1000?
>>>> You only have 998 folders. Maybe the last one is more than 8000
>>>> characters. Why not count the last one's length?
>>>>
>>>> BRs//Julian
>>>>
>>>> ------------------ Original ------------------
>>>> From: "Subroto" <[EMAIL PROTECTED]>
>>>> Date: Tue, Mar 5, 2013 10:22 PM
>>>> To: "user" <[EMAIL PROTECTED]>
>>>> Subject: S3N copy creating recursive folders
>>>>
>>>> Hi,
>>>>
>>>> I am using Hadoop 1.0.3 and trying to execute:
>>>> hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData"
>>>> /test/srcData
>>>>
>>>> This ends up with:
>>>> cp: java.io.IOException: mkdirs: Pathname too long. Limit 8000