Re: S3N copy creating recursive folders
Subroto 2013-03-07, 08:33
Hi George,
Tried as per your suggestion:
hadoop fs -cp "s3n://acessKey:[EMAIL PROTECTED]et/srcData/" /test/srcData/

Still facing the same problem :-( :
cp: java.io.IOException: mkdirs: Pathname too long. Limit 8000 characters, 1000 levels.

Cheers,
Subroto Sanyal

On Mar 7, 2013, at 8:51 AM, George Datskos wrote:

> Subroto and Shumin
>
> Try adding a slash to the s3n source:
>
> - hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/srcData
> + hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData/" /test/srcData
>
> Without the slash, it will keep listing "srcData" each time it is listed, leading to the infinite recursion you experienced.
>
>
> George
>
>
>> I used to have a similar problem. Looks like there is a recursive folder creation bug. How about you try removing the srcData from the <dst>, for example use the following command:
>>
>> hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>>
>> Or with distcp:
>>
>> hadoop distcp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>>
>> HTH.
>> Shumin
>>
>> On Wed, Mar 6, 2013 at 5:44 AM, Subroto <[EMAIL PROTECTED]> wrote:
>> Hi Mike,
>>
>> I have tried distcp as well and it ended up with an exception:
>> 13/03/06 05:41:13 INFO tools.DistCp: srcPaths=[s3n://acessKey:[EMAIL PROTECTED]et/srcData]
>> 13/03/06 05:41:13 INFO tools.DistCp: destPath=/test/srcData
>> 13/03/06 05:41:18 INFO tools.DistCp: /test/srcData does not exist.
>> org.apache.hadoop.tools.DistCp$DuplicationException: Invalid input, there are duplicated files in the sources: s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed, s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed
>>     at org.apache.hadoop.tools.DistCp.checkDuplication(DistCp.java:1368)
>>     at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1176)
>>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>
>> One more interesting thing to notice is that the same thing works nicely with Hadoop 2.0.
>>
>> Cheers,
>> Subroto Sanyal
>>
>> On Mar 6, 2013, at 11:12 AM, Michel Segel wrote:
>>
>>> Have you tried using distcp?
>>>
>>> Sent from a remote device. Please excuse any typos...
>>>
>>> Mike Segel
>>>
>>> On Mar 5, 2013, at 8:37 AM, Subroto <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi,
>>>>
>>>> It's not because there are too many recursive folders in the S3 bucket; in fact there is no recursive folder in the source.
>>>> If I list the S3 bucket with native S3 tools I can find a file srcData with size 0 in the folder srcData.
>>>> The copy command keeps on creating the folder /test/srcData/srcData/srcData (it keeps appending srcData).
>>>>
>>>> Cheers,
>>>> Subroto Sanyal
>>>>
>>>> On Mar 5, 2013, at 3:32 PM, 卖报的小行家 wrote:
>>>>
>>>>> Hi Subroto,
>>>>>
>>>>> I didn't use the s3n filesystem. But from the output "cp: java.io.IOException: mkdirs: Pathname too long. Limit 8000 characters, 1000 levels.", I think this is because of a problem with the path. Is the path longer than 8000 characters, or is the level more than 1000?
>>>>> You only have 998 folders. Maybe the last one is more than 8000 characters. Why not count the last one's length?
>>>>>
>>>>> BRs//Julian
>>>>>
>>>>> ------------------ Original ------------------
>>>>> From: "Subroto"<[EMAIL PROTECTED]>;
>>>>> Date: Tue, Mar 5, 2013 10:22 PM
>>>>> To: "user"<[EMAIL PROTECTED]>;
>>>>> Subject: S3N copy creating recursive folders
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am using Hadoop 1.0.3 and trying to execute:
>>>>> hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/srcData
>>>>>
>>>>> This ends up with:
>>>>> cp: java.io.IOException: mkdirs: Pathname too long. Limit 8000 characters, 1000 levels.
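
For readers following the thread, here is a rough toy model of the behaviour described above: a listing that returns the directory's own name as a child makes the destination path grow by one "srcData" per step until the limits quoted in the error message are exceeded. This is only a sketch and not Hadoop code; list_children, check_limits and simulate_copy are hypothetical names, and the exact comparison used for the limits is an assumption.

```python
# Toy model only, NOT Hadoop's implementation. All function names are hypothetical.
MAX_PATH_CHARS = 8000    # limits quoted in the mkdirs error message
MAX_PATH_LEVELS = 1000


def list_children(name):
    """Hypothetical listing that wrongly reports the directory itself as a child."""
    return [name]


def check_limits(path):
    """Raise if the path exceeds the quoted limits (the '>' comparison is an assumption)."""
    levels = len([part for part in path.split("/") if part])
    if len(path) > MAX_PATH_CHARS or levels > MAX_PATH_LEVELS:
        raise IOError("mkdirs: Pathname too long. Limit 8000 characters, 1000 levels.")


def simulate_copy(dst, name):
    """Descend into the self-listed child until a limit is hit.

    Returns the list of directories 'created' and the error message.
    """
    created = []
    path = dst
    try:
        while True:
            check_limits(path)          # a real mkdirs would fail at this point
            created.append(path)
            path = path + "/" + list_children(name)[0]
    except IOError as err:
        return created, str(err)


created, message = simulate_copy("/test/srcData", "srcData")
print(len(created), "directories created before the failure")
print("deepest path length:", len(created[-1]))
print(message)
```

With a destination of /test/srcData, each step adds eight characters and one level, so in this model the character limit and the level limit are exceeded at the same step.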