Re: S3N copy creating recursive folders
Subroto and Shumin,

Try adding a trailing slash to the s3n source:

- hadoop fs -cp "s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/srcData
+ hadoop fs -cp "s3n://acessKey:[EMAIL PROTECTED]/srcData/" /test/srcData

Without the trailing slash, "srcData" matches itself each time the
directory is listed, leading to the infinite recursion you experienced.
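
If you want to confirm what the copy keeps matching, a listing along
these lines should show the stray object (the access key, secret and
bucket name here are placeholders, not your real values):

hadoop fs -ls "s3n://ACCESS_KEY:SECRET_KEY@BUCKET/srcData/"

If an entry named srcData with size 0 shows up next to your real files,
that zero-length marker is what each pass of the copy picks up again.
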
George
> I used to have a similar problem. It looks like there is a recursive
> folder creation bug. How about removing srcData from the <dst>? For
> example, use the following command:
>
> hadoop fs -cp "s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>
> Or with distcp:
>
> hadoop distcp "s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/
>
> HTH.
> Shumin
>
> On Wed, Mar 6, 2013 at 5:44 AM, Subroto <[EMAIL PROTECTED]> wrote:
>
>     Hi Mike,
>
>     I have tried distcp as well, and it ended up with an exception:
>     13/03/06 05:41:13 INFO tools.DistCp:
>     srcPaths=[s3n://acessKey:[EMAIL PROTECTED]et/srcData]
>     13/03/06 05:41:13 INFO tools.DistCp: destPath=/test/srcData
>     13/03/06 05:41:18 INFO tools.DistCp: /test/srcData does not exist.
>     org.apache.hadoop.tools.DistCp$DuplicationException: Invalid
>     input, there are duplicated files in the sources:
>     s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed,
>     s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed
>     at org.apache.hadoop.tools.DistCp.checkDuplication(DistCp.java:1368)
>     at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1176)
>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
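>
>     (A plausible reading of this DuplicationException, given the
>     zero-length srcData object described below: the s3n listing
>     returns the same key twice, once through the stray marker and once
>     through the directory itself, so DistCp sees each source file
>     duplicated. A recursive listing along these lines, with
>     placeholder credentials, would show the repeated entries:
>
>     hadoop fs -lsr "s3n://ACCESS_KEY:SECRET_KEY@BUCKET/srcData"
>
>     Each path such as srcData/compressed would then appear twice,
>     which is exactly what the error above reports.)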
>
>     One more interesting thing to notice: the same copy works
>     nicely with Hadoop 2.0.
>
>     Cheers,
>     Subroto Sanyal
>
>     On Mar 6, 2013, at 11:12 AM, Michel Segel wrote:
>
>>     Have you tried using distcp?
>>
>>     Sent from a remote device. Please excuse any typos...
>>
>>     Mike Segel
>>
>>     On Mar 5, 2013, at 8:37 AM, Subroto <[EMAIL PROTECTED]> wrote:
>>
>>>     Hi,
>>>
>>>     It's not because there are too many recursive folders in the S3
>>>     bucket; in fact, there is no recursive folder in the source.
>>>     If I list the S3 bucket with native S3 tools, I can find a file
>>>     srcData with size 0 inside the folder srcData.
>>>     The copy command keeps creating the folder
>>>     /test/srcData/srcData/srcData (it keeps appending srcData).
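>>>
>>>     (If that zero-length object is the culprit, one possible
>>>     workaround, assuming its key really is srcData/srcData and using
>>>     placeholder credentials, is to delete it before copying:
>>>
>>>     hadoop fs -rm "s3n://ACCESS_KEY:SECRET_KEY@BUCKET/srcData/srcData"
>>>
>>>     This is untested against s3n's directory handling; a native S3
>>>     tool may be the safer way to remove the marker.)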
>>>
>>>     Cheers,
>>>     Subroto Sanyal
>>>
>>>     On Mar 5, 2013, at 3:32 PM, 卖报的小行家 wrote:
>>>
>>>>     Hi Subroto,
>>>>
>>>>     I haven't used the s3n filesystem, but from the output "cp:
>>>>     java.io.IOException: mkdirs: Pathname too long. Limit 8000
>>>>     characters, 1000 levels.", I think the problem is with the path.
>>>>     Is it longer than 8000 characters, or nested more than 1000
>>>>     levels deep?
>>>>     You only have 998 folders. Maybe the last one is more than 8000
>>>>     characters. Why not count the last one's length?
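>>>>
>>>>     (A quick way to check both numbers on the destination side; the
>>>>     sketch below assumes the Hadoop 1.x recursive listing, whose
>>>>     last whitespace-separated column is the path:
>>>>
>>>>     hadoop fs -lsr /test | awk '{ n = length($NF); if (n > len) len = n; d = gsub("/", "/", $NF); if (d > lvl) lvl = d } END { print len " characters, " lvl " levels" }'
>>>>
>>>>     It prints the longest pathname length and the deepest nesting it
>>>>     finds, which you can compare against the 8000-character and
>>>>     1000-level limits in the error.)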
>>>>
>>>>     BRs//Julian
>>>>
>>>>     ------------------ Original ------------------
>>>>     From: "Subroto" <[EMAIL PROTECTED]>
>>>>     Date: Tue, Mar 5, 2013 10:22 PM
>>>>     To: "user" <[EMAIL PROTECTED]>
>>>>     Subject: S3N copy creating recursive folders
>>>>
>>>>     Hi,
>>>>
>>>>     I am using Hadoop 1.0.3 and trying to execute:
>>>>     hadoop fs -cp "s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/srcData
>>>>
>>>>     This ends up with:
>>>>     cp: java.io.IOException: mkdirs: Pathname too long. Limit 8000
>>>>     characters, 1000 levels.