HDFS >> mail # user >> Re: S3N copy creating recursive folders


Re: S3N copy creating recursive folders
I used to have a similar problem. It looks like there is a recursive folder
creation bug. Try removing srcData from the <dst>; for
example, use the following command:

hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/

Or with distcp:

hadoop distcp s3n://acessKey:[EMAIL PROTECTED]/srcData" /test/

HTH.
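Why dropping the trailing srcData from the destination helps can be sketched in a few lines. This is a simplified model of how a `cp`-style shell resolves its destination, not the actual Hadoop code; the bucket and path names are illustrative:

```python
import posixpath

def resolve_copy_target(src, dst, dst_exists):
    # cp-style semantics (simplified sketch): if the destination directory
    # already exists, the source directory is copied *into* it, i.e. the
    # source's base name is appended to the destination path.
    if dst_exists:
        return posixpath.join(dst, posixpath.basename(src))
    return dst

# Copying to an existing /test/srcData nests another srcData level:
print(resolve_copy_target("s3n://bucket/srcData", "/test/srcData", True))
# → /test/srcData/srcData
# Copying to /test instead creates /test/srcData exactly once:
print(resolve_copy_target("s3n://bucket/srcData", "/test", True))
# → /test/srcData
```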
Shumin

On Wed, Mar 6, 2013 at 5:44 AM, Subroto <[EMAIL PROTECTED]> wrote:

> Hi Mike,
>
> I have tried distcp as well, and it ended up with an exception:
> 13/03/06 05:41:13 INFO tools.DistCp: srcPaths=[
> s3n://acessKey:[EMAIL PROTECTED]et/srcData]
> 13/03/06 05:41:13 INFO tools.DistCp: destPath=/test/srcData
> 13/03/06 05:41:18 INFO tools.DistCp: /test/srcData does not exist.
> org.apache.hadoop.tools.DistCp$DuplicationException: Invalid input, there
> are duplicated files in the sources:
> s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed,
> s3n://acessKey:[EMAIL PROTECTED]et/srcData/compressed
>  at org.apache.hadoop.tools.DistCp.checkDuplication(DistCp.java:1368)
> at org.apache.hadoop.tools.DistCp.setup(DistCp.java:1176)
>  at org.apache.hadoop.tools.DistCp.copy(DistCp.java:666)
> at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>  at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
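The check named in the stack trace above (DistCp.checkDuplication) rejects a source listing that contains the same path twice, which is what a zero-byte marker object shadowing a directory of the same name produces. A minimal sketch of that check, as an illustration rather than the real DistCp implementation:

```python
def check_duplication(sources):
    """Reject a source listing that contains the same path twice
    (a sketch of the behavior behind DistCp$DuplicationException)."""
    seen = set()
    for path in sorted(sources):
        if path in seen:
            raise ValueError(
                "Invalid input, there are duplicated files in the sources: "
                + path)
        seen.add(path)

# The zero-byte marker and the directory both surface as the same key:
try:
    check_duplication(["s3n://bucket/srcData/compressed",
                       "s3n://bucket/srcData/compressed"])
except ValueError as err:
    print(err)
```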
>
> One more interesting thing to notice: the same thing works nicely with
> Hadoop 2.0.
>
> Cheers,
> Subroto Sanyal
>
> On Mar 6, 2013, at 11:12 AM, Michel Segel wrote:
>
> Have you tried using distcp?
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Mar 5, 2013, at 8:37 AM, Subroto <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> It's not because there are too many recursive folders in the S3 bucket; in fact
> there is no recursive folder in the source.
> If I list the S3 bucket with native S3 tools, I can find a file srcData
> with size 0 inside the folder srcData.
> The copy command keeps creating folders /test/srcData/srcData/srcData
> (it keeps appending srcData).
>
> Cheers,
> Subroto Sanyal
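The situation Subroto describes, a zero-byte object named srcData sitting inside the srcData folder, is enough to send a naive recursive copy into exactly this loop. A toy model of that failure mode (the in-memory listing is hypothetical, standing in for what s3n reports, and the demo stops at a small depth where the real cp only stops at 1000 levels):

```python
# Hypothetical model of the bucket: the "directory" srcData contains a
# zero-byte marker object that is *also* named srcData.
listing = {"srcData": ["srcData"]}  # dir -> entries reported by the listing

def naive_copy(src_dir, dst_dir, depth=0, max_depth=4):
    """Copy that mistakes the zero-byte marker for a subdirectory and
    recurses into it, appending srcData to the destination each time."""
    created = [dst_dir]
    if depth >= max_depth:
        return created  # cap the demo so it terminates
    for entry in listing.get(src_dir, []):
        # the marker shares its parent's name, so we "descend" forever
        created += naive_copy(entry, dst_dir + "/" + entry,
                              depth + 1, max_depth)
    return created

for path in naive_copy("srcData", "/test/srcData"):
    print(path)
```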
>
> On Mar 5, 2013, at 3:32 PM, 卖报的小行家 wrote:
>
> Hi Subroto,
>
> I didn't use the s3n filesystem, but from the output "cp:
> java.io.IOException: mkdirs: Pathname too long.  Limit 8000 characters,
> 1000 levels.", I think the problem is with the path. Is the path
> longer than 8000 characters, or is it nested more than 1000 levels?
> You only have 998 folders; maybe the last one pushes past 8000
> characters. Why not count the last one's length?
>
> BRs//Julian
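Julian's question can be settled with quick arithmetic (counting levels as slash-separated components is an assumption about how mkdirs counts):

```python
# "/test" plus 998 nested "srcData" components sits just under both
# mkdirs limits (8000 characters, 1000 levels), which matches the
# recursive listing stopping at 998 folders.
path = "/test" + "/srcData" * 998
print(len(path), path.count("/"))        # → 7989 999

# One more level trips the 1000-level limit before the character limit:
deeper = path + "/srcData"
print(len(deeper), deeper.count("/"))    # → 7997 1000
```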
>
>
>
>
>
> ------------------ Original ------------------
> From: "Subroto" <[EMAIL PROTECTED]>
> Date: Tue, Mar 5, 2013 10:22 PM
> To: "user" <[EMAIL PROTECTED]>
> Subject: S3N copy creating recursive folders
>
> Hi,
>
> I am using Hadoop 1.0.3 and trying to execute:
> hadoop fs -cp s3n://acessKey:[EMAIL PROTECTED]/srcData"
> /test/srcData
>
> This ends up with:
> cp: java.io.IOException: mkdirs: Pathname too long.  Limit 8000
> characters, 1000 levels.
>
> When I try to list the folder /test/srcData recursively, it lists 998
> folders like:
> drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
> /test/srcData/srcData
> drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
> /test/srcData/srcData/srcData
> drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
> /test/srcData/srcData/srcData/srcData
> drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
> /test/srcData/srcData/srcData/srcData/srcData
> drwxr-xr-x   - root supergroup          0 2013-03-05 08:49
> /test/srcData/srcData/srcData/srcData/srcData/srcData
>
> Is there a problem with the s3n filesystem?
>
> Cheers,
> Subroto Sanyal
>
>
>
>