HDFS >> mail # dev >> Why failed to use Distcp over FTP protocol?


Re: Why failed to use Distcp over FTP protocol?
Now I can successfully run "hadoop distcp
ftp://ftpuser:ftpuser@hostname/tmp/test1.txt
hdfs:///tmp/test1.txt".

But "hadoop distcp hdfs:///tmp/test1.txt
ftp://ftpuser:ftpuser@hostname/tmp/test1.txt.v1" fails with an error like:
attempt_201304222240_0005_m_000000_1: log4j:ERROR Could not connect to remote log4j server at [localhost]. We will try again later.
13/04/23 18:59:05 INFO mapred.JobClient: Task Id : attempt_201304222240_0005_m_000000_2, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
        at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(AccessController.java:310)
        at javax.security.auth.Subject.doAs(Subject.java:573)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
2013/4/24 sam liu <[EMAIL PROTECTED]>

> I can successfully execute "hadoop fs -ls ftp://hadoopadm:xxxxxxxx@ftphostname";
> it returns the root path of the Linux system.
>
> But "hadoop fs -rm
> ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here" fails, and it returns:
> rm: Delete failed ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here
>
>
> 2013/4/24 Daryn Sharp <[EMAIL PROTECTED]>
>
>>  The ftp fs is listing the contents of the given path's parent directory,
>> and then trying to match the basename of each child path returned against
>> the basename of the given path – quite inefficient…  The FNF
>> (FileNotFoundException) means it didn't find a match for the basename.
>> It may be that the ftp server isn't
>> returning a listing in exactly the expected format so it's being parsed
>> incorrectly.
>>
>>  Does "hadoop fs -ls ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here"
>> work?  Or "hadoop fs -rm
>> ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here"?  Those cmds should
>> exercise the same code paths where you are experiencing errors.
>>
>>  Daryn
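
To make the lookup described above concrete, here is a minimal Java sketch of the "list the parent, then match the basename" approach. It is not the actual Hadoop FTPFileSystem code: the class FtpLookupSketch, its helper methods, and the listDir function standing in for the FTP LIST call are all hypothetical names invented for illustration, and the "mis-parsed" listing entry is made up. It only shows why a listing parsed in an unexpected format would surface as a FileNotFoundException even though the file exists on the server.

```java
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Illustrative sketch only; NOT the real Hadoop FTPFileSystem implementation.
class FtpLookupSketch {

    // Parent of an absolute path, e.g. "/tmp/test1.txt" -> "/tmp".
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }

    // Last path component, e.g. "/tmp/test1.txt" -> "test1.txt".
    static String baseNameOf(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Lists the parent directory (via the hypothetical listDir stand-in)
    // and compares each child's basename against the requested basename.
    static String findEntry(String path, Function<String, List<String>> listDir)
            throws FileNotFoundException {
        String wanted = baseNameOf(path);
        for (String child : listDir.apply(parentOf(path))) {
            if (baseNameOf(child).equals(wanted)) {
                return child;
            }
        }
        // No child matched, so the caller sees a "file not found" error.
        throw new FileNotFoundException("File " + path + " does not exist.");
    }

    public static void main(String[] args) {
        // A listing parsed as expected: the basename matches.
        Function<String, List<String>> wellParsed =
                dir -> Arrays.asList("/tmp/other.log", "/tmp/test1.txt");
        // A made-up example of a mis-parsed listing: an extra token ends up
        // in the name, so the basename comparison can never succeed.
        Function<String, List<String>> misParsed =
                dir -> Arrays.asList("/tmp/4096 test1.txt");

        try {
            System.out.println("found: " + findEntry("/tmp/test1.txt", wellParsed));
            System.out.println("found: " + findEntry("/tmp/test1.txt", misParsed));
        } catch (FileNotFoundException e) {
            System.out.println("FNF: " + e.getMessage());
        }
    }
}
```

In this sketch the first lookup succeeds and the second one ends in the FileNotFoundException branch, which is the symptom one would expect if the server's listing format differs from what the parser assumes.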
>>
>>  On Apr 22, 2013, at 9:06 PM, sam liu wrote:
>>
>>  I encountered IOException and FileNotFoundException:
>>
>> 13/04/17 17:11:10 INFO mapred.JobClient: Task Id : attempt_201304160910_2135_m_000000_0, Status : FAILED
>> java.io.IOException: The temporary job-output directory
>> ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/_distcp_logs_i74spu/_temporary doesn't exist!
>>     at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
>>     at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
>>     at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
>>     at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:820)
>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>     at java.security.AccessController.doPrivileged(AccessController.java:310)
>>     at javax.security.auth.Subject.doAs(Subject.java:573)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1144)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>
>>
>> ... ...
>>
>> 13/04/17 17:11:42 INFO mapred.JobClient: Job complete: job_201304160910_2135
>> 13/04/17 17:11:42 INFO mapred.JobClient: Counters: 6
>> 13/04/17 17:11:42 INFO mapred.JobClient:   Job Counters
>> 13/04/17 17:11:42 INFO mapred.JobClient:     Failed map tasks=1
>> 13/04/17 17:11:42 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33785