distcp / mv is not working on ftp
Fabian Zimmermann 2013-09-27, 13:29

I'm trying to back up some files to our FTP server.

hadoop distcp hdfs:///data/ ftp://user:pass@server/data/

This returns after a few minutes with:

Task TASKID="task_201308231529_97700_m_000002" TASK_TYPE="MAP" TASK_STATUS="FAILED" FINISH_TIME="1380217916479" ERROR="java\.io\.IOException: Cannot rename parent(source): ftp://x:x@backup2/data/, parent(destination):  ftp://x:x@backup2/data/
at org\.apache\.hadoop\.fs\.ftp\.FTPFileSystem\.rename(FTPFileSystem\.java:557)
at org\.apache\.hadoop\.fs\.ftp\.FTPFileSystem\.rename(FTPFileSystem\.java:522)
at org\.apache\.hadoop\.mapred\.FileOutputCommitter\.moveTaskOutputs(FileOutputCommitter\.java:154)
at org\.apache\.hadoop\.mapred\.FileOutputCommitter\.moveTaskOutputs(FileOutputCommitter\.java:172)
at org\.apache\.hadoop\.mapred\.FileOutputCommitter\.commitTask(FileOutputCommitter\.java:132)
at org\.apache\.hadoop\.mapred\.OutputCommitter\.commitTask(OutputCommitter\.java:221)
at org\.apache\.hadoop\.mapred\.Task\.commit(Task\.java:1000)
at org\.apache\.hadoop\.mapred\.Task\.done(Task\.java:870)
at org\.apache\.hadoop\.mapred\.MapTask\.run(MapTask\.java:329)
at org\.apache\.hadoop\.mapred\.Child$4\.run" TASK_ATTEMPT_ID="" .

I googled a bit and added

fs.ftp.host = backup2
fs.ftp.user.backup2 = user
fs.ftp.password.backup2 = password

to core-site.xml, then I was able to execute:

hadoop fs -ls ftp:///data/
hadoop fs -rm ftp:///data/test.file

But as soon as I try to move a file, it fails:

hadoop fs -mv file:///data/test.file ftp:///data/test2.file
mv: `ftp:///data/test.file': Input/output error

I enabled debug logging on our FTP server and got:

Sep 27 15:24:33 backup2 ftpd[38241]: command: LIST /data
Sep 27 15:24:33 backup2 ftpd[38241]: <--- 150
Sep 27 15:24:33 backup2 ftpd[38241]: Opening BINARY mode data connection for '/bin/ls'.
Sep 27 15:24:33 backup2 ftpd[38241]: <--- 226
Sep 27 15:24:33 backup2 ftpd[38241]: Transfer complete.
Sep 27 15:24:33 backup2 ftpd[38241]: command: CWD ftp:/data
Sep 27 15:24:33 backup2 ftpd[38241]: <--- 550
Sep 27 15:24:33 backup2 ftpd[38241]: ftp:/data: No such file or directory.
Sep 27 15:24:33 backup2 ftpd[38241]: command: RNFR test.file
Sep 27 15:24:33 backup2 ftpd[38241]: <--- 550

It looks like the generation of the "CWD" command is buggy: Hadoop tries to change into "ftp:/data", but it should use "/data".
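
To illustrate the suspicion, here is a minimal sketch using only java.net.URI, not Hadoop itself. It assumes (this is not confirmed from the Hadoop source) that the CWD argument is built from the full URI string of the parent directory rather than from its path component. The class FtpCwdSketch and its methods are hypothetical names made up for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FtpCwdSketch {

    // Builds a URI from a scheme and a path with no authority (no host),
    // which is the situation for "ftp:///data", and returns its string form.
    static String uriString(String scheme, String path) {
        try {
            return new URI(scheme, null, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns only the path component, which is what an FTP server
    // can actually resolve in a CWD command.
    static String serverPath(String location) {
        return URI.create(location).getPath();
    }

    public static void main(String[] args) {
        // Full URI string without an authority: matches the CWD argument in the log.
        System.out.println(uriString("ftp", "/data"));           // ftp:/data
        // Path component only: what the server expects.
        System.out.println(serverPath("ftp:/data"));             // /data
        System.out.println(serverPath("ftp:///data/test.file")); // /data/test.file
    }
}
```

If the assumption holds, passing the path component instead of the full URI string to CWD would produce the "/data" the server expects.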

Any ideas how to fix?

Thanks a lot,

Fabian Zimmermann
IT Engineer, Systemadministrator
