Hadoop, mail # dev - HDFS to S3 copy issues


Momina Khan 2012-07-06, 05:29
RE: HDFS to S3 copy issues
Ivan Mitic 2012-07-06, 06:01
Hi Momina,

Could it be that you mistyped the port in your source path? Your fs -cat command used port 9000, but the distcp command uses 9001. Would you mind trying with: hdfs://10.240.113.162:9000/data/
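One quick way to confirm which port is actually accepting connections is to probe both from the node that runs distcp. The snippet below is only a minimal sketch: the helper name port_is_open and the 3-second timeout are illustrative choices, not part of Hadoop, and it tests plain TCP reachability only, not whether a NameNode is answering on that port.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise.

    Hypothetical helper for illustration; it only checks that something
    accepts TCP connections on the port, not that it speaks Hadoop RPC.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

Calling port_is_open("10.240.113.162", 9000) and port_is_open("10.240.113.162", 9001) should show whether anything listens on each port. If 9001 comes back False while 9000 is True, the "Connection refused" error is most likely just the wrong port in the source URI.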

Ivan

-----Original Message-----
From: Momina Khan [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 05, 2012 10:30 PM
To: [EMAIL PROTECTED]
Subject: HDFS to S3 copy issues

Hi, I hope someone is able to help me out with this. I have searched Google and the AWS forums exhaustively, but there is little help on this topic and nothing I found worked for me!

I want to copy data from HDFS to my S3 bucket. To test whether my HDFS URL is correct, I tried the fs -cat command, which works just fine and prints the contents of the file:

ubuntu@domU-12-31-39-04-6E-58:/state/partition1/hadoop-1.0.1$ bin/hadoop fs -cat hdfs://10.240.113.162:9000/data/hello.txt

But when I try to run a distributed copy (distcp) of the file from HDFS (same location as above) to my S3 bucket, it says the connection to the server was refused! I have searched Google exhaustively but cannot find an answer. Some posts say the port may be blocked, but I have checked that ports 9000-9001 are not blocked. Could it be an authentication issue? Just guessing; I am out of ideas.

Find the call trace attached below:

ubuntu@domU-12-31-39-04-6E-58:/state/partition1/hadoop-1.0.1$ bin/hadoop distcp hdfs://10.240.113.162:9001/data/ s3://ID:SECRET@momina

12/07/05 12:48:37 INFO tools.DistCp: srcPaths=[hdfs://10.240.113.162:9001/data]
12/07/05 12:48:37 INFO tools.DistCp: destPath=s3://ID:SECRET@momina

12/07/05 12:48:38 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 0 time(s).
12/07/05 12:48:39 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 1 time(s).
12/07/05 12:48:40 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 2 time(s).
12/07/05 12:48:41 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 3 time(s).
12/07/05 12:48:42 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 4 time(s).
12/07/05 12:48:43 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 5 time(s).
12/07/05 12:48:44 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 6 time(s).
12/07/05 12:48:45 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 7 time(s).
12/07/05 12:48:46 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 8 time(s).
12/07/05 12:48:47 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 9 time(s).
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.net.ConnectException: Call to domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
    at org.apache.hadoop.ipc.Client.call(Client.java:1071)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
    at org.apache.hadoop.ipc.Client.call(Client.java:1046)
    ... 19 more

Thank you!
momina
Momina Khan 2012-07-06, 06:49
feng lu 2012-07-06, 07:22
Momina Khan 2012-07-06, 12:50
Nitin Pawar 2012-07-06, 07:12