MapReduce, mail # user - distcp in Hadoop 2.0.4 over http?


Pedro Sá da Costa 2013-06-01, 16:10
I want to copy HDFS files over HTTP using distcp, but I can't. It seems to
be a configuration problem that I can't pin down. How can I run distcp in
Hadoop 2.0.4 over HTTP?

First I set up Hadoop 2.0.4 with HttpFs on port 3888, and it is running.
Here is the proof:

$ curl -i http://zk1.host.com:3888?user.name=babu&op=homedir
[1] 32129
[myuser@zk1 hadoop]$ HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Accept-Ranges: bytes
ETag: W/"674-1365802990000"
Last-Modified: Fri, 12 Apr 2013 21:43:10 GMT
Content-Type: text/html
Content-Length: 674
Date: Sat, 01 Jun 2013 15:48:04 GMT

<?xml version="1.0" encoding="UTF-8"?>
<html>
<body>
<b>HttpFs service</b>, service base URL at /webhdfs/v1.
</body>
</html>
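
(An aside on the curl transcript above: the `[1] 32129` job number suggests
the URL was not quoted, so the shell treated `&` as the background operator
and curl only saw the part before it. Quoting the URL keeps the whole query
string together -- a sketch; the commented-out curl call needs the live
HttpFs server, and a real home-directory call would normally go through the
/webhdfs/v1 base URL that the landing page advertises:)

```shell
# Quote the URL so the shell does not split it at '&':
url='http://zk1.host.com:3888?user.name=babu&op=homedir'
# curl -i "$url"                  # needs the live HttpFs server
# A WebHDFS-style home-directory request would look more like:
# curl -i "http://zk1.host.com:3888/webhdfs/v1?user.name=babu&op=GETHOMEDIRECTORY"
echo "$url"
```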
But when I run distcp, the copy fails:
$ hadoop distcp  http://zk1.host:3888/gutenberg/a.txt http://zk1.host:3888/
Warning: $HADOOP_HOME is deprecated.
Copy failed: java.io.IOException: No FileSystem for scheme: http
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1434)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1455)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

$ hadoop distcp httpfs://zk1.host:3888/gutenberg/a.txt httpfs://zk1.host:3888/
Copy failed: java.io.IOException: No FileSystem for scheme: httpfs
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1434)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1455)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

$ hadoop distcp hdfs://zk1.host:3888/gutenberg/a.txt hdfs://zk1.host:3888/
Copy failed: java.io.IOException: Call to zk1.host/127.0.0.1:3888 failed
on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1144)
    at org.apache.hadoop.ipc.Client.call(Client.java:1112)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
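
(The EOFException in this last attempt is consistent with pointing the
hdfs:// RPC client at the HttpFs HTTP port: the TCP connection to 3888
succeeds, but the client gets an HTTP response where it expects Hadoop RPC
framing. Going by fs.default.name in core-site.xml below, a plain
hdfs-to-hdfs copy would target port 9000 instead -- a sketch; the target
directory is an assumption:)

```shell
# hdfs:// uses the NameNode RPC port (9000 per core-site.xml),
# not the HttpFs HTTP port 3888:
src="hdfs://zk1.host:9000/gutenberg/a.txt"
echo hadoop distcp "$src" "hdfs://zk1.host:9000/tmp/"
```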

Here are my core-site.xml and httpfs-env.sh, where I configured HDFS and
HttpFs:
$ cat etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://zk1.host:9000</value>
  </property>
  <property>
    <name>hadoop.proxyuser.myuser.hosts</name>
    <value>zk1.host</value>
  </property>
  <property>
    <name>hadoop.proxyuser.myuser.groups</name>
    <value>*</value>
  </property>
</configuration>
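
(One side note: in Hadoop 2.x, fs.default.name is deprecated in favor of
fs.defaultFS -- an equivalent fragment with the same value under the new
key:)

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://zk1.host:9000</value>
</property>
```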

$ cat etc/hadoop/httpfs-env.sh
#!/bin/bash
export HTTPFS_HTTP_PORT=3888
export HTTPFS_HTTP_HOSTNAME=$(hostname -f)
--
Best regards,