MapReduce >> mail # user >> distcp in Hadoop 2.0.4 over http?


Pedro Sá da Costa 2013-06-01, 16:10
Re: distcp in Hadoop 2.0.4 over http?
Both WebHDFS and HttpFS use/provide the same API for compatibility.
Use the webhdfs:// scheme, as that's the accepted REST-based HDFS
standard, rather than using "httpfs://".
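For example, the copy attempted in the quoted message below could be retried with the webhdfs:// scheme. This is only a sketch: the host and HttpFS port are taken from the thread, the destination path is hypothetical, and the actual distcp needs a running cluster, so it is left commented out.

```shell
# Hypothetical URIs built from the thread's host and HttpFS port (3888);
# adjust to your own cluster. The real copy requires live Hadoop services:
SRC="webhdfs://zk1.host:3888/gutenberg/a.txt"
DST="webhdfs://zk1.host:3888/gutenberg-copy/"
#   hadoop distcp "$SRC" "$DST"
echo "$SRC"
```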

On Sat, Jun 1, 2013 at 9:40 PM, Pedro Sá da Costa <[EMAIL PROTECTED]> wrote:
> I want to copy HDFS files over HTTP using distcp, but I can't. It seems to
> be a configuration problem that I can't find. How can I do distcp in Hadoop
> 2.0.4 over HTTP?
>
> First I set up Hadoop 2.0.4 over HTTP - HttpFS - on port 3888, which is
> running. Here is the proof:
>
> $ curl -i http://zk1.host.com:3888?user.name=babu&op=homedir
> [1] 32129
> [myuser@zk1 hadoop]$ HTTP/1.1 200 OK
> Server: Apache-Coyote/1.1
> Accept-Ranges: bytes
> ETag: W/"674-1365802990000"
> Last-Modified: Fri, 12 Apr 2013 21:43:10 GMT
> Content-Type: text/html
> Content-Length: 674
> Date: Sat, 01 Jun 2013 15:48:04 GMT
>
> <?xml version="1.0" encoding="UTF-8"?>
> <html>
> <body>
> <b>HttpFs service</b>, service base URL at /webhdfs/v1.
> </body>
> </html>
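One side note on the curl "proof" above: the unquoted `&` in the URL put curl in the background (hence the `[1] 32129` job line), so the `op` parameter never reached the server and the 200 response is just HttpFS's landing page. A sketch of a quoted request follows; it assumes the thread's host and port, and uses `GETHOMEDIRECTORY`, which is the WebHDFS name for the home-directory operation.

```shell
# Quoting keeps the full query string together instead of letting the shell
# split at '&'. The curl itself needs the live HttpFS server, so it is
# left commented out:
URL='http://zk1.host:3888/webhdfs/v1/?user.name=babu&op=GETHOMEDIRECTORY'
#   curl -i "$URL"
echo "$URL"
```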
>
>
> But, when I do distcp, I can't copy:
> $ hadoop distcp  http://zk1.host:3888/gutenberg/a.txt http://zk1.host:3888/
> Warning: $HADOOP_HOME is deprecated.
> Copy failed: java.io.IOException: No FileSystem for scheme: http
>     at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1434)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1455)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
>     at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
> $ hadoop distcp  httpfs://zk1.host:3888/gutenberg/a.txt
> httpfs://zk1.host:3888/
> Copy failed: java.io.IOException: No FileSystem for scheme: httpfs
>     at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1434)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1455)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
>     at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
> $ hadoop distcp  hdfs://zk1.host:3888/gutenberg/a.txt hdfs://zk1.host:3888/
> Copy failed: java.io.IOException: Call to zk1.host/127.0.0.1:3888 failed on
> local exception: java.io.EOFException
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1144)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1112)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
>     at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
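A likely reading of the EOFException above (an assumption, not something confirmed in the thread): `hdfs://` speaks the NameNode's binary RPC protocol, but port 3888 here is serving HTTP for HttpFS, so the client reads an unexpected stream and fails. The RPC endpoint configured in the core-site.xml below is port 9000, so an RPC-based copy would look like this sketch (destination path hypothetical):

```shell
# Hypothetical invocation using the RPC port from fs.default.name
# (hdfs://zk1.host:9000), not the HttpFS HTTP port; commented out since
# it needs the live cluster:
#   hadoop distcp hdfs://zk1.host:9000/gutenberg/a.txt hdfs://zk1.host:9000/dst/
RPC_URI="hdfs://zk1.host:9000/gutenberg/a.txt"
echo "$RPC_URI"
```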
>
> Here are my core-site.xml and httpfs-env.sh files, where I configured HDFS
> and HttpFS:
> $ cat etc/hadoop/core-site.xml
> <configuration>
>   <property> <name>fs.default.name</name>
> <value>hdfs://zk1.host:9000</value> </property>
>   <property> <name>hadoop.proxyuser.myuser.hosts</name>
> <value>zk1.host</value> </property>
>   <property> <name>hadoop.proxyuser.myuser.groups</name> <value>*</value>

Harsh J