MapReduce >> mail # user >> distcp in Hadoop 2.0.4 over http?


Re: distcp in Hadoop 2.0.4 over http?
Both WebHDFS and HttpFS use/provide the same API for compatibility.
Use webhdfs://, as that's the accepted REST-based HDFS standard, rather
than using "httpfs://".
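In practice that means pointing distcp at the HttpFS port with the webhdfs:// scheme, roughly like the sketch below (hostnames are placeholders; port 3888 is the poster's HttpFS port, while a stock HttpFS install defaults to 14000):

```shell
# Sketch only: copy a file between two HttpFS endpoints with distcp.
# Hostnames are placeholders; port 3888 is the poster's HttpFS port.
SRC="webhdfs://zk1.host:3888/gutenberg/a.txt"
DST="webhdfs://zk2.host:3888/gutenberg/"

# The actual run needs a working Hadoop client on the PATH:
#   hadoop distcp "$SRC" "$DST"
echo hadoop distcp "$SRC" "$DST"
```

Note also that the curl check in the quoted mail needs the URL quoted: the unquoted & backgrounded the command, which is why the "[1] 32129" job line appears before the HTTP response.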

On Sat, Jun 1, 2013 at 9:40 PM, Pedro Sá da Costa <[EMAIL PROTECTED]> wrote:
> I want to copy HDFS files over HTTP using distcp, but I can't. It is a
> configuration problem that I can't pin down. How can I do distcp in Hadoop
> 2.0.4 over HTTP?
>
> First I set up Hadoop 2.0.4 over HTTP (HttpFS) on port 3888, and it is
> running. Here is the proof:
>
> $ curl -i http://zk1.host.com:3888?user.name=babu&op=homedir
> [1] 32129
> [myuser@zk1 hadoop]$ HTTP/1.1 200 OK
> Server: Apache-Coyote/1.1
> Accept-Ranges: bytes
> ETag: W/"674-1365802990000"
> Last-Modified: Fri, 12 Apr 2013 21:43:10 GMT
> Content-Type: text/html
> Content-Length: 674
> Date: Sat, 01 Jun 2013 15:48:04 GMT
>
> <?xml version="1.0" encoding="UTF-8"?>
> <html>
> <body>
> <b>HttpFs service</b>, service base URL at /webhdfs/v1.
> </body>
> </html>
>
>
> But, when I do distcp, I can't copy:
> $ hadoop distcp  http://zk1.host:3888/gutenberg/a.txt http://zk1.host:3888/
> Warning: $HADOOP_HOME is deprecated.
> Copy failed: java.io.IOException: No FileSystem for scheme: http
>     at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1434)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1455)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
>     at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
> $ hadoop distcp  httpfs://zk1.host:3888/gutenberg/a.txt
> httpfs://zk1.host:3888/
> Copy failed: java.io.IOException: No FileSystem for scheme: httpfs
>     at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1434)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1455)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
>     at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>
> $ hadoop distcp  hdfs://zk1.host:3888/gutenberg/a.txt hdfs://zk1.host:3888/
> Copy failed: java.io.IOException: Call to zk1.host/127.0.0.1:3888 failed on
> local exception: java.io.EOFException
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1144)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1112)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
>     at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>
> Here are my core-site.xml and httpfs-env.sh files, where I configured HDFS
> and HttpFS:
> $ cat etc/hadoop/core-site.xml
> <configuration>
>   <property> <name>fs.default.name</name>
> <value>hdfs://zk1.host:9000</value> </property>
>   <property> <name>hadoop.proxyuser.myuser.hosts</name> <value>zk1.host</value>
> </property>
>   <property> <name>hadoop.proxyuser.myuser.groups</name> <value>*</value>
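The quoted config is cut off, but for reference a well-formed proxyuser block would look roughly like this sketch ("myuser" and "zk1.host" are the poster's values; the closing tags are assumed, since the original mail is truncated):

```xml
<!-- Sketch of the proxyuser settings HttpFS needs in the NameNode's
     core-site.xml. Values are the poster's; closing tags are assumed. -->
<property>
  <name>hadoop.proxyuser.myuser.hosts</name>
  <value>zk1.host</value>
</property>
<property>
  <name>hadoop.proxyuser.myuser.groups</name>
  <value>*</value>
</property>
```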

Harsh J