RE: hadoop fsck through proxy...
Andy Sautins 2009-09-24, 20:23
Thanks Ted. Right, if I set up my browser to use the SOCKS proxy and then access the namenode fsck URL ( e.g., http://namenode/fsck ) I get the same information.
Still seems a little inconsistent for that to be one of the few commands that don't work through a proxy with the hadoop command line.
From: Ted Dunning [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 24, 2009 2:16 PM
To: [EMAIL PROTECTED]
Subject: Re: hadoop fsck through proxy...
An easy work-around is to hit the fsck url on the namenode. You get the
On Thu, Sep 24, 2009 at 12:26 PM, Andy Sautins
> I looked in JIRA but didn't see this reported so I thought I'd see what
> this list thinks. We've been using SOCKS proxying to access a Hadoop
> cluster, generally using the setup described in the Cloudera blog posting (
> http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/). This works great by setting hadoop.rpc.socket.factory.class.default to
> org.apache.hadoop.net.SocksSocketFactory. Generally things work well:
> hadoop dfs activity like -ls, -rmr and -cat all works fine. The one command
> that doesn't work is fsck. Note the following command and error:
> hadoop fsck /
> Exception in thread "main" java.net.NoRouteToHostException: No route to
> So looking at org.apache.hadoop.hdfs.tools.DFSck.java the connection is
> created using URLConnection, so it makes sense why it wouldn't work since it
> doesn't seem to use the socket factory.
> So to me this seems like an issue. Can someone please confirm? If it is
> I'll add a JIRA. Happy to take a crack at making a change as well ( if one
> should be made ). It is unclear to me what the easiest change would be. I
> haven't run across code in the codebase that uses
> hadoop.rpc.socket.factory.class.default for HTTP connections.
> Any thoughts would be appreciated.
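To illustrate the URLConnection point in the quoted message, here is a minimal sketch of how an HTTP request to the fsck URL could be sent through a SOCKS proxy using only the JDK. It is not the DFSck code and not a proposed patch. The class name SocksFsckSketch, the helper names, the proxy address localhost:1080, the HTTP port 50070 and the "path" query parameter are all assumptions made for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.MalformedURLException;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: names, ports and the query parameter are assumptions.
class SocksFsckSketch {

    // Describe a SOCKS proxy. The address is left unresolved so that building
    // the Proxy object does not trigger a DNS lookup.
    static Proxy socksProxy(String host, int port) {
        return new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved(host, port));
    }

    // Build the fsck URL for a namenode. The "path" parameter is assumed here.
    static URL fsckUrl(String nameNodeHost, int httpPort, String path) {
        String query = "path=" + URLEncoder.encode(path, StandardCharsets.UTF_8);
        try {
            return URI.create("http://" + nameNodeHost + ":" + httpPort + "/fsck?" + query).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Passing the Proxy explicitly routes this one connection through SOCKS.
    // openConnection only creates the connection object; no traffic is sent
    // until the caller connects or reads from it.
    static URLConnection openFsck(URL url, Proxy proxy) throws IOException {
        return url.openConnection(proxy);
    }

    public static void main(String[] args) throws IOException {
        Proxy proxy = socksProxy("localhost", 1080);   // assumed SSH -D style tunnel
        URL url = fsckUrl("namenode", 50070, "/");     // assumed namenode HTTP address
        URLConnection conn = openFsck(url, proxy);
        // Nothing is fetched here; reading conn.getInputStream() would do that.
        System.out.println("Would fetch " + conn.getURL() + " via " + proxy);
    }
}
```

A plain url.openConnection() call, with no Proxy argument, ignores any socket factory configured for RPC, which is consistent with the behaviour described above. The JDK also honours the socksProxyHost and socksProxyPort system properties for connections opened without an explicit Proxy.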
Ted Dunning, CTO