RE: hadoop fsck through proxy...
Andy Sautins 2009-09-24, 20:23
Thanks Ted. Right, if I set up my browser to use the SOCKS proxy and then access the namenode fsck URL (e.g., http://namenode/fsck) I get the same information. It still seems a little inconsistent for that to be one of the few commands that don't work through a proxy with the hadoop command line.

Thanks

Andy

-----Original Message-----
From: Ted Dunning [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 24, 2009 2:16 PM
To: [EMAIL PROTECTED]
Subject: Re: hadoop fsck through proxy...

An easy work-around is to hit the fsck URL on the namenode. You get the same output.

On Thu, Sep 24, 2009 at 12:26 PM, Andy Sautins <[EMAIL PROTECTED]> wrote:

> I looked in JIRA but didn't see this reported, so I thought I'd see what
> this list thinks. We've been using SOCKS proxying to access a Hadoop
> cluster, generally using the setup described in the Cloudera blog posting (
> http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/).
> This works great by setting hadoop.rpc.socket.factory.class.default to
> org.apache.hadoop.net.SocksSocketFactory. Generally things work well:
> hadoop dfs activity like -ls, -rmr, and -cat all works fine. The one command
> that doesn't work is fsck. Note the following command and error:
>
> hadoop fsck /
> Exception in thread "main" java.net.NoRouteToHostException: No route to
> host
>
> Looking at org.apache.hadoop.hdfs.tools.DFSck.java, the connection is
> created using URLConnection, so it makes sense that it wouldn't work, since it
> doesn't seem to use the socket factory.
>
> So to me this seems like an issue. Can someone please confirm? If it is,
> I'll add a JIRA. Happy to take a crack at making a change as well (if one
> should be made). It is unclear to me what the easiest way to change it is. I
> haven't run across code in the codebase that uses
> hadoop.rpc.socket.factory.class.default for HTTP connections.
>
> Any thoughts would be appreciated.
>
> Thanks
>
> Andy
>

--
Ted Dunning, CTO
DeepDyve
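
For anyone reading this thread later: the point above about URLConnection not going through the socket factory can be illustrated with a small standalone Java sketch. This is not the DFSck code and not a proposed patch; it only shows that the standard java.net API lets a URLConnection be opened through an explicit SOCKS proxy. The class name SocksFsckSketch, the helper names, the "path" query parameter, and the example port are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: fetch the namenode fsck page through a SOCKS proxy.
class SocksFsckSketch {

    // Describe a SOCKS proxy (for example, one opened with "ssh -D").
    static Proxy socksProxy(String host, int port) {
        return new Proxy(Proxy.Type.SOCKS, new InetSocketAddress(host, port));
    }

    // Build an fsck URL string. The "path" query parameter is an assumption
    // for illustration; the thread itself only mentions http://namenode/fsck.
    static String fsckUrl(String namenodeHttpAddress, String path) {
        return "http://" + namenodeHttpAddress + "/fsck?path="
                + URLEncoder.encode(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println(
                "usage: SocksFsckSketch <proxyHost> <proxyPort> <namenodeHost:httpPort>");
            return;
        }
        Proxy proxy = socksProxy(args[0], Integer.parseInt(args[1]));
        URL url = URI.create(fsckUrl(args[2], "/")).toURL();

        // Passing the proxy explicitly routes this one connection through SOCKS;
        // a plain url.openConnection() would try to connect directly.
        URLConnection conn = url.openConnection(proxy);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

Whether a real fix should take the proxy address from the existing socket factory settings or from separate configuration is exactly the open question raised above; the sketch takes it from the command line to stay neutral on that.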