Keith Wiley 2013-02-19, 19:37
The webapps/hdfs bundle is present in the
$HADOOP_PREFIX/share/hadoop/hdfs/ directory of the Hadoop 2.x release
tarball. It should also be picked up on the classpath automatically.
Which "bin/hadoop-daemon.sh" script are you using: the one from the MR1
"aside" tarball or the main hadoop-2 one?
On my tarball setups, I 'start-dfs.sh' via the regular tarball, and it
Another simple check is to start it with
"$HADOOP_PREFIX/bin/hdfs namenode" to see whether it at least starts
cleanly this way and brings up the NN as a foreground process.
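
While the NN is (supposedly) up, it can also help to confirm directly whether anything is accepting connections on the configured port, independent of the hadoop client. Below is a minimal sketch in Python, which is not otherwise used in this thread. The helper names namenode_endpoint and is_listening are hypothetical, and the URI is the fs.default.name value from the quoted message below.

```python
import socket
from urllib.parse import urlparse


def namenode_endpoint(fs_default_name):
    """Split an fs.default.name URI such as hdfs://localhost:9000 into (host, port).

    Hypothetical helper for illustration; port is None if the URI omits it.
    """
    parsed = urlparse(fs_default_name)
    return parsed.hostname, parsed.port


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Assumed value, copied from the quoted configuration.
    host, port = namenode_endpoint("hdfs://localhost:9000")
    state = "accepting connections" if is_listening(host, port) else "not listening"
    print(f"{host}:{port} is {state}")
```

If this reports "not listening" while the foreground NN still appears to be running, the process is probably bound to a different address or port than the client is using. If the NN has already exited, the foreground output should show why.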
On Wed, Feb 20, 2013 at 1:07 AM, Keith Wiley <[EMAIL PROTECTED]> wrote:
> This is Hadoop 2.0, but using the separate MR1 package (hadoop-2.0.0-mr1-cdh4.1.3), not yarn. I formatted the namenode ("./bin/hadoop namenode -format") and saw no errors in the shell or in the logs/[namenode].log file (in fact, simply formatting the namenode doesn't even create the log file yet). I believe that merely formatting the namenode shouldn't leave any persistent java processes running, so I wouldn't expect "ps aux | grep java" to show anything, which of course it doesn't.
> I then started the namenode with "./bin/hadoop-daemon.sh start namenode". This produces the log file and still shows no errors. The final entry in the log is:
> 2013-02-19 19:15:19,477 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 9000
> Curiously, I still don't see any java processes running and netstat doesn't show any obvious 9000 listeners. I get this:
> $ netstat -a -t --numeric-ports -p
> (Not all processes could be identified, non-owned process info
> will not be shown, you would have to be root to see it all.)
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address         Foreign Address              State       PID/Program name
> tcp        0      0 localhost:25          *:*                          LISTEN      -
> tcp        0      0 *:22                  *:*                          LISTEN      -
> tcp        0      0 ip-13-0-177-11:60765  ec2-50-19-38-112.compute:22  ESTABLISHED 23591/ssh
> tcp        0      0 ip-13-0-177-11:22     126.96.36.199:56984          ESTABLISHED -
> tcp        0      0 ip-13-0-177-11:22     188.8.131.52:38081           ESTABLISHED -
> tcp        0      0 *:22                  *:*                          LISTEN      -
> Note that ip-13-0-177-11 is the current machine (it is also specified as the master in /etc/hosts and is indicated via localhost in fs.default.name on port 9000 (fs.default.name = "hdfs://localhost:9000")). So, at this point, I'm beginning to get confused because I don't see a java namenode process and I don't see a port 9000 listener...but still haven't seen any blatant error messages.
> Next, I try "hadoop fs -ls /". I then get the shell error I have been wrestling with recently:
> ls: Call From ip-13-0-177-11/127.0.0.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> Furthermore, this last step adds the following entry to the namenode log file:
> 2013-02-19 19:15:20,434 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received InterruptedException.
> java.lang.InterruptedException: sleep interrupted
>     at java.lang.Thread.sleep(Native Method)
>     at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3025)
>     at java.lang.Thread.run(Thread.java:679)
> 2013-02-19 19:15:20,438 WARN org.apache.hadoop.hdfs.server.blockmanagement.DecommissionManager: Monitor interrupted: java.lang.InterruptedException: sleep interrupted
> 2013-02-19 19:15:20,442 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for active state
> 2013-02-19 19:15:20,442 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state