Hadoop, mail # user - Hadoop 2.4.1 Verifying Automatic Failover Failed: ResourceManager and JobHistoryServer do not auto-failover to Standby Node - 2014-08-05, 11:15
Solr & Elasticsearch trainings in New York & San Fransisco [more info]
 Search Hadoop and all its subprojects:

Switch to Threaded View
Copy link to this message
-
Hadoop 2.4.1 Verifying Automatic Failover Failed: ResourceManager and JobHistoryServer do not auto-failover to Standby Node
Hi

I have set up the Hadoop 2.4.1 with HDFS High Availability using the Quorum Journal Manager.

I am verifying Automatic Failover: I manually used “kill -9” command to disable all running Hadoop services in active node (NN-1), I can find that the Standby node (NN-2) now becomes ACTIVE now which is good, however, the “ResourceManager” service cannot be found in NN-2, please advise how to make ResourceManager and JobHistoryServer auto-failover? or do I miss some important setup? missing some settings in hdfs-site.xml or core-site.xml?

Please help!

Regards
Arthur
BEFORE TESTING:
NN-1:
jps
9564 NameNode
10176 JobHistoryServer
21215 Jps
17636 QuorumPeerMain
20838 NodeManager
9678 DataNode
9933 JournalNode
10085 DFSZKFailoverController
20724 ResourceManager

NN-2 (Standby Name node)
jps
14064 Jps
32046 NameNode
13765 NodeManager
32126 DataNode
32271 DFSZKFailoverController

AFTER
NN-1
dips
17636 QuorumPeerMain
21508 Jps

NN-2
jps
32046 NameNode
13765 NodeManager
32126 DataNode
32271 DFSZKFailoverController
14165 Jps
 
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB