Hadoop 2.4.1 Verifying Automatic Failover Failed: ResourceManager and JobHistoryServer do not auto-failover to Standby Node
Hi

I have set up Hadoop 2.4.1 with HDFS High Availability using the Quorum Journal Manager.

I am verifying automatic failover. I used "kill -9" to stop all running Hadoop services on the active node (NN-1). The standby node (NN-2) then became ACTIVE, which is good. However, the ResourceManager service is not running on NN-2.

Please advise how to make ResourceManager and JobHistoryServer fail over automatically. Am I missing some important setup, for example some settings in hdfs-site.xml or core-site.xml?

Please help!

Regards
Arthur

BEFORE TESTING:
NN-1:
jps
9564 NameNode
10176 JobHistoryServer
21215 Jps
17636 QuorumPeerMain
20838 NodeManager
9678 DataNode
9933 JournalNode
10085 DFSZKFailoverController
20724 ResourceManager

NN-2 (standby NameNode):
jps
14064 Jps
32046 NameNode
13765 NodeManager
32126 DataNode
32271 DFSZKFailoverController

AFTER TESTING:
NN-1:
jps
17636 QuorumPeerMain
21508 Jps

NN-2:
jps
32046 NameNode
13765 NodeManager
32126 DataNode
32271 DFSZKFailoverController
14165 Jps
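
For reference, the jps listings above can be compared mechanically to see which daemons were killed on NN-1 and did not come up on NN-2. This is only a rough sketch: the helper name `daemons` and the variable names are made up for illustration, and the input is simply the jps output pasted from this message.

```python
def daemons(jps_output):
    """Return the set of daemon names in a jps listing, ignoring Jps itself."""
    names = set()
    for line in jps_output.strip().splitlines():
        parts = line.split()
        # Each jps line is "<pid> <main class name>"; skip anything else.
        if len(parts) == 2 and parts[1] != "Jps":
            names.add(parts[1])
    return names


# jps output copied from the listings in this message.
NN1_BEFORE = """
9564 NameNode
10176 JobHistoryServer
21215 Jps
17636 QuorumPeerMain
20838 NodeManager
9678 DataNode
9933 JournalNode
10085 DFSZKFailoverController
20724 ResourceManager
"""

NN1_AFTER = """
17636 QuorumPeerMain
21508 Jps
"""

NN2_BEFORE = """
14064 Jps
32046 NameNode
13765 NodeManager
32126 DataNode
32271 DFSZKFailoverController
"""

NN2_AFTER = """
32046 NameNode
13765 NodeManager
32126 DataNode
32271 DFSZKFailoverController
14165 Jps
"""

# Daemons that ran on NN-1 before the test, are gone from NN-1 afterwards,
# and are not running on NN-2 afterwards either.
lost = daemons(NN1_BEFORE) - daemons(NN1_AFTER) - daemons(NN2_AFTER)
print(sorted(lost))

# Daemons newly started on NN-2 as a result of the test.
started_on_nn2 = daemons(NN2_AFTER) - daemons(NN2_BEFORE)
print(sorted(started_on_nn2))
```

With the listings above, the first set contains JobHistoryServer, JournalNode and ResourceManager, and the second set is empty: the set of processes on NN-2 is the same before and after the test, so no new daemon was started there.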