MapReduce >> mail # user >> Can not auto-failover when unplug network interface


Can not auto-failover when unplug network interface
Hi,
   Another auto-failover testing problem:

   My HA setup fails over automatically when I kill the active NN. But when I
unplug the network interface to simulate a hardware failure, auto-failover
does not seem to happen, even after waiting for some time. The zkfc logs are
in [1].

   I'm using the default sshfence.
[1] zkfc logs
----------------------------------------------------------------------------------------
2013-12-03 10:05:56,650 INFO org.apache.hadoop.ha.NodeFencer: =====Beginning Service Fencing Process... =====
2013-12-03 10:05:56,650 INFO org.apache.hadoop.ha.NodeFencer: Trying method
1/1: org.apache.hadoop.ha.SshFenceByTcpPort(null)
2013-12-03 10:05:56,651 INFO org.apache.hadoop.ha.SshFenceByTcpPort:
Connecting to hadoop3...
2013-12-03 10:05:56,651 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
Connecting to hadoop3 port 22
2013-12-03 10:05:59,648 WARN org.apache.hadoop.ha.SshFenceByTcpPort: Unable
to connect to hadoop3 as user hadoop
com.jcraft.jsch.JSchException: java.net.NoRouteToHostException: No route to
host
    at com.jcraft.jsch.Util.createSocket(Util.java:386)
    at com.jcraft.jsch.Session.connect(Session.java:182)
    at
org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
    at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)
    at
org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:521)
    at
org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
    at
org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
    at
org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
    at
org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:900)
    at
org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:799)
    at
org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
    at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2013-12-03 10:05:59,649 WARN org.apache.hadoop.ha.NodeFencer: Fencing
method org.apache.hadoop.ha.SshFenceByTcpPort(null) was unsuccessful.
2013-12-03 10:05:59,649 ERROR org.apache.hadoop.ha.NodeFencer: Unable to
fence service by any configured method.
2013-12-03 10:05:59,650 WARN org.apache.hadoop.ha.ActiveStandbyElector:
Exception handling the winning of election
java.lang.RuntimeException: Unable to fence NameNode at hadoop3/
10.7.23.124:8020
    at
org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:522)
    at
org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
    at
org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
    at
org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
    at
org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:900)
    at
org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:799)
    at
org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
    at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2013-12-03 10:05:59,650 INFO org.apache.hadoop.ha.ActiveStandbyElector:
Trying to re-establish ZK session
2013-12-03 10:05:59,676 INFO org.apache.zookeeper.ZooKeeper: Session:
0x142931031810260 closed
2013-12-03 10:06:00,678 INFO org.apache.zookeeper.ZooKeeper: Initiating
client connection, connectString=hadoop1:2181,hadoop2:2181,hadoop3:2181
sessionTimeout=5000
watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@5ce2acea
2013-12-03 10:06:00,681 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server hadoop1/10.7.23.122:2181. Will not attempt to
authenticate using SASL (Unable to locate a login configuration)
2013-12-03 10:06:00,681 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to hadoop1/10.7.23.122:2181, initiating session
2013-12-03 10:06:00,709 INFO org.apache.zookeeper.ClientCnxn: Session
establishment complete on server hadoop1/10.7.23.122:2181, sessionid 0x142931031810261, negotiated timeout = 5000
2013-12-03 10:06:00,711 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
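
For anyone reading a long zkfc log like the one above, here is a minimal sketch that pulls out the fencing outcome. The helper name summarize_fencing and the SAMPLE text are hypothetical; the patterns are taken only from the NodeFencer messages quoted in this post and may not match other Hadoop versions.

```python
import re

# Patterns copied from the NodeFencer messages in the log above (assumption:
# other Hadoop versions may word these messages differently).
TRY_RE = re.compile(r"NodeFencer: Trying method (\d+)/(\d+): (\S+)")
FAIL_RE = re.compile(r"Fencing method (\S+) was unsuccessful")
GIVE_UP_RE = re.compile(r"Unable to fence service by any configured method")


def summarize_fencing(log_text):
    """Report which fencing methods were tried, which failed, and whether all failed."""
    # Mail clients wrap long log lines, so collapse all whitespace runs
    # (including newlines) into single spaces before matching.
    flat = " ".join(log_text.split())
    return {
        "tried": [m.group(3) for m in TRY_RE.finditer(flat)],
        "failed": [m.group(1) for m in FAIL_RE.finditer(flat)],
        "all_failed": GIVE_UP_RE.search(flat) is not None,
    }


# Hypothetical sample: a few lines excerpted from the log in this post.
SAMPLE = """\
2013-12-03 10:05:56,650 INFO org.apache.hadoop.ha.NodeFencer: Trying method
1/1: org.apache.hadoop.ha.SshFenceByTcpPort(null)
2013-12-03 10:05:59,649 WARN org.apache.hadoop.ha.NodeFencer: Fencing
method org.apache.hadoop.ha.SshFenceByTcpPort(null) was unsuccessful.
2013-12-03 10:05:59,649 ERROR org.apache.hadoop.ha.NodeFencer: Unable to
fence service by any configured method.
"""

if __name__ == "__main__":
    print(summarize_fencing(SAMPLE))
```

In the log above, the only configured method (SshFenceByTcpPort) is tried once and fails with NoRouteToHostException, so this sketch would report all_failed as true. That matches the "Unable to fence NameNode" exception that follows it.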