Re: Can not auto-failover when unplug network interface
This is still because your fence method is configured improperly.
Please paste your fence configuration, and double-check that you can ssh
from the active NN to the standby NN without a password.
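
For comparison, a typical sshfence setup in hdfs-site.xml looks roughly like the sketch below. The property names are the standard HDFS HA ones; the private key path and the timeout value are only assumptions for illustration, so substitute whatever your cluster actually uses.

```xml
<!-- Sketch only: the key path and timeout below are assumed values. -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <!-- One method per line; methods are tried in order until one succeeds. -->
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <!-- Assumed location of the passwordless key for the fencing user. -->
  <value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <!-- SSH connect timeout in milliseconds (assumed value). -->
  <value>30000</value>
</property>
```

Note that in the quoted log the SSH connection itself fails with NoRouteToHostException, so sshfence on its own cannot complete while the old active host is unreachable, and that is the only method configured there (method 1/1).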
On Tue, Dec 3, 2013 at 10:23 AM, YouPeng Yang <[EMAIL PROTECTED]> wrote:

> Hi,
>    Another auto-failover testing problem:
>
>    My HA setup fails over automatically after I kill the active NN. But
> when I unplug the network interface to simulate a hardware failure,
> auto-failover does not seem to work, even after waiting for some time.
> The zkfc logs are in [1].
>
>    I'm using the default sshfence.
>
> [1] zkfc logs:
> 2013-12-03 10:05:56,650 INFO org.apache.hadoop.ha.NodeFencer: ====== Beginning Service Fencing Process... ======
> 2013-12-03 10:05:56,650 INFO org.apache.hadoop.ha.NodeFencer: Trying method 1/1: org.apache.hadoop.ha.SshFenceByTcpPort(null)
> 2013-12-03 10:05:56,651 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Connecting to hadoop3...
> 2013-12-03 10:05:56,651 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connecting to hadoop3 port 22
> 2013-12-03 10:05:59,648 WARN org.apache.hadoop.ha.SshFenceByTcpPort: Unable to connect to hadoop3 as user hadoop
> com.jcraft.jsch.JSchException: java.net.NoRouteToHostException: No route to host
>     at com.jcraft.jsch.Util.createSocket(Util.java:386)
>     at com.jcraft.jsch.Session.connect(Session.java:182)
>     at org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
>     at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)
>     at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:521)
>     at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
>     at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
>     at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
>     at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:900)
>     at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:799)
>     at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2013-12-03 10:05:59,649 WARN org.apache.hadoop.ha.NodeFencer: Fencing method org.apache.hadoop.ha.SshFenceByTcpPort(null) was unsuccessful.
> 2013-12-03 10:05:59,649 ERROR org.apache.hadoop.ha.NodeFencer: Unable to fence service by any configured method.
> 2013-12-03 10:05:59,650 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
> java.lang.RuntimeException: Unable to fence NameNode at hadoop3/10.7.23.124:8020
>     at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:522)
>     at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:494)
>     at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:59)
>     at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:837)
>     at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:900)
>     at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:799)
>     at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:415)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:596)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
> 2013-12-03 10:05:59,650 INFO org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
> 2013-12-03 10:05:59,676 INFO org.apache.zookeeper.ZooKeeper: Session: 0x142931031810260 closed