Re: Cannot auto-failover when unplugging the network interface
Hi Yu

  Thanks for your response.
  I'm sure my ssh setup is good. SSH from the active NN to the standby NN
requires no password.

I've attached my config:
------core-site.xml-----------------

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://lklcluster</value>
    <final>true</final>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp2</value>
  </property>
</configuration>
-------hdfs-site.xml-------------

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/namedir2</value>
  </property>

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/datadir2</value>
  </property>

  <property>
    <name>dfs.nameservices</name>
    <value>lklcluster</value>
  </property>

  <property>
    <name>dfs.ha.namenodes.lklcluster</name>
    <value>nn1,nn2</value>
  </property>

  <property>
    <name>dfs.namenode.rpc-address.lklcluster.nn1</name>
    <value>hadoop2:8020</value>
  </property>

  <property>
    <name>dfs.namenode.rpc-address.lklcluster.nn2</name>
    <value>hadoop3:8020</value>
  </property>

  <property>
    <name>dfs.namenode.http-address.lklcluster.nn1</name>
    <value>hadoop2:50070</value>
  </property>

  <property>
    <name>dfs.namenode.http-address.lklcluster.nn2</name>
    <value>hadoop3:50070</value>
  </property>

  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/lklcluster</value>
  </property>

  <property>
    <name>dfs.client.failover.proxy.provider.lklcluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>

  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>

  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>5000</value>
  </property>

  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/hadoop/journal/data</value>
  </property>

  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>

  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
  </property>
</configuration>
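
Note that sshfence can only succeed while the old active NN is still reachable over the network, which is exactly what unplugging the NIC prevents, so the fencer fails and ZKFC will not promote the standby. The HDFS HA docs allow listing several fencing methods in dfs.ha.fencing.methods, one per line, tried in order until one succeeds. Since the quorum journal manager already protects the shared edits from a split-brain writer, one common workaround is a last-resort shell(/bin/true) entry so failover can proceed when the dead node cannot be contacted. A minimal sketch for hdfs-site.xml:

  <property>
    <name>dfs.ha.fencing.methods</name>
    <!-- sshfence is tried first; shell(/bin/true) always reports success,
         so failover can continue even when the old active is unreachable -->
    <value>sshfence
shell(/bin/true)</value>
  </property>

Keep shell(/bin/true) last in the list, since it unconditionally reports the node as fenced.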

2013/12/3 Azuryy Yu <[EMAIL PROTECTED]>

> This is still because your fence method is configured improperly.
> Please paste your fence configuration, and double check that you can ssh
> from the active NN to the standby NN without a password.
>
>
> On Tue, Dec 3, 2013 at 10:23 AM, YouPeng Yang <[EMAIL PROTECTED]> wrote:
>
>> Hi
>>    Another auto-failover testing problem:
>>
>>    My HA setup can auto-failover after I kill the active NN. But when I
>> unplug the network interface to simulate a hardware failure, the
>> auto-failover does not seem to work, no matter how long I wait; the zkfc
>> logs are shown in [1].
>>
>>    I'm using the default sshfence.
>>
>> [1] zkfc logs ----------------------------------------------------------
>> 2013-12-03 10:05:56,650 INFO org.apache.hadoop.ha.NodeFencer: ====== Beginning Service Fencing Process... ======
>> 2013-12-03 10:05:56,650 INFO org.apache.hadoop.ha.NodeFencer: Trying
>> method 1/1: org.apache.hadoop.ha.SshFenceByTcpPort(null)
>> 2013-12-03 10:05:56,651 INFO org.apache.hadoop.ha.SshFenceByTcpPort:
>> Connecting to hadoop3...
>> 2013-12-03 10:05:56,651 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
>> Connecting to hadoop3 port 22
>> 2013-12-03 10:05:59,648 WARN org.apache.hadoop.ha.SshFenceByTcpPort:
>> Unable to connect to hadoop3 as user hadoop
>> com.jcraft.jsch.JSchException: java.net.NoRouteToHostException: No route
>> to host
>>     at com.jcraft.jsch.Util.createSocket(Util.java:386)
>>     at com.jcraft.jsch.Session.connect(Session.java:182)
>>     at
>> org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
>>     at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)