HDFS, mail # user - auto-failover does not work


YouPeng Yang 2013-12-02, 12:04
Pavan Kumar Polineni 2013-12-02, 12:13
Re: auto-failover does not work
YouPeng Yang 2013-12-02, 12:20
Hi Pavan,
  I'm using sshfence as the fencing method. My config files are below.

------core-site.xml-----------------

<configuration>
 <property>
     <name>fs.defaultFS</name>
     <value>hdfs://lklcluster</value>
     <final>true</final>
 </property>

 <property>
     <name>hadoop.tmp.dir</name>
     <value>/home/hadoop/tmp2</value>
 </property>
</configuration>
-------hdfs-site.xml-------------

<configuration>
 <property>
     <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/namedir2</value>
 </property>

 <property>
     <name>dfs.datanode.data.dir</name>
     <value>/home/hadoop/datadir2</value>
 </property>

 <property>
   <name>dfs.nameservices</name>
   <value>lklcluster</value>
</property>

<property>
    <name>dfs.ha.namenodes.lklcluster</name>
    <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.lklcluster.nn1</name>
  <value>hadoop2:8020</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.lklcluster.nn2</name>
    <value>hadoop3:8020</value>
</property>

<property>
  <name>dfs.namenode.http-address.lklcluster.nn1</name>
    <value>hadoop2:50070</value>
</property>

<property>
    <name>dfs.namenode.http-address.lklcluster.nn2</name>
    <value>hadoop3:50070</value>
</property>

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/lklcluster</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.lklcluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>

<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
   <value>/home/hadoop/.ssh/id_rsa</value>
</property>

<property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
     <value>5000</value>
</property>

<property>
  <name>dfs.journalnode.edits.dir</name>
   <value>/home/hadoop/journal/data</value>
</property>

<property>
   <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
</property>

<property>
     <name>ha.zookeeper.quorum</name>
     <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
</property>

</configuration>
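
For comparison, the HDFS HA documentation describes dfs.ha.fencing.methods as a newline-separated list of methods that are tried in order until one reports success, and it mentions shell(/bin/true) as a placeholder method. The fragment below is only a sketch of that list form, not a confirmed fix for the problem in this thread. It assumes that a no-op fallback is acceptable on this cluster, which is a trade-off: if sshfence fails, the fallback reports success without actually fencing anything, so it relies on the Quorum Journal Manager allowing only one writer at a time.

```xml
<!-- Sketch only: fencing methods are listed one per line and tried in order. -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <!-- sshfence is tried first; shell(/bin/true) is an assumed fallback
       that always succeeds and performs no real fencing. -->
  <value>sshfence
shell(/bin/true)</value>
</property>
```

The other fencing properties above (private key file and connect timeout) apply only to sshfence and would stay unchanged.
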
2013/12/2 Pavan Kumar Polineni <[EMAIL PROTECTED]>

> Post your config files and tell us which method you are following for
> automatic failover.
>
>
> On Mon, Dec 2, 2013 at 5:34 PM, YouPeng Yang <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>   I'm testing HA auto-failover with hadoop-2.2.0.
>>
>>   Manual failover works on the cluster, but automatic failover does not.
>> I set up HA according to:
>>
>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html
>>
>>   To test automatic failover, I killed my active NN with kill -9
>> <Pid-nn>, but the standby NameNode did not change to the active state.
>>   The DFSZKFailoverController log shows the output in [1].
>>
>>  Please help me; any suggestions will be appreciated.
>>
>>
>> Regards.
>>
>>
>> zkfc log [1] ----------------------------------------------------------------------------------------------------
>>
>> 2013-12-02 19:49:28,588 INFO org.apache.hadoop.ha.NodeFencer: ====== Beginning Service Fencing Process... ======
>> 2013-12-02 19:49:28,588 INFO org.apache.hadoop.ha.NodeFencer: Trying method 1/1: org.apache.hadoop.ha.SshFenceByTcpPort(null)
>> 2013-12-02 19:49:28,590 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Connecting to hadoop3...
>> 2013-12-02 19:49:28,590 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connecting to hadoop3 port 22
>> 2013-12-02 19:49:28,592 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connection established
>> 2013-12-02 19:49:28,603 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Remote version string: SSH-2.0-OpenSSH_5.3
>> 2013-12-02 19:49:28,603 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Local version string: SSH-2.0-JSCH-0.1.42
>> 2013-12-02 19:49:28,603 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch:
Jitendra Yadav 2013-12-02, 12:13