Re: auto-failover does not work
Hi Jitendra

  Yes. But I suspect it is relevant that I have to run `ssh-agent bash` and
`ssh-add` before I can SSH between the two NameNodes. Could that be the problem?
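
(For reference: sshfence reads the private key named in
dfs.ha.fencing.ssh.private-key-files directly, so fencing itself should not
depend on ssh-agent. A minimal check, assuming the hadoop2/hadoop3 hosts and
the hadoop user from the config quoted below:)

```shell
# Run on hadoop2; repeat in the other direction from hadoop3.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# which mimics how an automated fencing process connects.
ssh -o BatchMode=yes -i /home/hadoop/.ssh/id_rsa hadoop@hadoop3 hostname
```

If this prints the remote hostname without any prompt, key-based SSH works
without ssh-agent.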

Regards
2013/12/2 Jitendra Yadav <[EMAIL PROTECTED]>

> Are you able to connect both NN hosts using SSH without password?
> Make sure you have correct ssh keys in authorized key file.
>
> Regards
> Jitendra
>
>
> On Mon, Dec 2, 2013 at 5:50 PM, YouPeng Yang <[EMAIL PROTECTED]>wrote:
>
>> Hi Pavan
>>
>>
>>   I'm using sshfence
>>
>> ------core-site.xml-----------------
>>
>> <configuration>
>>  <property>
>>      <name>fs.defaultFS</name>
>>      <value>hdfs://lklcluster</value>
>>      <final>true</final>
>>  </property>
>>
>>  <property>
>>      <name>hadoop.tmp.dir</name>
>>      <value>/home/hadoop/tmp2</value>
>>  </property>
>>
>>
>> </configuration>
>>
>>
>> -------hdfs-site.xml-------------
>>
>> <configuration>
>>  <property>
>>      <name>dfs.namenode.name.dir</name>
>>     <value>/home/hadoop/namedir2</value>
>>  </property>
>>
>>  <property>
>>      <name>dfs.datanode.data.dir</name>
>>      <value>/home/hadoop/datadir2</value>
>>  </property>
>>
>>  <property>
>>    <name>dfs.nameservices</name>
>>    <value>lklcluster</value>
>> </property>
>>
>> <property>
>>     <name>dfs.ha.namenodes.lklcluster</name>
>>     <value>nn1,nn2</value>
>> </property>
>> <property>
>>   <name>dfs.namenode.rpc-address.lklcluster.nn1</name>
>>   <value>hadoop2:8020</value>
>> </property>
>> <property>
>>     <name>dfs.namenode.rpc-address.lklcluster.nn2</name>
>>     <value>hadoop3:8020</value>
>> </property>
>>
>> <property>
>>   <name>dfs.namenode.http-address.lklcluster.nn1</name>
>>     <value>hadoop2:50070</value>
>> </property>
>>
>> <property>
>>     <name>dfs.namenode.http-address.lklcluster.nn2</name>
>>     <value>hadoop3:50070</value>
>> </property>
>>
>> <property>
>>   <name>dfs.namenode.shared.edits.dir</name>
>>
>> <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/lklcluster</value>
>> </property>
>> <property>
>>   <name>dfs.client.failover.proxy.provider.lklcluster</name>
>>
>> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>> </property>
>> <property>
>>   <name>dfs.ha.fencing.methods</name>
>>   <value>sshfence</value>
>> </property>
>>
>> <property>
>>   <name>dfs.ha.fencing.ssh.private-key-files</name>
>>    <value>/home/hadoop/.ssh/id_rsa</value>
>> </property>
>>
>> <property>
>>     <name>dfs.ha.fencing.ssh.connect-timeout</name>
>>      <value>5000</value>
>> </property>
>>
>> <property>
>>   <name>dfs.journalnode.edits.dir</name>
>>    <value>/home/hadoop/journal/data</value>
>> </property>
>>
>> <property>
>>    <name>dfs.ha.automatic-failover.enabled</name>
>>       <value>true</value>
>> </property>
>>
>> <property>
>>      <name>ha.zookeeper.quorum</name>
>>      <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
>> </property>
>>
>> </configuration>
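>>
>> With dfs.ha.automatic-failover.enabled set as above, a ZKFC must be running
>> beside each NameNode and the failover znode must have been initialized in
>> ZooKeeper. A quick sanity check (a sketch, assuming the nn1/nn2 IDs from
>> the config above):

```shell
# One-time: initialize the HA state znode in ZooKeeper (run on one NN host).
hdfs zkfc -formatZK

# Start a ZKFC daemon beside each NameNode (run on both NN hosts).
hadoop-daemon.sh start zkfc

# Confirm the current HA state of each NameNode.
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```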
>>
>>
>> 2013/12/2 Pavan Kumar Polineni <[EMAIL PROTECTED]>
>>
>>> Post your config files and in which method you are following for
>>> automatic failover
>>>
>>>
>>> On Mon, Dec 2, 2013 at 5:34 PM, YouPeng Yang <[EMAIL PROTECTED]>wrote:
>>>
>>>> Hi
>>>>   I'm testing HA auto-failover with hadoop-2.2.0.
>>>>
>>>>   The cluster can fail over manually, but automatic failover does not
>>>> work. I set up HA according to the URL
>>>>
>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html
>>>>
>>>>   To test automatic failover, I killed the active NN with kill -9
>>>> <Pid-nn>, but the standby NameNode did not transition to the active state.
>>>>   The resulting log from my DFSZKFailoverController is shown at [1].
>>>>
>>>>  Please help me; any suggestions will be appreciated.
>>>>
>>>>
>>>> Regards.
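>>>>
>>>> (The test described above, as commands; <Pid-nn> stands for whatever
>>>> pid the active NameNode has:)

```shell
# On the active NN host: find the NameNode pid and kill it hard.
jps | grep NameNode   # note the NameNode pid
kill -9 <Pid-nn>

# From the other host: the former standby is expected to become active.
hdfs haadmin -getServiceState nn2
```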
>>>>
>>>>
>>>> zkfc log[1]----------------------------------------------------------------------------------------------------
>>>>
>>>> 2013-12-02 19:49:28,588 INFO org.apache.hadoop.ha.NodeFencer: ====== Beginning Service Fencing Process... ======
>>>> 2013-12-02 19:49:28,588 INFO org.apache.hadoop.ha.NodeFencer: Trying