MapReduce, mail # user - Re: auto-failover does not work


Re: auto-failover does not work
Jitendra Yadav 2013-12-02, 12:47
If you are using the hadoop user and your ssh configuration is correct, then
the commands below should work without a password.

Execute from both NN1 and NN2:
# ssh hadoop@NN1_host

and

Execute from both NN1 and NN2:
# ssh hadoop@NN2_host
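
For reference, a minimal sketch of setting up that passwordless login for the
hadoop user (NN1_host/NN2_host are placeholders; run the equivalent on each NN,
pointing at the other, and append the public key to ~/.ssh/authorized_keys by
hand if ssh-copy-id is not installed):

  # generate a key pair with an empty passphrase, if one does not already exist
  ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
  # install the public key on the peer NN
  ssh-copy-id hadoop@NN2_host
  # this should now print the remote hostname without any password prompt
  ssh hadoop@NN2_host hostname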

Regards
Jitendra

On Mon, Dec 2, 2013 at 6:10 PM, YouPeng Yang <[EMAIL PROTECTED]> wrote:

> Hi Jitendra
>   Yes.
>   My doubt is that I need to run ssh-agent bash & ssh-add before I can
> ssh to each NN from the other. Is that a problem?
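>
> (A quick way to check whether the key really needs ssh-agent, assuming the
> key path /home/hadoop/.ssh/id_rsa from the config below:
>
>   ssh-keygen -y -f /home/hadoop/.ssh/id_rsa
>
> If this prints the public key without prompting, the key has no passphrase
> and can be used non-interactively; if it asks for a passphrase, the sshfence
> fencing step, which reads the key file configured below, will not be able to
> use it, since it runs non-interactively and cannot prompt for one.)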
>
> Regards
>
>
>
>
> 2013/12/2 Jitendra Yadav <[EMAIL PROTECTED]>
>
>> Are you able to connect to both NN hosts using SSH without a password?
>> Make sure you have the correct ssh keys in the authorized_keys file.
>>
>> Regards
>> Jitendra
>>
>>
>> On Mon, Dec 2, 2013 at 5:50 PM, YouPeng Yang <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Pavan
>>>
>>>
>>>   I'm using sshfence
>>>
>>> ------core-site.xml-----------------
>>>
>>> <configuration>
>>>  <property>
>>>      <name>fs.defaultFS</name>
>>>      <value>hdfs://lklcluster</value>
>>>      <final>true</final>
>>>  </property>
>>>
>>>  <property>
>>>      <name>hadoop.tmp.dir</name>
>>>      <value>/home/hadoop/tmp2</value>
>>>  </property>
>>>
>>>
>>> </configuration>
>>>
>>>
>>> -------hdfs-site.xml-------------
>>>
>>> <configuration>
>>>  <property>
>>>      <name>dfs.namenode.name.dir</name>
>>>     <value>/home/hadoop/namedir2</value>
>>>  </property>
>>>
>>>  <property>
>>>      <name>dfs.datanode.data.dir</name>
>>>      <value>/home/hadoop/datadir2</value>
>>>  </property>
>>>
>>>  <property>
>>>    <name>dfs.nameservices</name>
>>>    <value>lklcluster</value>
>>> </property>
>>>
>>> <property>
>>>     <name>dfs.ha.namenodes.lklcluster</name>
>>>     <value>nn1,nn2</value>
>>> </property>
>>> <property>
>>>   <name>dfs.namenode.rpc-address.lklcluster.nn1</name>
>>>   <value>hadoop2:8020</value>
>>> </property>
>>> <property>
>>>     <name>dfs.namenode.rpc-address.lklcluster.nn2</name>
>>>     <value>hadoop3:8020</value>
>>> </property>
>>>
>>> <property>
>>>   <name>dfs.namenode.http-address.lklcluster.nn1</name>
>>>     <value>hadoop2:50070</value>
>>> </property>
>>>
>>> <property>
>>>     <name>dfs.namenode.http-address.lklcluster.nn2</name>
>>>     <value>hadoop3:50070</value>
>>> </property>
>>>
>>> <property>
>>>   <name>dfs.namenode.shared.edits.dir</name>
>>>
>>> <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/lklcluster</value>
>>> </property>
>>> <property>
>>>   <name>dfs.client.failover.proxy.provider.lklcluster</name>
>>>
>>> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
>>> </property>
>>> <property>
>>>   <name>dfs.ha.fencing.methods</name>
>>>   <value>sshfence</value>
>>> </property>
>>>
>>> <property>
>>>   <name>dfs.ha.fencing.ssh.private-key-files</name>
>>>    <value>/home/hadoop/.ssh/id_rsa</value>
>>> </property>
>>>
>>> <property>
>>>     <name>dfs.ha.fencing.ssh.connect-timeout</name>
>>>      <value>5000</value>
>>> </property>
>>>
>>> <property>
>>>   <name>dfs.journalnode.edits.dir</name>
>>>    <value>/home/hadoop/journal/data</value>
>>> </property>
>>>
>>> <property>
>>>    <name>dfs.ha.automatic-failover.enabled</name>
>>>       <value>true</value>
>>> </property>
>>>
>>> <property>
>>>      <name>ha.zookeeper.quorum</name>
>>>      <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
>>> </property>
>>>
>>> </configuration>
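>>>
>>> With dfs.ha.automatic-failover.enabled and ha.zookeeper.quorum set as above,
>>> a rough checklist for bringing up automatic failover (commands as in the 2.x
>>> HA docs; adjust paths to your install):
>>>
>>>   # initialize the HA state in ZooKeeper (run once, from one NN)
>>>   hdfs zkfc -formatZK
>>>   # start a ZKFC daemon on each NN host (start-dfs.sh normally does this)
>>>   hadoop-daemon.sh start zkfc
>>>   # each NN host should now show a DFSZKFailoverController process
>>>   jps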
>>>
>>>
>>> 2013/12/2 Pavan Kumar Polineni <[EMAIL PROTECTED]>
>>>
>>>> Post your config files and tell us which method you are using for
>>>> automatic failover.
>>>>
>>>>
>>>> On Mon, Dec 2, 2013 at 5:34 PM, YouPeng Yang <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi
>>>>>   I'm testing the HA auto-failover with hadoop-2.2.0.
>>>>>
>>>>>   The cluster can fail over manually, but the automatic failover fails.
>>>>> I set up HA according to this URL:
>>>>>
>>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html
>>>>>
>>>>>   To test the automatic failover, I killed my active NN with kill -9
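>>>>>
>>>>> (For the kill test itself, a minimal sketch, using the nn1/nn2 service ids
>>>>> from the config above; <pid> is whatever jps reports for the NameNode:
>>>>>
>>>>>   # on the active NN host, find and kill the NameNode process
>>>>>   jps | grep NameNode
>>>>>   kill -9 <pid>
>>>>>   # on the surviving host, the standby should become active within seconds
>>>>>   hdfs haadmin -getServiceState nn2
>>>>>
>>>>> If it stays in standby, the ZKFC log on the surviving host usually shows
>>>>> why fencing with sshfence failed.)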