HBase >> mail # user >> A region server stopped (timeout after trying to connect local Zookeeper)


ac@...) 2012-11-21, 11:16
ac@...) 2012-11-21, 13:29
Jean-Marc Spaggiari 2012-11-21, 17:22
ac@...) 2012-11-21, 23:13
Re: A region server stopped (timeout after trying to connect local Zookeeper)
Can you run jps on your master and look at the logs there too?

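Another thing: can you try hbase.zookeeper.quorum instead of
hbase.ZooKeeper.quorum? Property names are case-sensitive, so the
capitalized spelling is ignored and the default (localhost) is used.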
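
To illustrate why the spelling of the property name matters, here is a minimal Python sketch. It does not call any HBase code; it only mimics an exact, case-sensitive key lookup with a fallback to "localhost", which is the default mentioned in the hbase.zookeeper.quorum description quoted further down. The helper names (load_properties, effective_quorum, miscased_keys) are made up for this example, and the XML is the snippet from the quoted message wrapped in a <configuration> element.

```python
import xml.etree.ElementTree as ET

# Assumption: the fallback value is "localhost", per the property description
# quoted in this thread for local and pseudo-distributed modes.
DEFAULT_QUORUM = "localhost"
EXPECTED_KEY = "hbase.zookeeper.quorum"


def load_properties(xml_text):
    """Return the <name>/<value> pairs of a Hadoop-style *-site.xml as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def effective_quorum(props):
    """Exact (case-sensitive) lookup, falling back to the default."""
    return props.get(EXPECTED_KEY, DEFAULT_QUORUM)


def miscased_keys(props):
    """Keys that equal the expected name only when case is ignored."""
    return [k for k in props if k != EXPECTED_KEY and k.lower() == EXPECTED_KEY]


# The property as written in the quoted hbase-site.xml.
SITE_XML = """<configuration>
  <property>
    <name>hbase.ZooKeeper.quorum</name>
    <value>m146,m145,m143</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    props = load_properties(SITE_XML)
    print(effective_quorum(props))  # localhost
    print(miscased_keys(props))     # ['hbase.ZooKeeper.quorum']
```

With the name spelled hbase.zookeeper.quorum, the same lookup returns m146,m145,m143, which would match a connectString pointing at the real quorum instead of localhost:2181.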

2012/11/21, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> Hi,
>
> Here are my HBase configuration and test:
>
> 1) {$HBASE_HOME}hbase/conf/hbase-site.xml
> <property>
> <name>hbase.ZooKeeper.quorum</name>
> <value>m146,m145,m143</value>
> </property>
>
> <property>
> <name>zookeeper.session.timeout</name>
> <value>60000</value>
> </property>
>
>
> 2) {$HBASE_HOME}hbase/conf/hbase-env.sh
> export HBASE_MANAGES_ZK=false
>
>
> 3) I used " {$ZK_HOME}/bin/zkCli.sh -server m145,m146,m143"  to test the
> connection, it worked
> [zk: m145,m146,m143(CONNECTED) 0]
>
>
> 4) from the logs, I found that the connectString was odd, the RegionServer
> did not use the setting of "hbase.ZooKeeper.quorum" in conf/hbase-site.xml,
> it seemed that it always used the default and tried to connect
> "localhost:2181" in the distributed cluster:
>
> 2012-11-21 17:21:42,299 INFO org.apache.zookeeper.ZooKeeper: Initiating
> client connection, connectString=localhost:2181 sessionTimeout=60000
> watcher=regionserver:60020
> ...
> 2012-11-21 17:21:42,313 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server localhost/127.0.0.1:2181. Will not attempt to
> authenticate using SASL (Unable to locate a login configura$
> ...
> 2012-11-21 17:21:42,316 WARN org.apache.zookeeper.ClientCnxn: Session 0x0
> for server null, unexpected error, closing socket connection and attempting
> reconnect java.net.ConnectException: Connection refused
> ...  (remark: it retried the above 3 times, then hit the FATAL error below)
>
> 2012-11-21 17:21:57,846 ERROR
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020
> Received unexpected KeeperException, re-throwing exception
> ...
> 2012-11-21 17:21:57,847 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
> ...
>
>
>
> Please help.
>
> Thanks
>
>
>
>
>
> On 22 Nov 2012, at 1:22 AM, Jean-Marc Spaggiari wrote:
>
>> Hi,
>>
>> What do you have in your HBase configuration? Are you passing the names
>> of the quorum servers?
>> $ cat conf/hbase-site.xml
>> ......
>>  </property>
>>    <property>
>>      <name>hbase.zookeeper.quorum</name>
>>      <value>cube,latitude,node3</value>
>>      <description>Comma separated list of servers in the ZooKeeper
>> Quorum.
>>      For example,
>> "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>>      By default this is set to localhost for local and pseudo-distributed
>> modes
>>      of operation. For a fully-distributed setup, this should be set to a
>> full
>>      list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in
>> hbase-env.sh
>>      this is the list of servers which we will start/stop ZooKeeper on.
>>      </description>
>>    </property>
>> .....
>>
>> 2012/11/21, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
>>> Hi,
>>>
>>>
>>> I have the following line in /etc/hosts on all servers; should I keep it,
>>> comment it out, or ...?
>>>
>>> 127.0.0.1       localhost
>>>
>>> Please help.
>>>
>>> Thanks
>>>
>>>
>>>
>>> On 21 Nov 2012, at 7:16 PM, [EMAIL PROTECTED] wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>> Please help!!
>>>>
>>>> HBase version: 0.94
>>>> ZooKeeper: 3.4.4
>>>>
>>>> One of the region servers stopped very quickly after HBase was
>>>> started:
>>>>
>>>> ### Checked jps after the HBase cluster was started; the
>>>> HRegionServer process was there (*** there is no ZooKeeper instance
>>>> running on this server ***)
>>>> $ jps
>>>> 24767 Jps
>>>> 18418 TaskTracker
>>>> 24678 HRegionServer
>>>> 18156 DataNode
>>>>
>>>> ### Waited a while and checked jps again; the HRegionServer process was gone
>>>> $ jps
>>>> 18418 TaskTracker
>>>> 24784 Jps
>>>> 18156 DataNode
>>>>
>>>>
>>>> ### Here is the setting in hbase-site.xml ( enabled
>>>> hbase.cluster.distributed, set up 3 ZooKeepers, timeout= 60000)
>>>> <property>
>>>> <name>hbase.cluster.distributed</name>
>>>> <value>true</value>
>>>> </property>
>>>
ac@...) 2012-11-22, 00:14
Jean-Marc Spaggiari 2012-11-22, 00:39
ac@...) 2012-11-22, 01:53