HBase >> mail # user >> A region server stopped (timeout after trying to connect local Zookeeper)


ac@...) 2012-11-21, 11:16
Re: A region server stopped (timeout after trying to connect local Zookeeper)
Hi,
I have the following line in /etc/hosts on all servers. Should I keep it, comment it out, or ...?

127.0.0.1       localhost

Please help.

Thanks
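
For the /etc/hosts question, the `127.0.0.1 localhost` line is the conventional loopback entry. What the HBase documentation of that era warns about is the machine's own hostname resolving to a loopback address (for example a `127.0.1.1` entry on some distributions). The following is a minimal sketch for checking that, not an official tool: the function name `loopback_hostnames` and the sample entries other than the `localhost` line are hypothetical, and `m144` is borrowed from the region server name in the log quoted below.

```python
import ipaddress


def loopback_hostnames(hosts_text):
    """Return {hostname: address} for every name mapped to a loopback address."""
    result = {}
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines that do not start with an IP address
        if addr.is_loopback:
            for name in fields[1:]:
                result[name] = fields[0]
    return result


# Hypothetical hosts file: only the first line comes from the message above.
SAMPLE = """127.0.0.1       localhost
127.0.1.1       m144    # hypothetical entry that would be worth removing
192.0.2.44      m144    # hypothetical routable address (documentation range)
"""

if __name__ == "__main__":
    for name, addr in loopback_hostnames(SAMPLE).items():
        if name != "localhost":
            print(f"{name} maps to loopback address {addr}")
```

In this sketch only names other than `localhost` are reported, so a file containing just the line quoted above produces no warnings.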

On 21 Nov 2012, at 7:16 PM, [EMAIL PROTECTED] wrote:

> Hi,
>
>
> Please help!!
>
> HBase version: 0.94
> ZooKeeper: 3.4.4
>
> One of the region servers stopped shortly after HBase was started:
>
> ### Checked jps after the HBase cluster was started and could see the HRegionServer process (*** there is no ZooKeeper instance running on this server ***)
> $ jps
> 24767 Jps
> 18418 TaskTracker
> 24678 HRegionServer
> 18156 DataNode
>
> ### Waited a while and checked jps again; the HRegionServer process was gone
> $ jps
> 18418 TaskTracker
> 24784 Jps
> 18156 DataNode
>
>
> ### Here are the settings in hbase-site.xml (hbase.cluster.distributed enabled, 3 ZooKeepers configured, timeout = 60000)
> <property>
> <name>hbase.cluster.distributed</name>
> <value>true</value>
> </property>
>
> <property>
> <name>hbase.ZooKeeper.quorum</name>
> <value>m146,m145,m143</value>
> </property>
>
> <property>
> <name>zookeeper.session.timeout</name>
> <value>60000</value>
> </property>
>
>
> ### hbase-env.sh also tells HBase not to manage a local ZooKeeper instance
> export HBASE_MANAGES_ZK=false
>
>
> ### This server can connect to the 3 ZooKeepers:
> ./zkCli.sh -server m145,m146,m143   ==>  [zk: m145,m146,m143(CONNECTED) 0]
>
>
> ### Checked the HBase log file and found something odd: it seems to have tried to connect to a local ZooKeeper
> 2012-11-21 17:30:33,066 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=60000 watcher=regionserver:60020
>
> 2012-11-21 17:31:33,254 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>
> 2012-11-21 17:31:33,254 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
> 2012-11-21 17:32:33,262 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 60010ms for sessionid 0x0, closing socket connection and attempting reconnect
>
> 2012-11-21 17:32:33,362 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>
> ......
>
> 2012-11-21 17:34:33,570 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
> 2012-11-21 17:34:33,571 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020 Unable to set watcher on znode /hbase/master
> 2012-11-21 17:34:33,573 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020 Received unexpected KeeperException, re-throwing exception
> 2012-11-21 17:34:33,573 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server ......
> 2012-11-21 17:34:33,576 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
>
> 2012-11-21 17:34:36,580 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server m144,60020,1353490232962: Initialization of RS failed.  Hence aborting RS.
> java.io.IOException: Received the shutdown message while waiting.
> at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:623)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:598)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:560)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:669)
> at java.lang.Thread.run(Thread.java:662)
> 2012-11-21 17:34:36,581 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
>
>
> Please help!
> QUESTION: Is this a bug, or do I need to check something else?
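
The log quoted above shows `connectString=localhost:2181` even though hbase-site.xml lists three hosts, which suggests the quorum setting may not have been picked up. One thing worth ruling out is the spelling of the property name: Hadoop-style configuration keys are case-sensitive, and the documented key is `hbase.zookeeper.quorum`, all lowercase, whereas the file as quoted reads `hbase.ZooKeeper.quorum`. The sketch below flags property names that match a known key only when letter case is ignored. It is an illustration under assumptions: the function name `find_case_mismatches` is hypothetical, `KNOWN_KEYS` is a small hand-picked subset rather than a complete list, and the sample XML is reconstructed from the message with an assumed `<configuration>` wrapper.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of documented keys (assumption: extend as needed).
KNOWN_KEYS = [
    "hbase.cluster.distributed",
    "hbase.zookeeper.quorum",
    "zookeeper.session.timeout",
]


def find_case_mismatches(xml_text, known_keys=KNOWN_KEYS):
    """Return (found, expected) pairs for names that differ from a known key only by case."""
    by_lower = {key.lower(): key for key in known_keys}
    root = ET.fromstring(xml_text)
    mismatches = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        expected = by_lower.get(name.lower())
        if expected is not None and name != expected:
            mismatches.append((name, expected))
    return mismatches


# Properties as quoted in the message; the <configuration> wrapper is assumed.
SAMPLE = """<configuration>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <property><name>hbase.ZooKeeper.quorum</name><value>m146,m145,m143</value></property>
  <property><name>zookeeper.session.timeout</name><value>60000</value></property>
</configuration>"""

if __name__ == "__main__":
    for found, expected in find_case_mismatches(SAMPLE):
        print(f"{found!r} should probably be {expected!r}")
```

If the capitalization in the quoted file is only an artifact of how the message was written or archived, this check reports nothing on the real file and the cause lies elsewhere.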
Jean-Marc Spaggiari 2012-11-21, 17:22
ac@...) 2012-11-21, 23:13
Jean-Marc Spaggiari 2012-11-21, 23:51
ac@...) 2012-11-22, 00:14
Jean-Marc Spaggiari 2012-11-22, 00:39
ac@...) 2012-11-22, 01:53