

Re: HBASE -- RS expire?


On Thursday, July 5, 2012 at 8:25 PM, Jay Wilson wrote:

> Finally my HMaster has stabilized and been running for 7 hours. I
> believe my networking issues are behind me now. Thank you everyone for
> the help.
>
>

Awesome.

It looks like the same issue is biting your RSes too: the RS stops heartbeating to ZK, the ZK session expires, and the RS aborts.
Do you see a YouAreDeadException in the logs?
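
If it helps to check, here is a minimal Python sketch for scanning a region server log for expiry markers. The marker strings are YouAreDeadException plus two phrases taken from the log quoted below; the function name and the way lines are passed in are my own assumptions, not anything HBase ships.

```python
# Strings that indicate the RS lost its ZooKeeper session or was
# declared dead by the master. Taken from this thread, not exhaustive.
MARKERS = ("YouAreDeadException", "SessionExpiredException", "has expired")


def find_expiry_markers(lines):
    """Return (line_number, marker) pairs for every marker found.

    `lines` is any iterable of log lines, e.g. an open file object.
    Line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        for marker in MARKERS:
            if marker in line:
                hits.append((number, marker))
    return hits
```

Passing an open log file to find_expiry_markers() and printing the result should be enough to see whether the master also rejected the RS, or whether only the ZK session expiry shows up.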
>
> The new issue is that my RSes continue to die after about 20 minutes.
> Again, the cluster is idle. No jobs are running, and I get this on all
> of my RSes at almost the same time:
>
> 2012-07-05 19:34:05,283 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server devrackA-04/172.18.0.5:2181
> 2012-07-05 19:34:05,288 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to devrackA-04/172.18.0.5:2181, initiating session
> 2012-07-05 19:34:05,301 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server devrackA-04/172.18.0.5:2181, sessionid
> = 0x13858fc240f0003, negotiated timeout = 180000
> 2012-07-05 19:34:05,399 INFO
> org.apache.hadoop.hbase.regionserver.ShutdownHook: Installed shutdown
> hook thread: Shutdownhook:regionserver60020
> 2012-07-05 20:06:40,279 INFO org.apache.zookeeper.ClientCnxn: Unable to
> read additional data from server sessionid 0x13858fc240f0003, likely
> server has closed socket, closing socket connection and attempting reconnect
> 2012-07-05 20:06:40,573 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server devrackA-03/172.18.0.4:2181
> 2012-07-05 20:06:40,574 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to devrackA-03/172.18.0.4:2181, initiating session
> 2012-07-05 20:06:40,578 INFO org.apache.zookeeper.ClientCnxn: Unable to
> reconnect to ZooKeeper service, session 0x13858fc240f0003 has expired,
> closing socket connection
> 2012-07-05 20:06:40,586 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region
> server serverName=devrackB-07,60020,1341542045088, load=(requests=0,
> regions=0, usedHeap=0, maxHeap=0): regionserver:60020-0x13858fc240f0003
> regionserver:60020-0x13858fc240f0003 received expired from ZooKeeper,
> aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
>
> Could the fact that the cluster is idle cause the sessions to expire?
> It's almost like a timing trigger pops, the sessions expire, and then
> can reconnect. Is there a timer I need to adjust?
>
> Could this be related to a TCP or IP timer that needs to be adjusted?
> Does the connection go into a FIN_WAIT state and then close?
>
> Thank you
> ---
> Jay Wilson
>
>
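
On the timer question: the log already shows the relevant timer, the negotiated ZooKeeper session timeout of 180000 ms. Below is a rough sketch that compares how long the session lived with that timeout. It assumes each log entry sits on a single line, as in the original log file rather than re-wrapped as in this mail; the function name and the returned keys are hypothetical.

```python
import re
from datetime import datetime

# Leading log4j-style timestamp, e.g. "2012-07-05 19:34:05,301 ".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")
# ZooKeeper client line reporting the negotiated session timeout in ms.
TIMEOUT = re.compile(r"negotiated timeout = (\d+)")


def session_summary(lines):
    """Return the session timeout and observed lifetime in seconds.

    Uses the last "negotiated timeout" line seen before the first
    "has expired" line. Returns None if either line is missing.
    """
    established = None
    timeout_ms = None
    for line in lines:
        stamp = TS.match(line)
        if not stamp:
            continue  # continuation line or stack trace, no timestamp
        when = datetime.strptime(stamp.group(1), "%Y-%m-%d %H:%M:%S,%f")
        negotiated = TIMEOUT.search(line)
        if negotiated:
            established = when
            timeout_ms = int(negotiated.group(1))
        elif "has expired" in line and established is not None:
            return {
                "timeout_s": timeout_ms / 1000.0,
                "lifetime_s": (when - established).total_seconds(),
            }
    return None
```

For the two relevant lines in the log above, this gives a lifetime of about 1955 seconds (roughly 32.5 minutes) against a 180 second timeout. As far as I know, the ZooKeeper client sends pings on its own while idle, so an idle cluster should not expire sessions by itself; a session expires only when the server hears nothing from the client for the whole timeout. If you do want to adjust a timer, the HBase side requests this value through zookeeper.session.timeout in hbase-site.xml, subject to the ZooKeeper server's own session timeout limits, but raising it would probably only hide whatever is stalling the connection.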