HBase >> mail # user >> help why do my regionservers shut themselves down?


kaveh minooie 2013-04-23, 01:25
Leonid Fedotov 2013-04-23, 15:59
Jean-Marc Spaggiari 2013-04-23, 01:46
Re: help why do my regionservers shut themselves down?
Kaveh,
Which version of HBase are you using?
Did you observe anything else happening in your cluster around
2013-04-22 16:47:56? See below:

2013-04-22 16:47:56,830 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction interrupted by user:
java.io.InterruptedIOException: Aborting compaction of store f in region t1_webpage,com.pandora.www:http/shaggy,1366670139658.9f565d5da3468c0725e590dc232abc23. because user requested stop.
        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:998)
        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:779)
        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:776)

On Mon, Apr 22, 2013 at 6:46 PM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Hi Kaveh,
>
> The answer may already be in the logs you sent ;)
>
> "This disconnect could have been caused by a network partition or a
> long-running GC pause, either way it's recommended that you verify
> your environment."
>
> Do you have GC logs? Have you tried anything to solve that?
>
> JM
>
> 2013/4/22 kaveh minooie <[EMAIL PROTECTED]>:
> >
> > Hi
> >
> > After a few MapReduce jobs, my regionservers shut themselves down. This is
> > the most recent occurrence:
> >
> > 2013-04-22 16:47:21,843 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, trying to reconnect.
> > 2013-04-22 16:47:21,843 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=d1r1n17.prod.plutoz.com,60020,1366657358443, load=(requests=5392, regions=196, usedHeap=1063, maxHeap=3966): regionserver:60020-0x13dd980d2ab8661-0x13dd980d2ab8661 regionserver:60020-0x13dd980d2ab8661-0x13dd980d2ab8661 received expired from ZooKeeper, aborting
> > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
> >         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:352)
> >         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:270)
> >         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:523)
> >         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:499)
> > 2013-04-22 16:47:21,843 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Trying to reconnect to zookeeper.
> > 2013-04-22 16:47:21,844 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: requests=1794, regions=196, stores=1561, storefiles=1585, storefileIndexSize=104, memstoreSize=306, compactionQueueSize=10, flushQueueSize=0, usedHeap=1073, maxHeap=3966, blockCacheSize=661986032, blockCacheFree=169901776, blockCacheCount=7242, blockCacheHitCount=910925, blockCacheMissCount=1558134, blockCacheEvictedCount=1344753, blockCacheHitRatio=36, blockCacheHitCachingRatio=40
> > 2013-04-22 16:47:21,844 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: regionserver:60020-0x13dd980d2ab8661-0x13dd980d2ab8661 regionserver:60020-0x13dd980d2ab8661-0x13dd980d2ab8661 received expired from ZooKeeper, aborting
> > 2013-04-22 16:47:21,844 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
> > 2013-04-22 16:47:21,900 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: Too many consecutive RollWriter requests, it's a sign of the total number of live datanodes is lower than the tolerable replicas.
> > 2013-04-22 16:47:22,341 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=zk1:2181 sessionTimeout=180000 watcher=hconnection
> > 2013-04-22 16:47:22,357 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Waiting on 1 regions to close
> > 2013-04-22 16:47:22,394 INFO org.apache.zookeeper.ClientCnxn: Opening
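
The replies above ask what else was happening in the cluster around a given timestamp. One way to answer that is to filter the regionserver log for WARN and FATAL entries close to the time of interest. The following is only a minimal sketch in Python, not an HBase tool: the function names (`parse_line`, `entries_near`), the 60-second window, and the line pattern are assumptions for illustration, and the sample lines are shortened copies of entries quoted in this thread. It assumes each log entry sits on a single line in the "timestamp level logger: message" layout shown above.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, taken from the excerpts in this thread:
#   2013-04-22 16:47:21,843 FATAL some.logger.Name: message text
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s+(\S+): (.*)$"
)


def parse_line(line):
    """Return (timestamp, level, logger, message), or None for
    continuation lines such as stack-trace frames."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
    return ts, m.group(2), m.group(3), m.group(4)


def entries_near(lines, center, window_seconds=60,
                 levels=("WARN", "ERROR", "FATAL")):
    """Collect parsed entries at the given levels that fall within
    window_seconds of center (hypothetical helper)."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        ts, level, _logger, _msg = entry
        if level in levels and abs(ts - center) <= window:
            hits.append(entry)
    return hits


# Shortened copies of log lines quoted in this thread.
SAMPLE = [
    "2013-04-22 16:47:21,843 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server",
    "2013-04-22 16:47:21,844 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down",
    "2013-04-22 16:47:21,900 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: Too many consecutive RollWriter requests",
    "2013-04-22 16:47:56,830 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction interrupted by user:",
]

if __name__ == "__main__":
    center = datetime(2013, 4, 22, 16, 47, 56)
    for ts, level, logger, msg in entries_near(SAMPLE, center):
        print(ts.isoformat(), level, logger.rsplit(".", 1)[-1], msg)
```

To use it on a real log, pass an open file object instead of `SAMPLE`. Running the same filter over the ZooKeeper and datanode logs for the same window may help show whether the session expiry lines up with other events, but that correlation still has to be checked by hand against the GC logs.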
kaveh minooie 2013-04-23, 04:47
Kevin Odell 2013-04-23, 10:15