NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
HBase >> mail # user >> help why do my regionservers shut themselves down?


help why do my regionservers shut themselves down?

Hi,

After a few MapReduce jobs, my regionservers shut themselves down. This
is the log from the most recent time it happened:

2013-04-22 16:47:21,843 INFO
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
This client just lost it's session with ZooKeeper, trying to reconnect.
2013-04-22 16:47:21,843 FATAL
org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region
server serverName=d1r1n17.prod.plutoz.com,60020,1366657358443,
load=(requests=5392, regions=196, usedHeap=1063, maxHeap=3966):
regionserver:60020-0x13dd980d2ab8661-0x13dd980d2ab8661
regionserver:60020-0x13dd980d2ab8661-0x13dd980d2ab8661 received expired
from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired
         at
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:352)
         at
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:270)
         at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:523)
         at
org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:499)
2013-04-22 16:47:21,843 INFO
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
Trying to reconnect to zookeeper.
2013-04-22 16:47:21,844 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
requests=1794, regions=196, stores=1561, storefiles=1585,
storefileIndexSize=104, memstoreSize=306, compactionQueueSize=10,
flushQueueSize=0, usedHeap=1073, maxHeap=3966, blockCacheSize=661986032,
blockCacheFree=169901776, blockCacheCount=7242,
blockCacheHitCount=910925, blockCacheMissCount=1558134,
blockCacheEvictedCount=1344753, blockCacheHitRatio=36,
blockCacheHitCachingRatio=40
2013-04-22 16:47:21,844 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED:
regionserver:60020-0x13dd980d2ab8661-0x13dd980d2ab8661
regionserver:60020-0x13dd980d2ab8661-0x13dd980d2ab8661 received expired
from ZooKeeper, aborting
2013-04-22 16:47:21,844 INFO org.apache.zookeeper.ClientCnxn:
EventThread shut down
2013-04-22 16:47:21,900 WARN
org.apache.hadoop.hbase.regionserver.wal.HLog: Too many consecutive
RollWriter requests, it's a sign of the total number of live datanodes
is lower than the tolerable replicas.
2013-04-22 16:47:22,341 INFO org.apache.zookeeper.ZooKeeper: Initiating
client connection, connectString=zk1:2181 sessionTimeout=180000
watcher=hconnection
2013-04-22 16:47:22,357 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Waiting on 1 regions
to close
2013-04-22 16:47:22,394 INFO org.apache.zookeeper.ClientCnxn: Opening
socket connection to server d1r2n2.prod.plutoz.com/10.0.0.66:2181. Will
not attempt to authenticate using SASL (unknown error)
2013-04-22 16:47:22,395 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to d1r2n2.prod.plutoz.com/10.0.0.66:2181,
initiating session
2013-04-22 16:47:22,397 INFO org.apache.zookeeper.ClientCnxn: Session
establishment complete on server d1r2n2.prod.plutoz.com/10.0.0.66:2181,
sessionid = 0x13dd980d2abbf93, negotiated timeout = 40000
2013-04-22 16:47:22,400 INFO
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
Reconnected successfully. This disconnect could have been caused by a
network partition or a long-running GC pause, either way it's
recommended that you verify your environment.
2013-04-22 16:47:22,400 INFO org.apache.zookeeper.ClientCnxn:
EventThread shut down
2013-04-22 16:47:56,830 INFO
org.apache.hadoop.hbase.regionserver.HRegion: compaction interrupted by
user:
java.io.InterruptedIOException: Aborting compaction of store f in region
t1_webpage,com.pandora.www:http/shaggy,1366670139658.9f565d5da3468c0725e590dc232abc23.
because user requested stop.
         at
org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:998)
         at
org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:779)
         at
org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:776)
         at
org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:721)
         at
org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:81)
2013-04-22 16:47:56,830 INFO
org.apache.hadoop.hbase.regionserver.HRegion: aborted compaction on
region
t1_webpage,com.pandora.www:http/shaggy,1366670139658.9f565d5da3468c0725e590dc232abc23.
after 5mins, 58sec
2013-04-22 16:47:56,830 INFO
org.apache.hadoop.hbase.regionserver.CompactSplitThread:
regionserver60020.compactor exiting
2013-04-22 16:47:56,832 INFO
org.apache.hadoop.hbase.regionserver.HRegion: Closed
t1_webpage,com.pandora.www:http/shaggy,1366670139658.9f565d5da3468c0725e590dc232abc23.
2013-04-22 16:47:57,363 INFO
org.apache.hadoop.hbase.regionserver.wal.HLog:
regionserver60020.logSyncer exiting
2013-04-22 16:47:57,366 INFO
org.apache.hadoop.hbase.regionserver.Leases: regionserver60020 closing
leases
2013-04-22 16:47:57,366 INFO
org.apache.hadoop.hbase.regionserver.Leases: regionserver60020 closed leases
2013-04-22 16:47:57,366 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020
exiting
2013-04-22 16:47:57,497 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook
starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-15,5,main]
2013-04-22 16:47:57,497 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown hook
2013-04-22 16:47:57,497 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown
hook thread.
2013-04-22 16:47:57,504 INFO
org.apache.hadoop.hbase.regionserver.Leases:
regionserver60020.leaseChecker closing leases
2013-04-22 16:47:57,504 INFO
org.apache.hadoop.hbase.regionserver.Leases:
regionserver60020.leaseChecker closed leases
2013-04-22 16:47:57,598 INFO
org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.
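
One detail in the log above that may be worth checking: the client asks for sessionTimeout=180000, but the server reports negotiated timeout = 40000. ZooKeeper servers bound the timeout they grant between minSessionTimeout and maxSessionTimeout, so a shorter negotiated value may mean the server-side cap is lower than what HBase requests. Below is a minimal Python sketch for pulling those two values out of a regionserver log. It assumes the two message formats shown above; the function names are hypothetical, and pairing values by order of appearance is a simplification that may not hold for logs with interleaved connections.

```python
import re

# Patterns for the two ZooKeeper client messages seen in the log above.
# Assumption: other HBase/ZooKeeper versions may word these differently.
REQUESTED = re.compile(r"sessionTimeout=(\d+)")
NEGOTIATED = re.compile(r"negotiated timeout = (\d+)")


def session_timeouts(log_text):
    """Return (requested_ms, negotiated_ms) pairs found in log_text, in order."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    requested = [int(m) for m in REQUESTED.findall(flat)]
    negotiated = [int(m) for m in NEGOTIATED.findall(flat)]
    # Simplification: pair the n-th request with the n-th negotiated value.
    return list(zip(requested, negotiated))


def report(log_text):
    """Describe each requested/negotiated pair and flag shortened sessions."""
    lines = []
    for req, neg in session_timeouts(log_text):
        note = "shortened by server" if neg < req else "as requested"
        lines.append(f"requested={req}ms negotiated={neg}ms ({note})")
    return lines


if __name__ == "__main__":
    sample = """
    2013-04-22 16:47:22,341 INFO org.apache.zookeeper.ZooKeeper: Initiating
    client connection, connectString=zk1:2181 sessionTimeout=180000
    watcher=hconnection
    2013-04-22 16:47:22,397 INFO org.apache.zookeeper.ClientCnxn: Session
    establishment complete on server d1r2n2.prod.plutoz.com/10.0.0.66:2181,
    sessionid = 0x13dd980d2abbf93, negotiated timeout = 40000
    """
    for line in report(sample):
        print(line)
```

Run against the excerpt above, this reports one pair where the negotiated timeout (40000 ms) is shorter than the requested one (180000 ms). Note that these two lines come from the hconnection client reconnecting, so they only suggest, rather than prove, what timeout the regionserver's own session was given.
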

I would appreciate it very much if