HBase >> mail # user >> Region Servers Crashing during Random Reads


RE: Region Servers Crashing during Random Reads
How much heap are you running on your RegionServers?

6GB of total RAM is on the low end.  For high throughput applications, I would recommend at least 6-8GB of heap (so 8+ GB of RAM).
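
To line up the GC trouble with the ZooKeeper timeouts, it can help to pull both kinds of events out of the logs. The script below is only a rough sketch: the function name `summarize` and both regular expressions are my own assumptions, written against the log excerpts quoted below, and are not part of HBase or ZooKeeper.

```python
import re

# Matches ZooKeeper client lines such as
# "have not heard from server in 47172ms for sessionid 0x12db9f722421ce3".
ZK_SILENCE = re.compile(
    r"have not heard from server in (\d+)ms for sessionid (0x[0-9a-f]+)"
)

# Matches the "promotion failed" marker that ParNew prints in the GC log.
PROMOTION_FAILED = re.compile(r"promotion\s+failed")


def summarize(log_text):
    """Count promotion failures and collect ZooKeeper silence per session.

    Mail clients wrap long log lines and prefix them with ">", so the text
    is flattened to a single line before matching.
    """
    flat = re.sub(r"\s*\n>?\s*", " ", log_text)
    silences = {sid: int(ms) for ms, sid in ZK_SILENCE.findall(flat)}
    return {
        "promotion_failed": len(PROMOTION_FAILED.findall(flat)),
        "zk_silence_ms": silences,
    }
```

If the reported silence for a session is longer than your configured `zookeeper.session.timeout`, the session will have expired on the ZooKeeper side, which is consistent with the region server aborting.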

> -----Original Message-----
> From: charan kumar [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, February 03, 2011 11:47 AM
> To: [EMAIL PROTECTED]
> Subject: Region Servers Crashing during Random Reads
>
> Hello,
>
> I am using HBase 0.90.0 with hadoop-append. Hardware: Dell 1950, 2 CPUs,
> 6 GB RAM.
>
> I had 9 region servers (out of 30) crash within a span of 30 minutes during
> heavy reads. It looks to me like a GC pause followed by a ZooKeeper
> connection timeout. I applied all of the recommended configuration from the
> HBase wiki. Any other suggestions?
>
>
> 2011-02-03T09:43:07.890-0800: 70693.632: [GC 70693.632: [ParNew (promotion failed):
> 5555K->5540K(5568K), 0.0280950 secs]70693.660:
> [CMS2011-02-03T09:43:16.864-0800: 70702.606: [CMS-concurrent-mark:
> 12.549/69.323 secs] [Times: user=11.90 sys=1.26, real=69.31 secs]
>
> 2011-02-03T09:53:35.165-0800: 71320.785: [GC 71320.785: [ParNew (promotion failed):
> 5568K->5568K(5568K), 0.4384530 secs]71321.224:
> [CMS2011-02-03T09:53:45.111-0800: 71330.731: [CMS-concurrent-mark:
> 17.511/51.564 secs] [Times: user=38.72 sys=5.67, real=51.60 secs]
>
>
> The following are the log entries from the region server:
>
> 2011-02-03 10:37:43,946 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 47172ms for sessionid
> 0x12db9f722421ce3, closing socket connection and attempting reconnect
> 2011-02-03 10:37:43,947 INFO org.apache.zookeeper.ClientCnxn: Client
> session timed out, have not heard from server in 48159ms for sessionid
> 0x22db9f722501d93, closing socket connection and attempting reconnect
> 2011-02-03 10:37:44,401 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server XXXXXXXXXXXXXXXX
> 2011-02-03 10:37:44,402 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to XXXXXXXXX, initiating session
> 2011-02-03 10:37:44,709 INFO org.apache.zookeeper.ClientCnxn: Opening
> socket connection to server XXXXXXXXXXXXXXX
> 2011-02-03 10:37:44,709 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to XXXXXXXXXXXXXXXXXXXXX, initiating session
> 2011-02-03 10:37:44,767 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> started; Attempting to free 81.93 MB of total=696.25 MB
> 2011-02-03 10:37:44,784 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction
> completed; freed=81.94 MB, total=614.81 MB, single=379.98 MB,
> multi=309.77 MB, memory=0 KB
> 2011-02-03 10:37:45,205 INFO org.apache.zookeeper.ClientCnxn: Unable to
> reconnect to ZooKeeper service, session 0x22db9f722501d93 has expired,
> closing socket connection
> 2011-02-03 10:37:45,206 INFO
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> This client just lost it's session with ZooKeeper, trying to reconnect.
> 2011-02-03 10:37:45,453 INFO
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
> Trying to reconnect to zookeeper
> 2011-02-03 10:37:45,206 INFO org.apache.zookeeper.ClientCnxn: Unable to
> reconnect to ZooKeeper service, session 0x12db9f722421ce3 has expired,
> closing socket connection
> gionserver:60020-0x22db9f722501d93 regionserver:60020-0x22db9f722501d93
> received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:328)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeep