HBase >> mail # user >> Hmaster and HRegionServer disappearance reason to ask


Re: Hmaster and HRegionServer disappearance reason to ask
Pablo, instead of CMSIncrementalMode try UseParNewGC. That seemed to be the silver bullet when I was dealing with HBase region server crashes.

Regards,
Dhaval
________________________________
From: Pablo Musa <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Thursday, 5 July 2012 5:37 PM
Subject: RE: Hmaster and HRegionServer disappearance reason to ask

I am having the same problem. I tried N different things but I cannot solve the problem.

hadoop-0.20.noarch      0.20.2+923.256-1
hadoop-hbase.noarch     0.90.6+84.29-1
hadoop-zookeeper.noarch 3.3.5+19.1-1

I already set:

        <property>
                <name>hbase.hregion.memstore.mslab.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>hbase.regionserver.handler.count</name>
                <value>20</value>
        </property>

But it does not seem to work.
How can I check whether these variables are really set in the HRegionServer?
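
[Editor's sketch] One partial way to approach the question above is to read the configuration file on the region server host and print what it contains. The snippet below is a minimal sketch in Python, not an HBase tool: the function name `read_hadoop_properties` and the `SAMPLE` text are made up for illustration, and the sample assumes the two properties sit inside the usual `<configuration>` root element. It only confirms what the file says. It does not prove that the running HRegionServer loaded that file.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Illustrative content only: the two properties quoted in the message,
# wrapped in an assumed <configuration> root element.
SAMPLE = """<configuration>
  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>20</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name, value in sorted(read_hadoop_properties(SAMPLE).items()):
        print(name, "=", value)
```

To use it against a real file, read the text of the hbase-site.xml that the region server's classpath points at and pass it to the function; the path depends on the installation.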

I am starting the server with the following:
-Xmx8192m -XX:NewSize=64m -XX:MaxNewSize=64m -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps

I am also having trouble reading regionserver.out:
[GC 72004.406: [ParNew: 55830K->2763K(59008K), 0.0043820 secs] 886340K->835446K(1408788K) icms_dc=0 , 0.0044900 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
[GC 72166.759: [ParNew: 55192K->6528K(59008K), 135.1102750 secs] 887876K->839688K(1408788K) icms_dc=0 , 135.1103920 secs] [Times: user=1045.58 sys=138.11, real=135.09 secs]
[GC 72552.616: [ParNew: 58977K->6528K(59008K), 0.0083040 secs] 892138K->847415K(1408788K) icms_dc=0 , 0.0084060 secs] [Times: user=0.05 sys=0.01, real=0.01 secs]
[GC 72882.991: [ParNew: 58979K->6528K(59008K), 151.4924490 secs] 899866K->853931K(1408788K) icms_dc=0 , 151.4925690 secs] [Times: user=0.07 sys=151.48, real=151.47 secs]

What does each part mean?
Is each line a GC cycle?
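
[Editor's sketch] Each line in the excerpt above records one young-generation (ParNew) collection. Reading left to right: the JVM uptime in seconds, the young generation's occupancy before and after the collection with its capacity in parentheses, the time that collection took, the whole heap's occupancy before and after with its capacity, the incremental CMS duty cycle (`icms_dc`), the total pause, and finally the CPU time spent in user and kernel mode next to the wall-clock (`real`) time. The Python sketch below pulls those fields out of one line. The regular expression is written only against the four lines quoted here and is an assumption about their shape, not a general GC log parser; the field names are invented for readability.

```python
import re

# Pattern fitted to the ParNew lines quoted in this message (an assumption,
# other collectors and JVM versions print different layouts).
GC_LINE = re.compile(
    r"\[GC (?P<uptime>[\d.]+): "
    r"\[ParNew: (?P<young_before_k>\d+)K->(?P<young_after_k>\d+)K"
    r"\((?P<young_capacity_k>\d+)K\), (?P<young_secs>[\d.]+) secs\] "
    r"(?P<heap_before_k>\d+)K->(?P<heap_after_k>\d+)K"
    r"\((?P<heap_capacity_k>\d+)K\)"
    r".*?, (?P<total_secs>[\d.]+) secs\] "
    r"\[Times: user=(?P<user>[\d.]+) sys=(?P<sys>[\d.]+), "
    r"real=(?P<real>[\d.]+) secs\]"
)

# Fields that hold sizes in kilobytes; everything else is seconds.
SIZE_FIELDS = {
    "young_before_k", "young_after_k", "young_capacity_k",
    "heap_before_k", "heap_after_k", "heap_capacity_k",
}


def parse_parnew_line(line):
    """Return a dict of fields for one ParNew log line, or None if no match."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    fields = {}
    for key, text in match.groupdict().items():
        fields[key] = int(text) if key in SIZE_FIELDS else float(text)
    return fields


# Second line of the excerpt, copied verbatim.
SAMPLE_LINE = (
    "[GC 72166.759: [ParNew: 55192K->6528K(59008K), 135.1102750 secs] "
    "887876K->839688K(1408788K) icms_dc=0 , 135.1103920 secs] "
    "[Times: user=1045.58 sys=138.11, real=135.09 secs]"
)

if __name__ == "__main__":
    for key, value in parse_parnew_line(SAMPLE_LINE).items():
        print(key, value)
```

For the sample line, the `real` field is the number to watch: it is the wall-clock time during which the application was stopped.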

Thanks,
Pablo
-----Original Message-----
From: Lars George [mailto:[EMAIL PROTECTED]]
Sent: Monday, 2 July 2012 06:43
To: [EMAIL PROTECTED]
Subject: Re: Hmaster and HRegionServer disappearance reason to ask

Hi lztaomin,

> org.apache.zookeeper.KeeperException$SessionExpiredException:
> KeeperErrorCode = Session expired

indicates that you have experienced the "Juliet Pause" issue, which means you ran into a JVM garbage collection that lasted longer than the configured ZooKeeper timeout threshold.
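
[Editor's sketch] The comparison described above can be written down in a few lines of Python. This is a simplified illustration, not how ZooKeeper itself tracks sessions: it just flags any pause longer than a given session timeout. The pause values are the `real` times from Pablo's regionserver.out excerpt in this thread; the timeout value is an assumed number chosen for the example and should be replaced with whatever the cluster actually uses.

```python
def pauses_exceeding_timeout(pause_seconds, session_timeout_ms):
    """Return the pauses (in seconds) that last longer than the timeout."""
    limit_seconds = session_timeout_ms / 1000.0
    return [pause for pause in pause_seconds if pause > limit_seconds]


# Wall-clock ("real") pause times from the four GC lines quoted in this thread.
REAL_PAUSES = [0.00, 135.09, 0.01, 151.47]

# Assumed value for illustration only, not taken from the thread.
ASSUMED_SESSION_TIMEOUT_MS = 60_000

if __name__ == "__main__":
    too_long = pauses_exceeding_timeout(REAL_PAUSES, ASSUMED_SESSION_TIMEOUT_MS)
    print("pauses longer than the session timeout:", too_long)
```

With the assumed 60-second timeout, the two multi-minute pauses are flagged and the two short ones are not.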

If you search for it on Google http://www.google.com/search?q=juliet+pause+hbase you will find quite a few pages explaining the problem, and what you can do to avoid this.

Lars

On Jul 2, 2012, at 10:30 AM, lztaomin wrote:

> Hi all,
>
>      My HBase cluster has 3 machines in total. Hadoop and HBase are installed on the same machines, and ZooKeeper is the one managed by HBase. After 3 months of operation it reported the error below, which caused the HMaster and HRegionServer processes to disappear. Please help me.
> Thanks
>
> The following is a log
>
> ABORTING region server serverName=datanode1,60020,1325326435553, load=(requests=332, regions=188, usedHeap=2741, maxHeap=8165): regionserver:60020-0x3488dec38a02b1 regionserver:60020-0x3488dec38a02b1 received expired from ZooKeeper, aborting
> Cause:
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:343)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:261)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
> 2012-07-01 13:45:38,707 INFO org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs for datanode1,60020,1325326435553
> 2012-07-01 13:45:38,756 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting 32