HBase >> mail # user >> RegionServers Crashing every hour in production env


Re: RegionServers Crashing every hour in production env
Hello guys,
I paused my investigation of the HBase ZK timeout for a while because of
other issues I had to handle, but I am back.

A very weird behavior that I would like your comments on is that my
RegionServers perform better (fewer crashes) under heavy load than
under light load.
That is, if I leave HBase alone with 50 requestsPerSecond for the
whole day, there are more crashes than if I run a mapred job every hour.
Another weird thing is the following:

RS startTime = Mon Apr 01 13:22:35 BRT 2013

[...]$ grep slept /var/log/hbase/hbase-hbase-regionserver-PSLBHDN00*.log
2013-04-01 20:09:21,135 WARN org.apache.hadoop.hbase.util.Sleeper: We
slept 45491ms instead of 3000ms, this is likely due to a long garbage
collecting pause and it's usually bad, see
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
2013-04-01 22:45:59,407 WARN org.apache.hadoop.hbase.util.Sleeper: We
slept 101271ms instead of 3000ms, this is likely due to a long garbage
collecting pause and it's usually bad, see
http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
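To line these warnings up against the GC log, it can help to turn each one into a time window. The following is a minimal sketch, assuming each warning sits on a single line in the real log file (the lines above are wrapped by the mail client) and that the warning is written roughly when the sleep ends. The function name and the returned dictionary keys are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches the Sleeper warning shown above; the layout is taken from the log excerpt.
SLEPT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) WARN .*Sleeper: "
    r"We slept (\d+)ms instead of (\d+)ms"
)

def parse_sleeper_warning(line):
    """Return the stall window for one Sleeper warning, or None if no match."""
    m = SLEPT_RE.match(line)
    if m is None:
        return None
    logged_at = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
    slept_ms = int(m.group(2))
    expected_ms = int(m.group(3))
    return {
        # Assumption: the warning is logged right after the sleep returns.
        "window_start": logged_at - timedelta(milliseconds=slept_ms),
        "window_end": logged_at,
        "overshoot_ms": slept_ms - expected_ms,
    }

if __name__ == "__main__":
    sample = (
        "2013-04-01 20:09:21,135 WARN org.apache.hadoop.hbase.util.Sleeper: "
        "We slept 45491ms instead of 3000ms, this is likely due to a long "
        "garbage collecting pause and it's usually bad"
    )
    print(parse_sleeper_warning(sample))
```

Any GC entry, or any other event on the machine, that falls between window_start and window_end is a candidate explanation for the stall.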

[...]$ egrep 'real=[1-9][0-9][0-9][0-9]*\.[0-9][0-9]'
/var/log/hbase/hbase-hbase-regionserver-PSLBHDN00*.out
* The report below comes from running the above command once per time range.
0.0 - 0.1  secs GCs = 5084
0.1 - 0.5  secs GCs = 9
0.5 - 1.0  secs GCs = 3
1.0 - 010  secs GCs = 0
010 - 100  secs GCs = 0
100 - 1000 secs GCs = 0

So, my script for getting every gc time ("real=... secs") says that
there is no gc that took longer than 1 second.
However the RS log says twice that the RS slept more than 40 seconds
instead of 3.
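The per-range counting described above could also be done in a single pass. This is a rough sketch, not the script actually used: it assumes the .out files contain HotSpot GC lines ending in "real=N.NN secs", and it reuses the bucket edges from the report. The function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Wall-clock part of a GC log line, e.g. "[Times: user=0.05 sys=0.00, real=0.02 secs]".
REAL_RE = re.compile(r"real=(\d+(?:\.\d+)?) secs")

# Bucket edges in seconds, copied from the report above.
BUCKETS = [(0.0, 0.1), (0.1, 0.5), (0.5, 1.0),
           (1.0, 10.0), (10.0, 100.0), (100.0, 1000.0)]

def bucket_gc_pauses(lines):
    """Count 'real=' times per bucket; lower edge inclusive, upper edge exclusive."""
    counts = Counter()
    for line in lines:
        for m in REAL_RE.finditer(line):
            secs = float(m.group(1))
            for lo, hi in BUCKETS:
                if lo <= secs < hi:
                    counts[(lo, hi)] += 1
                    break
    return counts

if __name__ == "__main__":
    # Hypothetical sample lines; in practice, iterate over the .out files.
    sample = [
        "[Times: user=0.05 sys=0.00, real=0.02 secs]",
        "[Times: user=1.10 sys=0.01, real=0.30 secs]",
    ]
    counts = bucket_gc_pauses(sample)
    for lo, hi in BUCKETS:
        print(f"{lo} - {hi} secs GCs = {counts[(lo, hi)]}")
```

A single pass avoids maintaining one regular expression per range, and any pause of 1 second or more shows up in the upper buckets without a separate grep.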

"this is likely due to a long garbage collecting pause", yes
this is likely, but I don't think it is the case.

The machine is a huge machine with 70GB RAM, 32 procs, light load,
no swap or iowait.

Any ideas?

Thanks,
Pablo

On 03/12/2013 12:43 PM, Pablo Musa wrote:
> Guys,
> thank you very much for the help.
>
> Yesterday I spent 14 hours trying to tune the whole cluster.
> The cluster is not ready yet and needs a lot of tuning, but at least it
> is working.
>
> My first big problem was namenode + datanode GC. They were not using
> CMS and thus were taking "incremental" time to run. It started at 0.01 ms
> and within 20 minutes was taking 150 secs.
> After setting CMS GC this time is much smaller, taking a maximum of 70 secs,
> which is VERY HIGH, but for now does not stop HBase.
>
> With this issue solved, it was clear that the RS was doing a long-pause GC,
> taking up to 220 secs. ZooKeeper expired the RS and it shut down.
> I tried a lot of different flag configurations (MORE than 20) and could not
> get small GCs. Eventually one would take more than 150 secs (the ZooKeeper
> timeout) and the RS would shut down.
>
> Finally I tried a config that so far (12 hours) is working, with a maximum
> GC time of 90 secs. That is of course a terrible problem since HBase is a
> database, but at least the cluster is stable while I tune it a
> little more.
>
> In my opinion, my biggest problem is having a few "monster" machines in the
> cluster instead of a bunch of commodity machines. I don't know if there are
> a lot of companies using this kind of machine inside a hadoop cluster, but
> a quick search on google did not turn up many tuning tips for big heap GCs.
>
> I guess my next step will be to search for big heap GC tuning.
>
> Back to some questions ;)
>
>   > You have ganglia or tsdb running?
>
> I use zabbix for now, and no, there is nothing going on when the big
> pause happens.
>
>   > When you see the big pause above, can you see anything going on on the
>   > machine? (swap, iowait, concurrent fat mapreduce job?)
>   > What were you doing when the long GC happened? Reading or writing? If
>   > reading, what is the block cache size?
>
> The CPU for the RS process goes to 100% and the logs "pause" until it
> gets out.
> Ex: [NewPar
>
> IO and SWAP are normal. There is no MR running, just normal database
> load, which is very low. I am probably doing reads AND writes to the
> database with the default block cache size.