HBase >> mail # user >> Long waiting loop for "Waiting for region servers count to settle" when doing HMaster failover


Long waiting loop for "Waiting for region servers count to settle" when doing HMaster failover
Hi Community,

While testing, I ran into this issue on HBase 0.94.3.

The cluster has 1 active HMaster, 1 backup HMaster, and 4 region servers.
I run a YCSB workload on it to load data. While the workload is running,
I manually kill -9 the active HMaster. The backup master seems to take
over the active role quickly, but then it loops on

"
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 0, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 0, slept for xxx ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 0, slept for xxx ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
...
...
...
<for about 5 - 7 mins looping on this log message>
...

INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 1, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.

INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 2, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 3, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 4, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.

"
Every time, the new active master loops on the above waiting message for
5 - 7 minutes before any region server checks in. Then, after this long
wait, all 4 region servers suddenly check in successfully.
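
To make the log fields concrete, here is a small simulation of a settle loop
driven by the same four parameters that appear in the message (minimum,
maximum, timeout, interval). This is only a sketch under assumptions: the
loop condition is my guess from the log text, not copied from the HBase
source, and wait_for_region_servers and count_at are hypothetical names. It
runs on a fake clock, so nothing actually sleeps.

```python
def wait_for_region_servers(count_at, min_to_start=1, max_to_start=2**31 - 1,
                            timeout_ms=4500, interval_ms=1500):
    """Simulate a settle loop on a fake clock.

    count_at(slept_ms) returns how many region servers have checked in
    after slept_ms milliseconds (a stand-in for the real check-in count).
    Returns (count, slept_ms, log_lines).
    """
    slept = 0
    count = count_at(slept)
    last_change = slept
    logs = []
    # Assumed condition: keep waiting while below the maximum and any of
    # these holds: too few servers, timeout not yet reached, or the count
    # changed less than one interval ago (not settled yet).
    while count < max_to_start and (
        count < min_to_start
        or slept < timeout_ms
        or slept - last_change < interval_ms
    ):
        logs.append(f"currently checked in {count}, slept for {slept} ms")
        slept += interval_ms  # fake sleep of one interval
        new_count = count_at(slept)
        if new_count != count:
            count = new_count
            last_change = slept
    return count, slept, logs


# Hypothetical scenario: no server checks in for 6 minutes, then all 4 do.
count, slept, logs = wait_for_region_servers(
    lambda t: 4 if t >= 360_000 else 0
)
print(count, slept, len(logs))
```

In this model the loop cannot exit while the count is below the minimum of 1,
no matter how far past the 4500 ms timeout it is, so the message repeats every
1500 ms until the first region server reports in. That would match the repeated
lines above, but it does not explain why the region servers take 5 - 7 minutes
to report to the new master.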

Any idea what causes this waiting loop? Thanks a lot for the advice~
-- Best Regards, Julian