HBase >> mail # user >> Hbase master failing to start after reaching 95% disk and expanding cluster


Re: Hbase master failing to start after reaching 95% disk and expanding cluster

It seems like your ZooKeeper quorum is in a bad state (possibly because it ran out of disk space in the past). Can you try stopping HBase, removing the ZooKeeper znode for HBase, and then starting HBase again?
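
In case it helps, here is a rough Python sketch of that sequence. It is not a tested recovery script. The install path /usr/lib/hbase, the znode /hbase (the default parent znode, which matches your log), and localhost:2181 are assumptions, so adjust them for your cluster. It also assumes an externally managed ZooKeeper ensemble; if HBase manages ZooKeeper for you, stop-hbase.sh stops ZooKeeper too and the znode removal step cannot connect. The "rmr" command is from the ZooKeeper 3.4 CLI (newer releases use "deleteall"), and the "ruok" probe may need to be whitelisted on newer ZooKeeper versions. By default the script only prints the commands.

```python
import socket
import subprocess


def zk_is_ok(host, port=2181, timeout=5.0):
    """Send ZooKeeper's 'ruok' four-letter command; a running server answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            return sock.recv(16) == b"imok"
    except OSError:
        # Refused connections and timeouts both mean "not reachable" here.
        return False


def recovery_plan(hbase_home="/usr/lib/hbase", znode="/hbase"):
    """Return the commands for: stop HBase, remove its znode, start HBase.

    hbase_home and znode are placeholders, not values taken from the cluster.
    """
    return [
        [f"{hbase_home}/bin/stop-hbase.sh"],
        [f"{hbase_home}/bin/hbase", "zkcli", "rmr", znode],
        [f"{hbase_home}/bin/start-hbase.sh"],
    ]


def run(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Assumed ZooKeeper address; replace with one of your quorum members.
    print("ZooKeeper answers ruok:", zk_is_ok("localhost"))
    run(recovery_plan(), dry_run=True)
```

Removing the znode discards the state HBase keeps in ZooKeeper, and HBase recreates it on the next start. Check the printed commands first, then call run(recovery_plan(), dry_run=False) once they look right for your setup.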
------------------------------
On Sat 29 Dec, 2012 5:14 PM IST Marco Gallotta wrote:

>Hi there
>
>I've been running an hbase cluster for several months, and it recently experienced problems as the nodes reached 95% disk capacity. I added an extra node, and now the master keeps crashing with the errors below. I also increased the disk capacity on each individual node after this, and the errors are the same. I tried removing the new node, and that doesn't help.
>
>There are similar errors in the regionserver and zookeeper logs, but they all seem to echo the master logs.
>
>Anything I can look at to help diagnose what the problem here is?
>
>hbase-root-master-analytics.log:
>Sat Dec 29 03:14:22 PST 2012 Starting master on analytics
>core file size          (blocks, -c) 0
>data seg size           (kbytes, -d) unlimited
>scheduling priority             (-e) 0
>file size               (blocks, -f) unlimited
>pending signals                 (-i) 59480
>max locked memory       (kbytes, -l) 64
>max memory size         (kbytes, -m) unlimited
>open files                      (-n) 1024
>pipe size            (512 bytes, -p) 8
>POSIX message queues     (bytes, -q) 819200
>real-time priority              (-r) 0
>stack size              (kbytes, -s) 8192
>cpu time               (seconds, -t) unlimited
>max user processes              (-u) 59480
>virtual memory          (kbytes, -v) unlimited
>file locks                      (-x) unlimited
>2012-12-29 03:14:24,601 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,614 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,622 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,631 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,636 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,643 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,651 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,665 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,675 INFO org.apache.hadoop.ipc.HBaseServer: Starting Thread-2
>2012-12-29 03:14:24,698 INFO org.apache.hadoop.ipc.HBaseServer: Starting IPC Server listener on 60000
>2012-12-29 03:14:25,322 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>2012-12-29 03:14:28,735 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>2012-12-29 03:14:32,797 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>2012-12-29 03:14:41,427 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>2012-12-29 03:14:41,427 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
>2012-12-29 03:14:41,428 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
>java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
>        at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1740)
>        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:146)
>        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)