Re: getting HBase up after an unexpected power failure - need some advice
Hey,

It looks like your ZooKeeper (ZK) data is corrupted. Stop HBase first, then stop ZK and restart it. If that also fails, wipe the data directory ZK uses (check the config for its location, for example zoo.cfg for standalone ZK nodes). ZK will recreate the data files and should then be able to move forward.
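
Before wiping anything, it may be worth checking whether each quorum server answers at all. Below is a rough sketch (illustrative only, not something shipped with HBase or ZooKeeper) that sends ZooKeeper's four-letter command "ruok" to each server; a server that is running without errors replies "imok". The hostnames and port are the ones from your message. The function names zk_ruok and check_quorum are made up for this example, and it assumes Python 3 on a machine that can reach the quorum nodes.

```python
import socket


def zk_ruok(host, port, timeout=3.0):
    """Send ZooKeeper's 'ruok' command; return the reply text, or None on failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            while True:
                # The server closes the connection after replying,
                # so an empty read marks the end of the answer.
                data = sock.recv(64)
                if not data:
                    break
                chunks.append(data)
            return b"".join(chunks).decode("ascii", "replace")
    except OSError:
        # Covers refused connections, unknown hosts, and timeouts.
        return None


def check_quorum(servers, probe=zk_ruok):
    """Map 'host:port' to True if that server answered 'imok'."""
    return {f"{host}:{port}": probe(host, port) == "imok" for host, port in servers}


if __name__ == "__main__":
    # Quorum members as listed in the original message.
    quorum = [("h05", 2182), ("h06", 2182), ("h09", 2182)]
    for server, ok in check_quorum(quorum).items():
        print(server, "answers imok" if ok else "NOT answering")
```

Note that "imok" only says the server process is up; it does not prove the server has joined the quorum. A server that accepts the connection but never answers would be consistent with the "likely server has closed socket" messages in your log.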

Cheers,
Lars
On Nov 30, 2011, at 7:51 AM, Taylor, Ronald C wrote:

> Hello folks,
>
> We have a small Hadoop/HBase cluster that lost power before HBase and Hadoop could be shut down.
>
> So – I am trying to bring the cluster back up. Hadoop comes back up fine, and the “hadoop fsck” says that the HDFS file system is healthy.
>
> However, when I then tried to bring up HBase, I got errors in the log file
>     hbase-hbase-master-h01.emsl.pnl.gov.log
>
> and the HBase monitoring web page does not come up at
>  http://h01.emsl.pnl.gov:60010/master.jsp
>
> The log file says “Failed to create /hbase”, and also “unable to read additional data from server”, “likely server has closed socket”, and “check quorum servers”, referring to the three nodes that I selected for the ZooKeeper quorum that manages our HBase instance at
>      h09:2182, h06:2182, h05:2182
>
> I rebooted the entire cluster again, after shutting down Hadoop using stop-all.sh. I then brought Hadoop back up and tried the HBase start command again:
>
>    /home/hbase/hbase/bin/start-hbase.sh
>
> Same errors seen. See the tail end of the log at bottom.
>
> We are running the Apache distribution, using Hadoop 0.20.2 and HBase 0.89.20100726. (Yep, I know we should upgrade and probably switch to the Cloudera stack – hope to do so soon – but, right now, could use some more immediate help).
>
> Can anybody give me some guidance as to what is going wrong?
>
> - Ron
>
>
> Ronald Taylor, Ph.D.
> Computational Biology & Bioinformatics Group
> Pacific Northwest National Laboratory (U.S. Dept of Energy/Battelle)
> Richland, WA 99352
> phone: (509) 372-6568
> email: [EMAIL PROTECTED]
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> HBase log output:
>
> 2011-11-29 22:14:01,345 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: <h05,h06,h09:/hbase,org.apache.hadoop.hbase.master.HMaster>Trying to read /hbase/master
> 2011-11-29 22:14:01,393 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server h06/192.168.200.26:2182
> 2011-11-29 22:14:01,393 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to h06/192.168.200.26:2182, initiating session
> 2011-11-29 22:14:01,393 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
> 2011-11-29 22:14:01,495 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: <h05,h06,h09:/hbase,org.apache.hadoop.hbase.master.HMaster>Failed to read org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
> 2011-11-29 22:14:01,495 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: <h05,h06,h09:/hbase,org.apache.hadoop.hbase.master.HMaster>Writing master address 192.168.200.21:60000 to znode /hbase/master
> 2011-11-29 22:14:01,894 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server h05/192.168.200.25:2182
> 2011-11-29 22:14:01,894 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to h05/192.168.200.25:2182, initiating session
> 2011-11-29 22:14:01,894 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
> 2011-11-29 22:14:01,997 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: <h05,h06,h09:/hbase,org.apache.hadoop.hbase.master.HMaster>Failed to create /hbase -- check quorum servers, currently=h09:2182,h06:2182,h05:2182
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase