Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase, mail # user - Region server crashes when using replication


Copy link to this message
-
Region server crashes when using replication
Eran Kutner 2011-03-22, 17:39
Hi,
I'm trying to use replication between two HBase clusters and I'm
encountering all kinds of crashes and weird behavior.

First, it seems that starting a region server when the peer ZKs are
not available will cause the server to fail to start:

2011-03-22 08:31:56,647 INFO
org.apache.hadoop.hbase.replication.ReplicationZookeeper: Replication
is now started
2011-03-22 08:31:56,668 WARN
org.apache.hadoop.hbase.zookeeper.ZKConfig:
java.net.UnknownHostException: haddop2-zk3
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:850)
        at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1201)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1154)
        at java.net.InetAddress.getAllByName(InetAddress.java:1084)
        at java.net.InetAddress.getAllByName(InetAddress.java:1020)
        at java.net.InetAddress.getByName(InetAddress.java:970)
        at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:206)
        at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:250)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:113)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:288)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:253)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectExistingPeers(ReplicationZookeeper.java:182)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.<init>(ReplicationZookeeper.java:142)
        at org.apache.hadoop.hbase.replication.regionserver.Replication.<init>(Replication.java:75)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.setupWALAndReplication(HRegionServer.java:1092)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:875)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1472)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:563)
        at java.lang.Thread.run(Thread.java:662)

2011-03-22 08:31:56,669 WARN
org.apache.hadoop.hbase.zookeeper.ZKConfig:
java.net.UnknownHostException: haddop2-zk2
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:850)
        at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1201)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1154)
        at java.net.InetAddress.getAllByName(InetAddress.java:1084)
        at java.net.InetAddress.getAllByName(InetAddress.java:1020)
        at java.net.InetAddress.getByName(InetAddress.java:970)
        at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:206)
        at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:250)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:113)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:288)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:253)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectExistingPeers(ReplicationZookeeper.java:182)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.<init>(ReplicationZookeeper.java:142)
        at org.apache.hadoop.hbase.replication.regionserver.Replication.<init>(Replication.java:75)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.setupWALAndReplication(HRegionServer.java:1092)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:875)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.tryReportForDuty(HRegionServer.java:1472)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:563)
        at java.lang.Thread.run(Thread.java:662)

2011-03-22 08:31:56,669 INFO org.apache.zookeeper.ZooKeeper:
Initiating client connection,
connectString=haddop2-zk3:2181,haddop2-zk2:2181,hadoop2-zk1:2181
sessionTimeout=180000 watcher=connection to cluster: 1
2011-03-22 08:31:56,670 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Failed
initialization
2011-03-22 08:31:56,670 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: Failed init
java.net.UnknownHostException: haddop2-zk3
        at java.net.InetAddress.getAllByName0(InetAddress.java:1158)
        at java.net.InetAddress.getAllByName(InetAddress.java:1084)
        at java.net.InetAddress.getAllByName(InetAddress.java:1020)
        at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:386)
        at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:331)
        at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:377)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:97)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:119)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.getPeer(ReplicationZookeeper.java:288)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectToPeer(ReplicationZookeeper.java:253)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.connectExistingPeers(ReplicationZookeeper.java:182)
        at org.apache.hadoop.hbase.replication.ReplicationZookeeper.<init>(ReplicationZookeeper.java:142)
        at org.apache.hadoop.hbase.replication.regionserver.Replication.<init>(Replication.java:75)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.setupWALAndReplication(HRegionServer.java:1092)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer