Re: replicating to oneself?
Himanshu and Nick,

Many thanks for your help. I don't have all the answers to Nick's
questions, since the deployment is built by another team and combines
a lot of other components such as ZooKeeper, Hadoop, HBase, Hive, Oozie, etc.

I followed Himanshu's suggestion and checked the hbase.id on two different
problematic clusters; they are different, so that seems normal to me. About
the deployment: I did a clean install (well, at least that was my intention)
and did not reuse existing znodes. The installation stops everything
(ZooKeeper, Hadoop, HBase, etc.), removes all the files and data, and then
installs everything, so there should be nothing left over.

Let me describe the current setup and my investigation so far. Rows can be
replicated from the correct cluster to the problematic cluster, but cannot be
replicated from the problematic one, EVEN though both have the same hbase.jar.

Problematic cluster:
name = bdvm134
/hbase/hbase.id =  $b13a0e3a-2bec-4e13-8b1d-043aa1a66443
> list_peers  (I added two peers just for debugging purposes)
 PEER_ID CLUSTER_KEY STATE
 6 hdtest014.svl.ibm.com:2181:/hbase ENABLED
 7 hdtest014.svl.ibm.com:2181:/hbase ENABLED

Correct cluster:
name = hdtest014
/hbase/hbase.id = ce41a00d-5b0c-44b2-8bf7-bfd35bda1d42
> list_peers
 PEER_ID CLUSTER_KEY STATE
 1 bdvm134.svl.ibm.com:2181:/hbase ENABLED
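
Each CLUSTER_KEY above has the form quorum:clientPort:znodeParent. As a minimal sketch (ClusterKey is a hypothetical helper for illustration, not an HBase class), the key can be split apart to confirm that a peer entry does not name the local cluster's own ZooKeeper host:

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of HBase): splits a peer CLUSTER_KEY into its parts. */
final class ClusterKey {
    final List<String> quorumHosts; // hosts before the first ':' (comma-separated)
    final int clientPort;           // ZooKeeper client port
    final String znodeParent;       // parent znode, e.g. /hbase

    private ClusterKey(List<String> quorumHosts, int clientPort, String znodeParent) {
        this.quorumHosts = quorumHosts;
        this.clientPort = clientPort;
        this.znodeParent = znodeParent;
    }

    /** Parses "host1,host2:port:/znode"; throws if the key does not have three parts. */
    static ClusterKey parse(String key) {
        String[] parts = key.trim().split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected quorum:port:/znode, got: " + key);
        }
        return new ClusterKey(Arrays.asList(parts[0].split(",")),
                              Integer.parseInt(parts[1]),
                              parts[2]);
    }

    /** True if the given hostname is one of the quorum hosts named in the key. */
    boolean quorumContains(String hostname) {
        for (String host : quorumHosts) {
            if (host.equalsIgnoreCase(hostname)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Peer 6 as listed on bdvm134; the local host should not appear in it.
        ClusterKey peer = ClusterKey.parse("hdtest014.svl.ibm.com:2181:/hbase");
        System.out.println(peer.quorumContains("bdvm134.svl.ibm.com")); // prints false
    }
}
```

For the peers listed on bdvm134 the check comes back false, so the peer definition itself does not point at the local cluster.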

I injected some debugging code into ReplicationSource.run():

public void run() {
  ....
    LOG.info("Replicating "+clusterId + " -> " + peerClusterId);

    Map<String, ReplicationPeer> peerList = zkHelper.getPeerClusters();

    for (Map.Entry<String, ReplicationPeer> peer : peerList.entrySet()) {
      LOG.info("Demai ---------------begin");
      String peerId_A = peer.getKey();
      ReplicationPeer rPeer = peer.getValue();
      try {
        LOG.info("clusterUUId = "
            + zkHelper.getUUIDForCluster(zkHelper.getZookeeperWatcher()));
        LOG.info("peerUUID = " + zkHelper.getPeerUUID(peerId_A));
      } catch (KeeperException e) {
        LOG.info("exception = " + e);
      }

      LOG.info("peerID = " + peerId_A);
      LOG.info("peer Value=" + rPeer.toString());

      List<ServerName> sList = zkHelper.getSlavesAddresses(peerId_A);
      for (ServerName sName : sList) {
        // value incorrect on problematic cluster
        LOG.info("sName = " + sName.getHostname());
      }
      LOG.info("Peer Cluster=" + rPeer.getClusterKey()
          + ",Peer ID = " + rPeer.getId());
      LOG.info("Demai ---------------end");
    }
...
}
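
The first LOG.info in the snippet above prints a line of the form "Replicating <clusterId> -> <peerClusterId>". As a rough sketch (the class and method names are hypothetical, and it assumes each log entry has been joined back onto a single line), regionserver logs could be scanned for entries where both IDs are identical, which is the symptom of a cluster replicating to itself:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log check (not part of HBase): flags "Replicating X -> X" entries. */
final class SelfReplicationCheck {
    // Captures the source and peer cluster IDs from the debug message.
    private static final Pattern REPLICATING =
        Pattern.compile("Replicating\\s+(\\S+)\\s+->\\s+(\\S+)");

    /** True if the line contains a "Replicating" message whose two IDs are equal. */
    static boolean isSelfReplication(String logLine) {
        Matcher m = REPLICATING.matcher(logLine);
        return m.find() && m.group(1).equals(m.group(2));
    }

    public static void main(String[] args) {
        String line = "Replicating b13a0e3a-2bec-4e13-8b1d-043aa1a66443 -> "
            + "b13a0e3a-2bec-4e13-8b1d-043aa1a66443";
        System.out.println(isSelfReplication(line)); // prints true
    }
}
```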

On the bdvm134 regionserver:
2013-11-01 10:20:44,757 DEBUG
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening
log for replication bdvm134.svl.ibm.com%2C60020%2C1383324585548.1383324589592
at 3073
2013-11-01 10:20:44,761 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
Replicating b13a0e3a-2bec-4e13-8b1d-043aa1a66443 ->
b13a0e3a-2bec-4e13-8b1d-043aa1a66443
2013-11-01 10:20:44,761 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Demai
---------------begin
2013-11-01 10:20:44,773 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
clusterUUId = b13a0e3a-2bec-4e13-8b1d-043aa1a66443
2013-11-01 10:20:44,777 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
peerUUID = b13a0e3a-2bec-4e13-8b1d-043aa1a66443
2013-11-01 10:20:44,777 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: peerID
= 6
2013-11-01 10:20:44,777 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: peer
Value=org.apache.hadoop.hbase.replication.ReplicationPeer@33bb33bb
2013-11-01 10:20:44,779 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: sName bdvm134.svl.ibm.com
2013-11-01 10:20:44,779 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Peer
Cluster=6,Peer ID = hdtest014.svl.ibm.com:2181:/hbase
2013-11-01 10:20:44,779 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Demai
---------------end
2013-11-01 10:20:44,779 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Demai
2013-11-01 10:20:44,786 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
clusterUUId = b13a0e3a-2bec-4e13-8b1d-043aa1a66443
2013-11-01 10:20:44,790 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
peerUUID = b13a0e3a-2bec-4e13-8b1d-043aa1a66443
2013-11-01 10:20:44,790 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: peerID
= 7
2013-11-01 10:20:44,790 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: peer
Value=org.apache.hadoop.hbase.replication.ReplicationPeer@710071
2013-11-01 10:20:44,792 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: sName *bdvm134.svl.ibm.com*
2013-11-01 10:20:44,792 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Peer
Cluster=7,Peer ID = *hdtest014.svl.ibm.com*:2181:/hbase
2013-11-01 10:20:44,792 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Demai
2013-11-01 10:20:44,794 DEBUG
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening
log for replication bdvm134.svl.ibm.com%2C60020%2C1383324585548.1383324589592
at 3073

On the hdtest014 regionserver:
2013-11-01 10:25:01,260 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
Replicating ce41a00d-5b0c-44b2-8bf7-bfd35bda1d42 ->
b13a0e3a-2bec-4e13-8b1d-043aa1a66443
2013-11-01 10:25:01,260 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Demai
2013-11-01 10:25:01,263 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
clusterUUId = ce41a00d-5b0c-44b2-8bf7-bfd35bda1d42
2013-11-01 10:25:01,279 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource:
peerUUID = b13a0e3a-2bec-4e13-8b1d-043aa1a66443
2013-11-01 10:25:01,279 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: peerID
= 1
2013-11-01 10:25:01,279 INFO
org.apache.hadoop.hbase.replication.reg