HBase, mail # dev - replicating to oneself?


Re: replicating to oneself?
Demai Ni 2013-11-01, 21:19
One of my co-workers figured out the problem: someone had put a zoo.cfg
under ./hbase/conf, which broke the quorum lookup.
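
For anyone hitting the same symptom, here is a minimal sketch of how one might check for a stray zoo.cfg. It assumes (as I understand HBase releases of this era) that a zoo.cfg found on the classpath takes precedence over the ZooKeeper settings in hbase-site.xml, and that the conf directory is on the classpath. The class and method names (ZooCfgCheck, findZooCfg, serverHosts) are made up for illustration and are not part of HBase.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper, not an HBase class: reports a zoo.cfg on the classpath.
public class ZooCfgCheck {

    /** Returns the location of a zoo.cfg visible on the classpath, or null if none. */
    static URL findZooCfg() {
        return ZooCfgCheck.class.getClassLoader().getResource("zoo.cfg");
    }

    /** Extracts host names from "server.N=host:peerPort:electionPort" entries. */
    static List<String> serverHosts(Properties zooCfg) {
        List<String> hosts = new ArrayList<>();
        // TreeSet gives a stable (lexicographic) order over the property names.
        for (String name : new TreeSet<>(zooCfg.stringPropertyNames())) {
            if (name.startsWith("server.")) {
                String value = zooCfg.getProperty(name).trim();
                hosts.add(value.split(":")[0]);
            }
        }
        return hosts;
    }

    public static void main(String[] args) throws IOException {
        URL url = findZooCfg();
        if (url == null) {
            System.out.println("No zoo.cfg on the classpath.");
            return;
        }
        Properties props = new Properties();
        try (InputStream in = url.openStream()) {
            props.load(in);
        }
        System.out.println("Found " + url + " with servers " + serverHosts(props)
            + ", clientPort=" + props.getProperty("clientPort"));
    }
}
```

Run it with the same classpath the region server uses (for example, with the HBase conf directory first on the classpath). If it reports a file whose servers differ from the peer in the cluster key, that file is a likely cause of the wrong quorum.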

Many thanks for your help.
On Fri, Nov 1, 2013 at 12:48 PM, Demai Ni <[EMAIL PROTECTED]> wrote:

> I injected more debug code into ReplicationPeer.
>
>  public ReplicationPeer(Configuration conf, String key,
>       String id) throws IOException {
>     this.conf = conf;
>     this.clusterKey = key;
>     this.id = id;
>     this.reloadZkWatcher();
>
>     LOG.info("Demai @ReplicationPeer : clusterkey=" + key + ",id=" + id);
>     // The quorum logged here is incorrect.
>     LOG.info("Demai @ReplicationPeer : this.zkw.quom ="
>         + this.zkw.getQuorum());
>     LOG.info("Demai @ReplicationPeer : this.zkw=" + this.zkw.toString());
>   }
>
>
> On the problematic cluster, ReplicationPeer.zkw.quorum is wrong: it points at
> the local cluster (bdvm134) instead of the peer (hdtest014):
>
> 2013-11-01 12:40:33,351 INFO
> org.apache.hadoop.hbase.replication.ReplicationPeer: Demai @ReplicationPeer
> : clusterkey=6,id=hdtest014.svl.ibm.com:2181:/hbase
> 2013-11-01 12:40:33,351 INFO
> org.apache.hadoop.hbase.replication.ReplicationPeer: Demai @ReplicationPeer
> : this.zkw.quom =*bdvm134.svl.ibm.com:2181*
> 2013-11-01 12:40:33,351 INFO
> org.apache.hadoop.hbase.replication.ReplicationPeer: Demai @ReplicationPeer
> : this.zkw=connection to cluster: hdtest014.svl.ibm.com:2181:/hbase
>
>
>
> On Fri, Nov 1, 2013 at 11:12 AM, Demai Ni <[EMAIL PROTECTED]> wrote:
>
>> Himanshu and Nick,
>>
>> many thanks for your help.  I don't have all the answers to Nick's
>> questions, since the deployment is built by another team and combined with
>> a lot of other components like zookeeper, hadoop, hbase, hive, oozie, etc.
>>
>> I followed Himanshu's suggestion and checked the hbase.id on the two
>> clusters; they are different, so that seems normal to me.
>> About the deployment: I did a clean install (at least that was my
>> intention) and am not re-using existing znodes. The installation stops
>> everything (zookeeper, hadoop, hbase, etc.), removes all the files and data,
>> and then installs everything, so there should be nothing left over.
>>
>> Let me describe the current setup and my investigation so far. Rows can be
>> replicated from the correct cluster to the problematic cluster, but not
>> from the problematic one, EVEN though both have the same hbase.jar.
>>
>> *Problematic Cluster:*
>> name = bdvm134
>> /hbase/hbase.id =  $b13a0e3a-2bec-4e13-8b1d-043aa1a66443
>> > list_peers  (I put two there just for debug purpose)
>>  PEER_ID CLUSTER_KEY STATE
>>  6 hdtest014.svl.ibm.com:2181:/hbase ENABLED
>>  7 hdtest014.svl.ibm.com:2181:/hbase ENABLED
>>
>>
>> *Correct Cluster:*
>> name = hdtest014
>> /hbase/hbase.id = ce41a00d-5b0c-44b2-8bf7-bfd35bda1d42
>> > list_peers
>>  PEER_ID CLUSTER_KEY STATE
>>  1 bdvm134.svl.ibm.com:2181:/hbase ENABLED
>>
>>
>> I injected some debugging code into ReplicationSource.run()
>> public void run() {
>>   ....
>>
>>     LOG.info("Replicating "+clusterId + " -> " + peerClusterId);
>>
>>     Map<String, ReplicationPeer> peerList = zkHelper.getPeerClusters();
>>
>>     for (Map.Entry<String, ReplicationPeer> peer : peerList.entrySet()) {
>>       LOG.info("Demai ---------------begin");
>>       String peerId_A = peer.getKey();
>>       ReplicationPeer rPeer = peer.getValue();
>>       try {
>>         LOG.info("clusterUUId = "
>>             + zkHelper.getUUIDForCluster(zkHelper.getZookeeperWatcher()));
>>         LOG.info("peerUUID = " + zkHelper.getPeerUUID(peerId_A));
>>       } catch (KeeperException e) {
>>         LOG.info("exception = " + e);
>>       }
>>
>>       LOG.info("peerID = " + peerId_A);
>>       LOG.info("peer Value=" + rPeer.toString());
>>
>>       List<ServerName> sList = zkHelper.getSlavesAddresses(peerId_A);
>>       for (ServerName sName : sList) {
>>         // The value is incorrect on the problematic cluster.
>>         LOG.info("sName = " + sName.getHostname());
>>       }
>>       LOG.info("Peer Cluster=" + rPeer.getClusterKey()
>>           + ",Peer ID = " + rPeer.getId());
>>       LOG.info("Demai ---------------end");