HBase >> mail # user >> HBaseAdmin#checkHBaseAvailable COST ABOUT 1 MINUTE TO CHECK A DEAD(OR NOT EXISTS) HBASE MASTER


Re: Re: HBaseAdmin#checkHBaseAvailable COST ABOUT 1 MINUTE TO CHECK A DEAD(OR NOT EXISTS) HBASE MASTER
jingych,

inline:

On Wed, Nov 13, 2013 at 7:06 PM, jingych <[EMAIL PROTECTED]> wrote:

>  Thanks, Esteban and Stack!
>
> As Esteban said, the problem was solved.
>
> My config is below:
> <code>
>  conf.setInt("hbase.client.retries.number", 1);
> conf.setInt("zookeeper.session.timeout", 5000);
> conf.setInt("zookeeper.recovery.retry", 1);
> conf.setInt("zookeeper.recovery.retry.intervalmill", 50);
> </code>
> But it still took 46 seconds.
> And the log printed:
> <log>
>
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/root-region-server
>
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/root-region-server
>
> </log>
> It still tried to establish the 4 connections above.
>

The client (via HConnectionManager) needs to set 3 watchers, one on each of
those znodes in ZK. Each attempt has a max timeout of 5 seconds (you have a
single ZK server), plus 10 seconds for the second attempt: 3 * (5 * 2^0) +
3 * (5 * 2^1) = 45 seconds, and the extra second should come from a
hardcoded sleep in the RPC implementation during a retry.
Setting zookeeper.recovery.retry=0 can make it fail faster, but in the case
of a transient failure you will have to handle the reconnection in your
code.
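
For illustration, here is a minimal sketch of that fail-fast approach (not
from the original thread: the helper name, retry count, and pause are made
up, and it assumes the static HBaseAdmin.checkHBaseAvailable(Configuration)
call discussed in this thread):

<code>
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class FailFastCheck {
    // Illustrative helper: fail fast inside the client/ZK layers and do the
    // retrying at the application level instead.
    static boolean waitForHBase(int attempts, long pauseMs) throws InterruptedException {
        Configuration conf = HBaseConfiguration.create();
        conf.setInt("hbase.client.retries.number", 1); // single RPC attempt
        conf.setInt("zookeeper.session.timeout", 5000);
        conf.setInt("zookeeper.recovery.retry", 0);    // no ZK-level retries: fail fast

        for (int i = 0; i < attempts; i++) {
            try {
                HBaseAdmin.checkHBaseAvailable(conf); // throws if master/ZK is unreachable
                return true;
            } catch (Exception e) {
                // With zookeeper.recovery.retry=0 a transient ZK hiccup surfaces
                // here, so the reconnection/retry is our responsibility.
                Thread.sleep(pauseMs);
            }
        }
        return false;
    }
}
</code>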

>
> Could you please explain why ZK does this? (I'm really new to the HBase
> world.)
> If I set the ZK session timeout to 1s, is that OK?
>

You *could*, but you don't want clients to overwhelm ZK by re-establishing
connections over and over.
> And what do you mean by "depending on the number of ZK servers you have
> running the socket level timeout in the client to a ZK server will be
> zookeeper.session.timeout/#ZKs"?
> Does it mean that if I have 3 ZooKeepers and zookeeper.session.timeout=5000,
> each connection will have a 5000/3 timeout?
>

That's correct: the timeout to establish a connection to ZK will be around
1.7 seconds (5000 milliseconds / 3) with 3 ZKs.
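
To make the arithmetic concrete, a small illustrative snippet (the quorum
hostnames below are hypothetical):

<code>
// Hypothetical 3-host quorum with a 5000 ms session timeout.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com");
conf.setInt("zookeeper.session.timeout", 5000);
// Per-server connect timeout ~= zookeeper.session.timeout / #ZKs
//                            = 5000 ms / 3 ~= 1666 ms.
</code>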
> I'm running ZK and the HBase Master on one node in pseudo-distributed mode.
>

> Best Regards!
>
> ------------------------------
>
> jingych
>
> 2013-11-14
>
>  *From:* Esteban Gutierrez <[EMAIL PROTECTED]>
> *Sent:* 2013-11-14 06:10
> *To:* Stack <[EMAIL PROTECTED]>
> *Cc:* Hbase-User <[EMAIL PROTECTED]>; jingych <[EMAIL PROTECTED]>
> *Subject:* Re: Re: HBaseAdmin#checkHBaseAvailable COST ABOUT 1 MINUTE TO CHECK
> A DEAD(OR NOT EXISTS) HBASE MASTER
>
> jingych,
>
> That timeout comes from ZooKeeper. Are you running ZK on the same node where
> you are running the HBase Master? If your environment needs to fail fast
> even for ZK connection timeouts, then you need to reduce
> zookeeper.recovery.retry.intervalmill and zookeeper.recovery.retry, since
> the retries are done with an exponential backoff (1 second, 2 seconds, 8
> seconds). Also, depending on the number of ZK servers you have running, the
> socket-level timeout from the client to a ZK server will be
> zookeeper.session.timeout/#ZKs.
>
> cheers,
> esteban.
>
>
>
>
>
>
>  --
> Cloudera, Inc.
>
>
>
> On Wed, Nov 13, 2013 at 7:21 AM, Stack <[EMAIL PROTECTED]> wrote:
>
>> More of the log and the version of HBase involved please.  Thanks.
>> St.Ack
>>
>>
>>  On Wed, Nov 13, 2013 at 1:07 AM, jingych <[EMAIL PROTECTED]> wrote:
>>
>>> Thanks, esteban!
>>>
>>> I've tried. But it did not work.
>>>
>>> I first load the custom hbase-site.xml, and then try to check the
>>> HBase server.
>>> So my code is like this:
>>> <code>
>>> conf.setInt("hbase.client.retries.number", 1);
>>> conf.setInt("hbase.client.pause", 5);
>>> conf.setInt("ipc.socket.timeout", 5000);
>>> conf.setInt("hbase.rpc.timeout", 5000);
>>> </code>
>>>
>>> The log prints: Sleeping 4000ms before retry #2...
>>>
>>> If the ZooKeeper quorum is the wrong address, the process will take