Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase >> mail # dev >> 5 seconds pause connecting to ZK in unit tests


Copy link to this message
-
Re: 5 seconds pause connecting to ZK in unit tests
Can you disable IPv6 on that machine and see if it will connect over IPv4
or set the java setting(don't remember off the top of my head) to prefer
IPv4?

On Wed, Nov 7, 2012 at 1:18 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]>wrote:

> Or if I use the shell on my machine:
>
> 12/11/07 10:17:15 INFO zookeeper.RecoverableZooKeeper: Running exists
> on /hbase/master
> 12/11/07 10:17:20 INFO zookeeper.ClientCnxn: Socket connection
> established to fe80:0:0:0:0:0:0:1%1/fe80:0:0:0:0:0:0:1%1:2181,
> initiating session
> 12/11/07 10:17:20 DEBUG zookeeper.ClientCnxn: Session establishment
> request sent on fe80:0:0:0:0:0:0:1%1/fe80:0:0:0:0:0:0:1%1:2181
> 12/11/07 10:17:20 INFO zookeeper.ClientCnxn: Session establishment
> complete on server fe80:0:0:0:0:0:0:1%1/fe80:0:0:0:0:0:0:1%1:2181,
> sessionid = 0x13adc16563a0003, negotiated timeout = 40000
> 12/11/07 10:17:20 DEBUG zookeeper.ClientCnxn: Reading reply
> sessionid:0x13adc16563a0003, packet:: clientPath:null serverPath:null
> finished:false header:: 1,3  replyHeader:: 1,36,0  request::
> '/hbase/master,T  response::
> s{10,10,1352312182945,1352312182945,0,0,0,88625131181178880,73,0,10}
> 12/11/07 10:17:20 INFO zookeeper.RecoverableZooKeeper: Done with
> exists on /hbase/master
>
> This sounds DNS-y...
>
> On Wed, Nov 7, 2012 at 10:13 AM, Jean-Daniel Cryans <[EMAIL PROTECTED]>
> wrote:
> > Hey devs,
> >
> > I'm wondering if anyone else saw this. I was running TestReplication
> > on my machine and I saw it failing in testDeleteTypes trying to
> > replicate _3_ rows:
> >
> > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 59.62
> > sec <<< FAILURE!
> > testDeleteTypes(org.apache.hadoop.hbase.replication.TestReplication)
> > Time elapsed: 0.081 sec  <<< FAILURE!
> > java.lang.AssertionError: Waited too much time for put replication
> >         at org.junit.Assert.fail(Assert.java:93)
> >         at
> org.apache.hadoop.hbase.replication.TestReplication.testDeleteTypes(TestReplication.java:242)
> >
> > Digging down, I was able to jstack while the test is waiting for
> > replication and I see:
> >
> > Thread[REPL IPC Server handler 0 on 62520,5,main]
> >         at java.lang.Object.wait(Native Method)
> >         at java.lang.Object.wait(Object.java:485)
> >         at
> org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1259)
> >         at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:998)
> >         at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:171)
> >
> > It stays there the whole time. Adding debug around exists I see:
> >
> > 2012-11-07 10:01:15,759 INFO  [REPL IPC Server handler 0 on 62520]
> > zookeeper.RecoverableZooKeeper(170): Running exists on /2/master
> > 2012-11-07 10:01:20,244 INFO  [REPL IPC Server handler 0 on 62520]
> > zookeeper.RecoverableZooKeeper(184): Done with exists on /2/master
> >
> > If I turn on debug for ZK I see that ZK doesn't process the exists
> > until a few ms before the call returns.
> >
> > FWIW I see other exits() hanging for about 5 seconds in the log, and
> > it always seems to happen on the first call after the session is
> > started.
> >
> > Elliott saw the same issue with his machine yesterday but I'm
> > wondering if anyone else saw it and/or have a clue.
> >
> > Thx,
> >
> > J-D
>

--
Kevin O'Dell
Customer Operations Engineer, Cloudera