"Client session timed out, have not heard from server in 143198ms", etc.
Hi folks,

I've been seeing a lot of "client session timed out" messages in my server
logs recently. Normally those wouldn't concern me too much, since there are
occasional network outages. However, some of the reported timeout values
are abnormally large.

I've been using 40s as the session timeout value for my ZooKeeper sessions,
and the session establishment logs on the client confirm 40s as the
negotiated timeout. However, I sometimes see timeout logs reporting times
far longer than 40s, e.g. the one in the title.

From reading ZooKeeper's source code (I'm using v3.4.4), it seems there's
no way the idle time reported by clientCnxnSocket.getIdleRecv() could be
much more than 2/3 * sessionTimeout (26.66s in my case) when the session
timeout fires. My theory is this: say the ZooKeeper client receives a ping
from the server at time t, and ClientCnxn.SendThread schedules the next
doTransport() at t + 26.66s. Then the worst that could happen is that
nothing arrives from the server for 26.66s, and I'd get a session timed out
message with something like 26667ms, which is quite common during a network
outage. However, sometimes I'm getting timeouts of >40s and even >100s,
and I just can't understand them.
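
The arithmetic behind that expectation can be sketched as below. This is
only a simplified model of the reasoning above, not ZooKeeper's code: the
function names read_timeout_ms and excess_over_bound_ms are hypothetical,
and integer division is an assumption about how the 2/3 factor is applied.

```python
# Simplified model of the 2/3 * sessionTimeout reasoning; not ZooKeeper code.
# Both helper names are hypothetical and exist only for this illustration.

def read_timeout_ms(session_timeout_ms: int) -> int:
    """Two thirds of the negotiated session timeout (integer math assumed)."""
    return session_timeout_ms * 2 // 3


def excess_over_bound_ms(reported_idle_ms: int, session_timeout_ms: int) -> int:
    """How far a logged idle time exceeds the expected 2/3 bound, in ms."""
    return reported_idle_ms - read_timeout_ms(session_timeout_ms)


session_timeout_ms = 40_000  # the 40s negotiated timeout from the logs

# Expected bound on the idle time when the timeout fires.
print(read_timeout_ms(session_timeout_ms))                # 26666

# A typical outage message (26667ms) sits right at the bound.
print(excess_over_bound_ms(26_667, session_timeout_ms))   # 1

# The value from the title is far beyond it.
print(excess_over_bound_ms(143_198, session_timeout_ms))  # 116532
```

Under this model the 143198ms message is almost two minutes past the point
where the timeout would be expected to fire, which is the gap the question
is about.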

Any clues on how these can happen?

Best Regards,
Martin Kou