trying to understand the keep alive code
Brian Tarbox 2013-02-27, 19:03
I've been trying to debug a problem with session timeouts when running an app that does tons of non-ZooKeeper-related I/O. That I/O pressure has led to lots of ZooKeeper timeouts, which in turn has led me to try to understand the timeout code.

I am puzzled because the timeout code seems far more complicated than it might need to be. I'm not trying to critique the code so much as wondering if I'm missing something. Rather than simply spinning up a thread that sends a ping every so often, the code seems to go to great lengths to avoid sending pings and even to avoid calling System.currentTimeMillis (which it uses to determine if it's really time to send a ping). While I'm a big fan of efficiency, the resulting code seems convoluted. Help me see what I'm missing.

The main keep alive loop is basically the following code:

    int to = readTimeout - clientCnxnSocket.getIdleRecv();
    int timeToNextPing = (readTimeout / 2) - clientCnxnSocket.getIdleSend();
    if (timeToNextPing <= 0) {
        sendPing();
        clientCnxnSocket.updateLastSend();
    } else {
        if (timeToNextPing < to)
            to = timeToNextPing;
    }
    // blocks for "to" ms in a select, then processes I/O and can take tens of seconds
    clientCnxnSocket.doTransport(to);

There is added complexity in that getIdleSend and getIdleRecv maintain separate counters of how long it's been since a send or receive, but only relative to "now", which represents the current time and is in turn updated by various methods when various things happen.

I guess my question boils down to: is the system really so sensitive to an extra call to check the time, or to an extra ping packet every few seconds, that the complexity of the code in ClientCnxn.java is justified? I ask because I've seen:

a) "now" get significantly skewed from the actual time
b) under heavy I/O pressure, doTransport take 20, 30 or 40 seconds, causing session timeouts
c) increasing the client timeout mostly not help, because the code increases "to" based on the timeout

Again, sorry to be critical, but I'm really asking in the spirit of trying to understand.

Thanks.
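To make point (c) concrete, here is a minimal standalone sketch of just the timeout arithmetic from the loop above. It is not the actual ClientCnxn code: the class KeepAliveSketch, the Decision holder and the decide method are hypothetical names made up for illustration, and the idle times are passed in as plain numbers instead of being read from clientCnxnSocket.

```java
public final class KeepAliveSketch {

    /** Hypothetical holder for the outcome of one pass through the loop arithmetic. */
    public static final class Decision {
        public final boolean sendPing;       // true when a ping would be sent on this pass
        public final int selectTimeoutMs;    // the "to" value that would be handed to doTransport

        Decision(boolean sendPing, int selectTimeoutMs) {
            this.sendPing = sendPing;
            this.selectTimeoutMs = selectTimeoutMs;
        }
    }

    /**
     * Mirrors the arithmetic quoted in the post. idleRecv and idleSend stand in
     * for getIdleRecv() and getIdleSend(); all values are in milliseconds.
     */
    public static Decision decide(int readTimeout, int idleRecv, int idleSend) {
        int to = readTimeout - idleRecv;
        int timeToNextPing = (readTimeout / 2) - idleSend;
        boolean sendPing = timeToNextPing <= 0;
        if (!sendPing && timeToNextPing < to) {
            // No ping is due yet, so wake up in time for the next one.
            to = timeToNextPing;
        }
        return new Decision(sendPing, to);
    }

    public static void main(String[] args) {
        // Same idle times, two different read timeouts.
        Decision small = decide(20000, 1000, 1000);
        Decision large = decide(40000, 1000, 1000);
        System.out.println(small.sendPing + " " + small.selectTimeoutMs); // false 9000
        System.out.println(large.sendPing + " " + large.selectTimeoutMs); // false 19000
    }
}
```

With identical idle times, doubling readTimeout from 20000 to 40000 raises the computed wait from 9000 ms to 19000 ms, which is the effect described in (c): a larger timeout also means a longer "to" handed to doTransport.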
http://about.me/BrianTarbox