ZooKeeper user mailing list: adding a separate thread to detect network timeouts faster


Thread messages:
  Jeremy Stribling     2013-09-10, 20:01
  Ted Dunning          2013-09-10, 20:31
  Jeremy Stribling     2013-09-10, 20:34
  mattdaumen@...       2013-09-10, 20:45
  Jeremy Stribling     2013-09-10, 20:48
  Ted Dunning          2013-09-10, 20:59
  Ted Dunning          2013-09-10, 21:04
  Jeremy Stribling     2013-09-10, 21:05
  German Blanco        2013-09-11, 05:40
  Jeremy Stribling     2013-09-11, 06:32
  Michi Mutsuzaki      2013-09-11, 20:36

RE: adding a separate thread to detect network timeouts faster

AFAIK, ping requests do not involve any disk I/O, but they do go through the RequestProcessor chain and are executed sequentially.
There can be cases where another set of requests is already queued for committing (say, requests that need database/disk operations). If a ping request then arrives from the client, it is queued at the end of that queue. In such cases the ping request processing is delayed, resulting in slow responses.

Here the server is slow because of I/O response time, and that affects the client ping responses. In any case, after seeing the ping failure the client would look for another server.

Earlier I tried preventing ping requests from entering the RequestProcessor chain and instead sending the response directly back to the client. That has the disadvantage of violating the request lifecycle. The interesting question is how to differentiate between servers that are merely slow and servers that are really down...
-Rakesh
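
To make the queueing effect described above concrete, here is a minimal sketch (hypothetical class, not the actual ZooKeeper RequestProcessor code): pings share one sequential queue with disk-bound commit requests, so a slow disk write delays the ping response even though the ping itself needs no I/O.

import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: requests are served strictly in arrival order, so a PING
// queued behind slow COMMITs is answered late even though it does no disk I/O.
public class SequentialProcessorSketch {
    enum Type { COMMIT, PING }
    record Request(Type type, long enqueuedAt) {}

    private final LinkedBlockingQueue<Request> queue = new LinkedBlockingQueue<>();

    public void submit(Request r) throws InterruptedException {
        queue.put(r);                  // a ping lands behind any pending commits
    }

    public void processLoop() throws InterruptedException {
        while (true) {
            Request r = queue.take();  // strictly sequential processing
            if (r.type() == Type.COMMIT) {
                Thread.sleep(2000);    // pretend the fsync/disk write takes 2 seconds
            }
            long waitedMs = System.currentTimeMillis() - r.enqueuedAt();
            System.out.println(r.type() + " served after waiting " + waitedMs + " ms");
        }
    }
}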

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michi Mutsuzaki
Sent: 12 September 2013 02:07
To: [EMAIL PROTECTED]
Cc: German Blanco
Subject: Re: adding a separate thread to detect network timeouts faster

Slow disk does affect client <-> server ping requests since ping requests go through the commit processor.

Here is how the current client <-> server ping request works. Say the session timeout is set to 30 seconds.

1. The client sends a ping request if the session has been inactive for 10 seconds (1/3 of the session timeout).
2. The client waits for a ping response for another 10 seconds (1/3 of the session timeout).
3. If the client doesn't receive a ping response within those 10 seconds, it tries to connect to another server.

So in this case, it can take up to 20 seconds for the client to detect a server failure. I think this 1/3 value is picked somewhat arbitrarily. Maybe you can make this configurable for faster failure detection instead of introducing another heartbeat mechanism?

--Michi
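
As a back-of-the-envelope sketch of the arithmetic Michi describes (the fractions mirror his description; the actual constants live in the ZooKeeper client code and may be computed slightly differently):

public class PingTimingSketch {
    public static void main(String[] args) {
        int sessionTimeoutMs = 30_000;                     // 30-second session timeout

        int idleBeforePingMs  = sessionTimeoutMs / 3;      // 10 s of inactivity, then send a ping
        int waitForResponseMs = sessionTimeoutMs / 3;      // wait another 10 s for the reply

        int worstCaseDetectionMs = idleBeforePingMs + waitForResponseMs;
        System.out.println("Ping sent after " + idleBeforePingMs + " ms of inactivity");
        System.out.println("Client gives up and tries another server after up to "
                + worstCaseDetectionMs + " ms");           // ~20 s in the worst case
    }
}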
On Tue, Sep 10, 2013 at 11:32 PM, Jeremy Stribling <[EMAIL PROTECTED]> wrote:
> Hi Germán,
>
> A very quick scan of that JIRA makes me think you're talking about
> server->server heartbeats, and not client->server heartbeats (which is what
> I'm talking about).  I have not tested it explicitly or inspected that
> part of the code, but I've hit many cases in testing and production
> where client session expirations coincide with long fsync times as logged by the server.
>
> Jeremy
>
>
> On 09/10/2013 10:40 PM, German Blanco wrote:
>>
>> Hello Jeremy and all,
>>
>> my idea was that the current implementation of ping handling already
>> does not wait on disk IO.
>> I am even working on a JIRA case that is related to this:
>> https://issues.apache.org/jira/browse/ZOOKEEPER-87
>> And I have also made some tests that seem to confirm that ping
>> handling is done in a different thread than transaction handling.
>> But actually, I don't have any confirmation from any person in this
>> project. Are you sure that ping handling waits on IO for anything?
>> Have you tested it?
>>
>> Regards,
>> Germán Blanco.
>>
>>
>>
>> On Tue, Sep 10, 2013 at 11:05 PM, Jeremy Stribling <[EMAIL PROTECTED]>
>> wrote:
>>
>>> Good suggestion, thanks.  At the very least, I think what we have in
>>> mind would be off by default, so users could only turn it on if they
>>> know they have relatively few clients and slow disks.  An adaptive
>>> scheme would be even better, obviously.
>>>
>>>
>>> On 09/10/2013 02:04 PM, Ted Dunning wrote:
>>>
>>>> Perhaps you should be suggesting a design that is adaptive rather
>>>> than configured and guarantees low overhead at the cost of
>>>> notification time in extreme scenarios.
>>>>
>>>> For instance, the server can send no more than 1000 (or whatever
>>>> number) HB's per second and never more than one per second to any
>>>> client.  This caps the cost nicely.
>>>>
>>>>
>>>>
>>>> On Tue, Sep 10, 2013 at 1:59 PM, Ted Dunning
>>>> <[EMAIL PROTECTED]<mailto:
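
Ted's rate-cap suggestion quoted above could be sketched roughly as follows (hypothetical class and constants, not part of ZooKeeper): a global budget of at most 1000 server-initiated heartbeats per second, and never more than one heartbeat per second to any single client.

import java.util.HashMap;
import java.util.Map;

// Illustrative only: caps server-initiated heartbeats globally and per client.
public class HeartbeatRateCap {
    private static final int  MAX_GLOBAL_PER_SECOND      = 1000;
    private static final long MIN_PER_CLIENT_INTERVAL_MS = 1000;

    private long currentSecond  = 0;
    private int  sentThisSecond = 0;
    private final Map<Long, Long> lastSentToClient = new HashMap<>();

    // Returns true if a heartbeat may be sent to this session right now.
    public synchronized boolean tryAcquire(long sessionId) {
        long nowMs  = System.currentTimeMillis();
        long second = nowMs / 1000;
        if (second != currentSecond) {     // new one-second window: reset the global budget
            currentSecond  = second;
            sentThisSecond = 0;
        }
        Long last = lastSentToClient.get(sessionId);
        if (last != null && nowMs - last < MIN_PER_CLIENT_INTERVAL_MS) {
            return false;                  // never more than one heartbeat per second per client
        }
        if (sentThisSecond >= MAX_GLOBAL_PER_SECOND) {
            return false;                  // global cap reached; defer this heartbeat
        }
        sentThisSecond++;
        lastSentToClient.put(sessionId, nowMs);
        return true;
    }
}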
Later messages in this thread:
  Michi Mutsuzaki      2013-09-12, 18:05
  Rakesh R             2013-09-13, 06:24