Zookeeper, mail # user - hang in zookeeper_close() in the mt C client


Jeremy Stribling 2011-01-29, 19:23
Jeremy Stribling 2011-01-31, 18:26
Patrick Hunt 2011-02-01, 01:05
Jeremy Stribling 2011-02-01, 02:09
Patrick Hunt 2011-02-01, 18:00
Jeremy Stribling 2011-02-01, 20:23
Patrick Hunt 2011-02-01, 21:24

Re: hang in zookeeper_close() in the mt C client
Michi Mutsuzaki 2011-02-01, 21:33
I'll take a look and see if I can reproduce it.

--Michi

On 2/1/11 1:24 PM, "Patrick Hunt" <[EMAIL PROTECTED]> wrote:

> Great. I marked it as critical given the client hangs. If someone has
> a chance to look at what this might be (mahadev or michi?) it would be
> great to get a fix in.
>
> Patrick
>
> On Tue, Feb 1, 2011 at 12:23 PM, Jeremy Stribling <[EMAIL PROTECTED]> wrote:
>> Ok, done:
>>
>> https://issues.apache.org/jira/browse/ZOOKEEPER-981
>>
>> On 02/01/2011 10:00 AM, Patrick Hunt wrote:
>>>
>>> Hi Jeremy. Nothing comes to mind. I searched around on jira a bit and
>>> nothing there pops out at me either.
>>>
>>> I'd encourage you to create a jira regardless, add the details you
>>> have available currently and if you are able to reproduce attach
>>> additional information.
>>>
>>> Regards,
>>>
>>> Patrick
>>>
>>> On Mon, Jan 31, 2011 at 6:09 PM, Jeremy Stribling<[EMAIL PROTECTED]>
>>>  wrote:
>>>
>>>>
>>>> I haven't been able to reproduce it, but if I do I will update again with
>>>> more details and make a JIRA.  I was hoping someone might just know
>>>> something off the top of their head.  Thanks!
>>>>
>>>> On 01/31/2011 05:05 PM, Patrick Hunt wrote:
>>>>
>>>>>
>>>>> Hi Jeremy, that is unusual. Is it reproducible? Do you have details
>>>>> on the stack for threads other than this thread doing the close?
>>>>>
>>>>> It would be best if you could create a JIRA for this, the more detail
>>>>> you could provide the better (full stacks and any log files)
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER
>>>>>
>>>>> Patrick
>>>>>
>>>>> On Mon, Jan 31, 2011 at 10:26 AM, Jeremy Stribling<[EMAIL PROTECTED]>
>>>>>  wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> I responded to someone off-list about this, but I just wanted to
>>>>>> clarify to everyone that the part of the backtrace that isn't shown is
>>>>>> entirely within my application, and zookeeper_close isn't being called
>>>>>> from any Zookeeper completion thread.
>>>>>>
>>>>>> On 01/29/2011 11:23 AM, Jeremy Stribling wrote:
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> I use the multithreaded ZK C client library (3.3.2), and I'm seeing my
>>>>>>> application hang, and the only thread in it that's doing anything
>>>>>>> interesting is this one:
>>>>>>>
>>>>>>> Thread 8 (Thread 5644):
>>>>>>> #0  0x00007f5d7bb5bbe4 in __lll_lock_wait () from /lib/libpthread.so.0
>>>>>>> #1  0x00007f5d7bb59ad0 in pthread_cond_broadcast@@GLIBC_2.3.2 () from /lib/libpthread.so.0
>>>>>>> #2  0x00007f5d793628f6 in unlock_completion_list (l=0x32b4d68) at .../zookeeper/src/c/src/mt_adaptor.c:66
>>>>>>> #3  0x00007f5d79354d4b in free_completions (zh=0x32b4c80, callCompletion=1, reason=-116) at .../zookeeper/src/c/src/zookeeper.c:1069
>>>>>>> #4  0x00007f5d79355008 in cleanup_bufs (zh=0x32b4c80, callCompletion=1, rc=-116) at .../thirdparty/zookeeper/src/c/src/zookeeper.c:1125
>>>>>>> #5  0x00007f5d79353200 in destroy (zh=0x32b4c80) at .../thirdparty/zookeeper/src/c/src/zookeeper.c:366
>>>>>>> #6  0x00007f5d79358e0e in zookeeper_close (zh=0x32b4c80) at .../zookeeper/src/c/src/zookeeper.c:2326
>>>>>>> #7  0x00007f5d79356d18 in api_epilog (zh=0x32b4c80, rc=0) at .../zookeeper/src/c/src/zookeeper.c:1661
>>>>>>> #8  0x00007f5d79362f2f in adaptor_finish (zh=0x32b4c80) at .../zookeeper/src/c/src/mt_adaptor.c:205
>>>>>>> #9  0x00007f5d79358c8c in zookeeper_close (zh=0x32b4c80) at .../zookeeper/src/c/src/zookeeper.c:2297
>>>>>>> ....
>>>>>>>
>>>>>>> I've seen some threads online about how there's a race condition
>>>>>>> associated with zookeeper_close, where if your app is making a
>>>>>>> synchronous call at the same time using the closed zk_handle, there
>>>>>>> could be a hang. However, my app makes no synchronous calls, and I'm
>>>>>>> 99% sure that no other thread in my app is making any concurrent call
>>>>>>> into the library ('thread
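
For readers who want to rule out the close-versus-concurrent-call race described in the quoted message, here is a minimal application-side sketch in C. It is not part of the ZooKeeper C client and is not a fix for the hang reported in this thread. Every name in it (handle_guard, guard_enter, guard_leave, guard_close, demo_close) is hypothetical, and the void * handle is only a stand-in for a zhandle_t *. The idea is to count in-flight calls and let the close run only after that count has dropped to zero.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical application-side guard; none of these names come from ZooKeeper. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  idle;       /* broadcast when in_flight drops to zero */
    int             in_flight;  /* calls currently using the handle */
    bool            closing;    /* set once a close has begun */
    void           *handle;     /* stand-in for a zhandle_t * */
} handle_guard;

void guard_init(handle_guard *g, void *handle) {
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->idle, NULL);
    g->in_flight = 0;
    g->closing = false;
    g->handle = handle;
}

/* Returns the handle if it may be used, or NULL once closing has started.
 * Every non-NULL return must be paired with one guard_leave(). */
void *guard_enter(handle_guard *g) {
    void *h = NULL;
    pthread_mutex_lock(&g->lock);
    if (!g->closing) {
        g->in_flight++;
        h = g->handle;
    }
    pthread_mutex_unlock(&g->lock);
    return h;
}

void guard_leave(handle_guard *g) {
    pthread_mutex_lock(&g->lock);
    if (--g->in_flight == 0)
        pthread_cond_broadcast(&g->idle);
    pthread_mutex_unlock(&g->lock);
}

/* Rejects new callers, waits for in-flight calls to drain, then runs
 * close_fn once. Returns true only for the caller that did the close.
 * Must not be called between guard_enter() and guard_leave() on the
 * same thread, or it would wait on itself. */
bool guard_close(handle_guard *g, void (*close_fn)(void *)) {
    pthread_mutex_lock(&g->lock);
    if (g->closing) {
        pthread_mutex_unlock(&g->lock);
        return false;
    }
    g->closing = true;
    while (g->in_flight > 0)
        pthread_cond_wait(&g->idle, &g->lock);
    void *h = g->handle;
    g->handle = NULL;
    pthread_mutex_unlock(&g->lock);
    close_fn(h);  /* in a real app this would wrap zookeeper_close() */
    return true;
}

/* Stand-in close function used only to demonstrate the guard. */
static int demo_close_calls = 0;
static void demo_close(void *h) {
    (void)h;
    demo_close_calls++;
}
```

An application would wrap each library call in guard_enter()/guard_leave() and route its shutdown path through guard_close(). If a hang still occurs with such a guard in place, the application-level race can be excluded as the cause.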
Mahadev Konar 2011-02-07, 05:38