-
hang in zookeeper_close() in the mt C client
Jeremy Stribling 2011-01-29, 19:23
Hi everyone,
I use the multithreaded ZK C client library (3.3.2), and I'm seeing my application hang, and the only thread in it that's doing anything interesting is this one:
Thread 8 (Thread 5644):
#0 0x00007f5d7bb5bbe4 in __lll_lock_wait () from /lib/libpthread.so.0
#1 0x00007f5d7bb59ad0 in pthread_cond_broadcast@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#2 0x00007f5d793628f6 in unlock_completion_list (l=0x32b4d68) at .../zookeeper/src/c/src/mt_adaptor.c:66
#3 0x00007f5d79354d4b in free_completions (zh=0x32b4c80, callCompletion=1, reason=-116) at .../zookeeper/src/c/src/zookeeper.c:1069
#4 0x00007f5d79355008 in cleanup_bufs (zh=0x32b4c80, callCompletion=1, rc=-116) at .../thirdparty/zookeeper/src/c/src/zookeeper.c:1125
#5 0x00007f5d79353200 in destroy (zh=0x32b4c80) at .../thirdparty/zookeeper/src/c/src/zookeeper.c:366
#6 0x00007f5d79358e0e in zookeeper_close (zh=0x32b4c80) at .../zookeeper/src/c/src/zookeeper.c:2326
#7 0x00007f5d79356d18 in api_epilog (zh=0x32b4c80, rc=0) at .../zookeeper/src/c/src/zookeeper.c:1661
#8 0x00007f5d79362f2f in adaptor_finish (zh=0x32b4c80) at .../zookeeper/src/c/src/mt_adaptor.c:205
#9 0x00007f5d79358c8c in zookeeper_close (zh=0x32b4c80) at .../zookeeper/src/c/src/zookeeper.c:2297
....
I've seen some threads online about a race condition associated with zookeeper_close: if your app is making a synchronous call using the zk_handle at the same time it is being closed, there could be a hang. However, my app makes no synchronous calls, and I'm 99% sure that no other thread in my app is making any concurrent call into the library ('thread apply all bt' in gdb doesn't show any other usage of the library, anyway).
Has anyone seen this before? Any leads? Thanks,
Jeremy
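Note for readers: the zookeeper_close race mentioned above (a call into the library racing with the close of the same handle) can be ruled out at the application level with a small guard around the handle. The following is only a sketch under stated assumptions, not code from the ZooKeeper client and not a diagnosis of the hang reported here. The names guarded_handle, guard_init, guard_enter, guard_leave, guard_close and demo_close are hypothetical, the handle is kept as a void * where a real application would store its ZooKeeper handle, and the close_fn callback stands in for the call to zookeeper_close. Only POSIX threads are assumed.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical application-level wrapper; not part of the ZooKeeper C API. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  idle;      /* broadcast when in_flight drops to zero */
    int             in_flight; /* callers currently using the handle */
    int             closing;   /* set once a close has begun */
    void           *handle;    /* the ZooKeeper handle in a real application */
} guarded_handle;

void guard_init(guarded_handle *g, void *handle) {
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->idle, NULL);
    g->in_flight = 0;
    g->closing = 0;
    g->handle = handle;
}

/* Call before using the handle. Returns NULL once a close has started. */
void *guard_enter(guarded_handle *g) {
    void *h = NULL;
    pthread_mutex_lock(&g->lock);
    if (!g->closing) {
        g->in_flight++;
        h = g->handle;
    }
    pthread_mutex_unlock(&g->lock);
    return h;
}

/* Call after every successful guard_enter, when done with the handle. */
void guard_leave(guarded_handle *g) {
    pthread_mutex_lock(&g->lock);
    assert(g->in_flight > 0);
    if (--g->in_flight == 0)
        pthread_cond_broadcast(&g->idle);
    pthread_mutex_unlock(&g->lock);
}

/* Refuses new users, waits for current ones, then closes exactly once.
 * Returns 1 if this call performed the close, 0 if one was already started. */
int guard_close(guarded_handle *g, void (*close_fn)(void *)) {
    void *h;
    pthread_mutex_lock(&g->lock);
    if (g->closing) {
        pthread_mutex_unlock(&g->lock);
        return 0;
    }
    g->closing = 1;
    while (g->in_flight > 0)
        pthread_cond_wait(&g->idle, &g->lock);
    h = g->handle;
    g->handle = NULL;
    pthread_mutex_unlock(&g->lock);
    close_fn(h); /* invoked outside the lock */
    return 1;
}

/* Stand-in for the real close call: counts how often it was invoked. */
void demo_close(void *h) {
    ++*(int *)h;
}
```

With this in place, every thread that talks to the library brackets its calls with guard_enter and guard_leave, and shutdown goes through guard_close, so the close cannot overlap another call and cannot run twice. A thread must not call guard_close between its own guard_enter and guard_leave, since it would then wait on itself.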
-
Re: hang in zookeeper_close() in the mt C client
Jeremy Stribling 2011-01-31, 18:26
I responded to someone off-list about this, but I just wanted to clarify to everyone that the part of the backtrace that isn't shown is entirely within my application, and zookeeper_close isn't being called from any Zookeeper completion thread.
-
Re: hang in zookeeper_close() in the mt C client
Patrick Hunt 2011-02-01, 01:05
Hi Jeremy, that is unusual. Is it reproducible? Do you have details on the stack for threads other than this thread doing the close?
It would be best if you could create a JIRA for this; the more detail you can provide the better (full stacks and any log files): https://issues.apache.org/jira/browse/ZOOKEEPER
Patrick
-
Re: hang in zookeeper_close() in the mt C client
Jeremy Stribling 2011-02-01, 02:09
I haven't been able to reproduce it, but if I do I will update again with more details and make a JIRA. I was hoping someone might just know something off the top of their head. Thanks!
-
Re: hang in zookeeper_close() in the mt C client
Patrick Hunt 2011-02-01, 18:00
Hi Jeremy. Nothing comes to mind. I searched around on JIRA a bit and nothing there pops out at me either.
I'd encourage you to create a JIRA regardless: add the details you have available now and, if you are able to reproduce it, attach additional information.
Regards,
Patrick
-
Re: hang in zookeeper_close() in the mt C client
Jeremy Stribling 2011-02-01, 20:23
Ok, done: https://issues.apache.org/jira/browse/ZOOKEEPER-981
-
Re: hang in zookeeper_close() in the mt C client
Patrick Hunt 2011-02-01, 21:24
Great. I marked it as critical given the client hangs. If someone has a chance to look at what this might be (mahadev or michi?) it would be great to get a fix in.
Patrick
-
Re: hang in zookeeper_close() in the mt C client
Michi Mutsuzaki 2011-02-01, 21:33
I'll take a look and see if I can reproduce it.
--Michi
-
Re: hang in zookeeper_close() in the mt C client
Mahadev Konar 2011-02-07, 05:38
great. thanks michi!