Zookeeper, mail # dev - about zookeeper-cli have bug or some doubt


Re: about zookeeper-cli have bug or some doubt
Michi Mutsuzaki 2012-06-09, 18:28
Ok, I'll take a look.

--Michi

On Fri, Jun 8, 2012 at 3:17 PM, Patrick Hunt <[EMAIL PROTECTED]> wrote:
> Speaking of Windows, Michi, can you take a look at why the Windows job
> has started failing of late? Perhaps an environment change? (You might
> look at other Windows jobs on that box to get an idea.)
>
> https://builds.apache.org//view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk-WinVS2008/
>
> Thanks!
>
> Patrick
>
> On Fri, Jun 8, 2012 at 10:16 AM, Michi Mutsuzaki <[EMAIL PROTECTED]> wrote:
>> I think there is a bug in the Windows port (are you on Windows?): it
>> doesn't set the recursive attribute for the to_send mutex. Please open
>> a JIRA:
>>
>> https://issues.apache.org/jira/browse/ZOOKEEPER
>>
>> Thanks!
>> --Michi
>>
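To illustrate why the recursive attribute matters for a lock that is taken again by the thread already holding it, here is a minimal sketch using POSIX threads. It is not the ZooKeeper client code or its Windows port: `demo_list_t`, `demo_list_init`, `demo_remove`, `demo_flush` and `demo_relock_result` are hypothetical names invented for this example. The outer function holds the lock while calling a helper that locks it again, which is the pattern discussed in this thread. An error-checking mutex is used for the failing case so that the nested lock reports `EDEADLK` instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a buffer list; not the real ZooKeeper struct. */
typedef struct {
    pthread_mutex_t lock;
    int count;                      /* number of queued buffers */
} demo_list_t;

/* Initialise the list with a mutex of the given type.
 * Returns 0 on success or a pthread error code. */
static int demo_list_init(demo_list_t *l, int mutex_type, int count)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, mutex_type);
    if (rc == 0)
        rc = pthread_mutex_init(&l->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc == 0)
        l->count = count;
    return rc;
}

/* Inner helper: takes the lock itself before touching the list. */
static int demo_remove(demo_list_t *l)
{
    int rc = pthread_mutex_lock(&l->lock);
    if (rc != 0)
        return rc;                  /* EDEADLK on an error-checking mutex */
    if (l->count > 0)
        l->count--;
    pthread_mutex_unlock(&l->lock);
    return 0;
}

/* Outer function: holds the lock while calling the inner helper.
 * Returns 0 if the list was drained, or the nested lock's error code. */
static int demo_flush(demo_list_t *l)
{
    int rc = pthread_mutex_lock(&l->lock);
    if (rc != 0)
        return rc;
    while (l->count > 0) {
        rc = demo_remove(l);        /* second lock by the same thread */
        if (rc != 0)
            break;
    }
    pthread_mutex_unlock(&l->lock);
    return rc;
}

/* Run the nested-lock pattern with the given mutex type. */
static int demo_relock_result(int mutex_type)
{
    demo_list_t l;
    int rc;

    /* Only types with defined relock behaviour are accepted here;
     * a PTHREAD_MUTEX_NORMAL mutex would simply hang on the second lock. */
    assert(mutex_type == PTHREAD_MUTEX_RECURSIVE ||
           mutex_type == PTHREAD_MUTEX_ERRORCHECK);

    rc = demo_list_init(&l, mutex_type, 3);
    if (rc != 0)
        return rc;
    rc = demo_flush(&l);
    pthread_mutex_destroy(&l.lock);
    return rc;
}
```

Calling `demo_relock_result(PTHREAD_MUTEX_RECURSIVE)` should drain the list and return 0, while `demo_relock_result(PTHREAD_MUTEX_ERRORCHECK)` should return `EDEADLK` from the nested lock. With a plain non-recursive mutex the same call sequence would block forever, which matches the hang described in the original report.
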
>> On Fri, Jun 8, 2012 at 1:00 AM, 乱麻的魅��� <[EMAIL PROTECTED]> wrote:
>>> Hi dev,
>>>     I am trying to use the ZooKeeper CLI (the C version) to connect to a ZooKeeper server. The connection succeeds, but I cannot send any command to ZK, such as "ls /". As soon as I send a command, the CLI deadlocks at the line lock_buffer_list(list) (line 00945, in the dequeue_buffer() function of zookeeper.c), so I tried to track down the cause.
>>>
>>>    I downloaded the ZK CLI (version 3.4.3) from http://labs.renren.com/apache-mirror/zookeeper/ , rebuilt the project, and found the bug at the same place, line 00945 in zookeeper-3.4.3.tar.gz\zookeeper-3.4.3\src\c\src\zookeeper.c. I describe the case below:
>>>
>>>  1. When the client sends a command to the ZK server, it calls one of the sending functions, such as zoo_awget, send_ping, zoo_aget, etc. All of these functions call adaptor_send_queue(zh, 0), which continues as follows.
>>>
>>>  2. adaptor_send_queue(zh, 0) calls flush_send_queue(zh, timeout):
>>>
>>>  int flush_send_queue(zhandle_t*zh, int timeout)
>>> {
>>>    int rc= ZOK;
>>>    struct timeval started;
>>> #ifdef WIN32
>>>    fd_set pollSet;
>>>    struct timeval wait;
>>> #endif
>>>    gettimeofday(&started,0);
>>>    // we can't use dequeue_buffer() here because if (non-blocking) send_buffer()
>>>    // returns EWOULDBLOCK we'd have to put the buffer back on the queue.
>>>    // we use a recursive lock instead and only dequeue the buffer if a send was
>>>    // successful
>>>    lock_buffer_list(&zh->to_send);  /* first lock of the buffer list, wfs 20120608 */
>>>    while (zh->to_send.head != 0&& zh->state == ZOO_CONNECTED_STATE) {
>>>        if(timeout!=0){
>>>            int elapsed;
>>>            struct timeval now;
>>>            gettimeofday(&now,0);
>>>            elapsed=calculate_interval(&started,&now);
>>>            if (elapsed>timeout) {
>>>                rc = ZOPERATIONTIMEOUT;
>>>                break;
>>>            }
>>> #ifdef WIN32
>>>            wait = get_timeval(timeout-elapsed);
>>>            FD_ZERO(&pollSet);
>>>            FD_SET(zh->fd, &pollSet);
>>>            // Poll the socket
>>>            rc = select((int)(zh->fd)+1, NULL,  &pollSet, NULL, &wait);
>>> #else
>>>            struct pollfd fds;
>>>            fds.fd = zh->fd;
>>>            fds.events = POLLOUT;
>>>            fds.revents = 0;
>>>            rc = poll(&fds, 1, timeout-elapsed);
>>> #endif
>>>            if (rc<=0) {
>>>                /* timed out or an error or POLLERR */
>>>                rc = rc==0 ? ZOPERATIONTIMEOUT : ZSYSTEMERROR;
>>>                break;
>>>            }
>>>        }
>>>        rc = send_buffer(zh->fd, zh->to_send.head);
>>>        if(rc==0 && timeout==0){
>>>            /* send_buffer would block while sending this buffer */
>>>            rc = ZOK;
>>>            break;
>>>        }
>>>        if (rc < 0) {
>>>            rc = ZCONNECTIONLOSS;
>>>            break;
>>>        }
>>>        // if the buffer has been sent successfully, remove it from the queue
>>>        if (rc > 0)
>>>            remove_buffer(&zh->to_send); /* this call locks the buffer list a second time while it is already locked, wfs 20120608 */
>>>
>>>        gettimeofday(&zh->last_send, 0);
>>>        rc = ZOK;
>>>    }
>>>    unlock_buffer_list(&zh->to_send);