Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Zookeeper >> mail # dev >> Possible bug - xid mismatch, out of order responses, "unexpected server response: expected 0x..., but recieved 0x..."


Copy link to this message
-
Re: Possible bug - xid mismatch, out of order responses, "unexpected server response: expected 0x..., but recieved 0x..."
I have another instance of this issue while running at DEBUG trace level.

----------------------------------------------------------------------------------------------------------------------------
2014-01-01 12:55:12,278 - WARN  [NIOWorkerThread-4:NIOServerCnxn@362][] -
Unable to read additional data from client sessionid 0x200000209eb0002,
likely client has closed socket
2014-01-01 12:55:12,278 - INFO  [NIOWorkerThread-4:NIOServerCnxn@1000][] -
Closed socket connection for client /10.11.19.2:54313 which had sessionid
0x200000209eb0002
2014-01-01 12:55:12,279 - DEBUG [NIOWorkerThread-4:NIOServerCnxn@1036][] -
ignoring exception during input shutdown
java.net.SocketException: Transport endpoint is not connected
        at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
        at
sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:658)
        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:378)
        at
org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1032)
        at
org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1005)
        at
org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:989)
        at
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:364)
        at
org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:530)
        at
org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:152)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)
2014-01-01 12:55:12,799 - INFO
 [NIOServerCxnFactory.AcceptThread:/10.11.19.2:2181
:NIOServerCnxnFactory$AcceptThread@296][] - Accepted socket connection from
/10.11.19.2:54326
2014-01-01 12:55:12,799 - DEBUG [NIOWorkerThread-1:ZooKeeperServer@761][] -
Session establishment request from client /10.11.19.2:54326 client's
lastZxid is 0x2000003b3
2014-01-01 12:55:12,800 - INFO  [NIOWorkerThread-1:ZooKeeperServer@816][] -
Client attempting to renew session 0x200000209eb0002 at /10.11.19.2:54326
2014-01-01 12:55:12,800 - INFO  [NIOWorkerThread-1:Learner@114][] -
Revalidating client: 0x200000209eb0002
2014-01-01 12:55:12,802 - INFO
 [QuorumPeer[myid=3175620633077743617]/10.11.19.2:2181:ZooKeeperServer@567][]
- Established session 0x200000209eb0002 with negotiated timeout 10000 for
client /10.11.19.2:54326

...

2014-01-01 12:55:14,291 - DEBUG
[FollowerRequestProcessor:3175620633077743617:CommitProcessor@315][] -
Processing request:: sessionid:0x200000209eb0002 type:getChildren cxid:0x1
zxid:0xfffffffffffffffe txntype:unknown reqpath:/j/l/m

2014-01-01 12:55:14,298 - DEBUG
[FollowerRequestProcessor:3175620633077743617:CommitProcessor@315][] -
Processing request:: sessionid:0x200000209eb0002 type:sync: cxid:0x2
zxid:0xfffffffffffffffe txntype:unknown reqpath:/j/k

2014-01-01 12:55:14,299 - DEBUG
[FollowerRequestProcessor:3175620633077743617:CommitProcessor@315][] -
Processing request:: sessionid:0x200000209eb0002 type:getChildren cxid:0x3
zxid:0xfffffffffffffffe txntype:unknown reqpath:/a/b/c/d/e/f/g/h.i

2014-01-01 12:55:14,300 - DEBUG
[QuorumPeer[myid=3175620633077743617]/10.11.19.2:2181:CommitProcessor@302][]
- Committing request:: sessionid:0x200000209eb0002 type:sync: cxid:0x2
zxid:0xfffffffffffffffe txntype:unknown reqpath:/j/k

2014-01-01 12:55:14,306 - DEBUG
[CommitProcWorkThread-1:FinalRequestProcessor@88][] - Processing request::
sessionid:0x200000209eb0002 type:sync: cxid:0x2 zxid:0xfffffffffffffffe
txntype:unknown reqpath:/j/k

2014-01-01 12:55:14,307 - DEBUG
[CommitProcWorkThread-1:FinalRequestProcessor@159][] -
sessionid:0x200000209eb0002 type:sync: cxid:0x2 zxid:0xfffffffffffffffe
txntype:unknown reqpath:/j/k
---------------------------------------------------------------------------------------------------------------------
Leading to, on the (remote) client:
Jan  1 04:54:49 myhost 2014-01-01 04:54:49,310 MainProcess: ERROR:pid
677:tid 140289981269760:kazoo.protocol.connection:('xids do not match,
expected %r received %r', 1, 2)#012Traceback (most recent call last):#012
 File "/usr/lib/python2.7/dist-packages/kazoo/protocol/connection.py", line
512, in _connect_loop#012    response = self._read_socket(read_timeout)#012
 File "/usr/lib/python2.7/dist-packages/kazoo/protocol/connection.py", line
397, in _read_socket#012    return self._read_response(header, buffer,
offset)#012  File
"/usr/lib/python2.7/dist-packages/kazoo/protocol/connection.py", line 322,
in _read_response#012    'received %r', xid, header.xid)#012RuntimeError:
('xids do not match, expected %r received %r', 1, 2)

I'm confused as to how the sync (cxid==0x2) can apparently hit the final
request processor before get_children (cxid==0x1) does.  If anyone has any
insight (or guesses) into this problem I'd be very curious to hear your
thoughts.   I'd like to push this through to a patch, if appropriate, but I
don't see the root cause here.

Thanks,
On Tue, Dec 10, 2013 at 10:04 AM, Dutch Meyer <[EMAIL PROTECTED]> wrote:
-=-Dutch