|
|
-
Testing expired sessions bug?
Ben Bangert 2012-08-21, 20:00
The Zookeeper FAQ documents the following to create a session expiration for testing:
"In the case of testing we want to cause a problem, so to explicitly expire a session an application connects to ZooKeeper, saves the session id and password, creates another ZooKeeper handle with that id and password, and then closes the new handle. Since both handles reference the same session, the close on second handle will invalidate the session causing a SESSION_EXPIRED on the first handle." This apparently works.... some of the time. Our tests show that it seems to work as this describes about 95% of the time. The other times, the new client being connected with the existing client id immediately gets an EXPIRED_SESSION_STATE upon connecting. The existing client does *not* get the session expiration as the docs seem to think, its session is fine. This is a bit odd as the client attempting to re-use the session ID/password is told the session is expired, while the existing connected client is fine (seemingly invalidating the claim that the session is expired).
This appears in a 3-server setup when having the new client go to a separate server in the cluster from the first one, and when they're both on the same Zookeeper server.
When they go to the same server, I get rather confusing information by watching the clients global watcher debug info:
ZooKeeper: INFO: log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.3 ZooKeeper: INFO: log_env@716: Client environment:host.name=Lars.local ZooKeeper: INFO: log_env@723: Client environment:os.name=Darwin ZooKeeper: INFO: log_env@724: Client environment:os.arch=12.0.0 ZooKeeper: INFO: log_env@725: Client environment:os.version=Darwin Kernel Version 12.0.0: Sun Jun 24 23:00:16 PDT 2012; root:xnu-2050.7.9~1/RELEASE_X86_64 ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20000/kazootests6859289a29b041e9bf7c4f69afde7e6c sessionTimeout=10000 watcher=0x1016c2420 sessionId=0 sessionPasswd=<null> context=0x102d12f00 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: INFO: check_events@1748: session establishment complete on server [127.0.0.1:20000], sessionId=0x1393db197700bff, negotiated timeout=10000 kazoo.client: DEBUG: Client Instance: 4319500112, Handle: 2, Type: SESSION_EVENT, State: CONNECTED_STATE, Path: ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20000 sessionTimeout=10000 watcher=0x1016c2420 sessionId=0x1393db197700bff sessionPasswd=<hidden> context=0x102d14050 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: ERROR: handle_socket_error_msg@1719: Socket [127.0.0.1:20000] zk retcode=-4, errno=64(Host is down): failed while receiving a server response ZooKeeper: ERROR: handle_socket_error_msg@1737: Socket [127.0.0.1:20000] zk retcode=-112, errno=70(Stale NFS file handle): sessionId=0x1393db197700bff has expired. kazoo.client: DEBUG: Client Instance: 4319532240, Handle: 3, Type: SESSION_EVENT, State: EXPIRED_SESSION_STATE, Path: kazoo.client: DEBUG: Client Instance: 4319500112, Handle: 2, Type: SESSION_EVENT, State: CONNECTING_STATE, Path: ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20000 sessionTimeout=10000 watcher=0x1016c2420 sessionId=0 sessionPasswd=<null> context=0x102d08cc0 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: INFO: check_events@1748: session establishment complete on server [127.0.0.1:20000], sessionId=0x1393db197700c00, negotiated timeout=10000 kazoo.client: DEBUG: Client Instance: 4319532240, Handle: 4, Type: SESSION_EVENT, State: CONNECTED_STATE, Path: ZooKeeper: INFO: zookeeper_close@2506: Closing zookeeper sessionId=0x1393db197700c00 to [127.0.0.1:20000] ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: INFO: check_events@1748: session establishment complete on server [127.0.0.1:20000], sessionId=0x1393db197700bff, negotiated timeout=10000 kazoo.client: DEBUG: Client Instance: 4319500112, Handle: 2, Type: SESSION_EVENT, State: CONNECTED_STATE, Path:
The second client gets an expired session state upon connecting, the first client gets disconnect (but NOT expired). The debug clearly indicates the server just said that sessionId 0x1393db197700bff has expired. Yet the second line from the bottom, we can see the first client successfully reconnect with its session that supposedly is expired.
When the second client uses a different server, it just gets kicked off and the first client never gets disconnected:
ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20000/kazootests0209e8a645674d4d865d22960fb579bb sessionTimeout=10000 watcher=0x1016c2420 sessionId=0 sessionPasswd=<null> context=0x10189bf10 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: INFO: check_events@1748: session establishment complete on server [127.0.0.1:20000], sessionId=0x1393db197700cc7, negotiated timeout=10000 kazoo.client: DEBUG: Client Instance: 4319500688, Handle: 2, Type: SESSION_EVENT, State: CONNECTED_STATE, Path: ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20010 sessionTimeout=10000 watcher=0x1016c2420 sessionId=0x1393db197700cc7 sessionPasswd=<hidden> context=0x101835030 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20010] ZooKeeper: ERROR: handle_socket_error_msg@1737: Socket [127.0.0.1:20010] zk retcode=-112, errno=70(Stale NFS file handle): sessionId=0x1393db197700cc7 has expired. kazoo.client: DEBUG: Client Instance: 4319532240, Handle: 3, Type: SESSION_EVENT, State: EXPIRED_SESSION_STATE, Path: ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20010 sessionTimeout=10000 watcher=0x1016c2420 sessionId=0 sessionPass
-
Re: Testing expired sessions bug?
Jordan Zimmerman 2012-08-21, 20:02
FYI - Curator has a utility for expiring a session: https://github.com/Netflix/curator/blob/master/curator-test/src/main/java/com/netflix/curator/test/KillSession.javaOn Aug 21, 2012, at 1:00 PM, Ben Bangert <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote: The Zookeeper FAQ documents the following to create a session expiration for testing: "In the case of testing we want to cause a problem, so to explicitly expire a session an application connects to ZooKeeper, saves the session id and password, creates another ZooKeeper handle with that id and password, and then closes the new handle. Since both handles reference the same session, the close on second handle will invalidate the session causing a SESSION_EXPIRED on the first handle." This apparently works.... some of the time. Our tests show that it seems to work as this describes about 95% of the time. The other times, the new client being connected with the existing client id immediately gets an EXPIRED_SESSION_STATE upon connecting. The existing client does *not* get the session expiration as the docs seem to think, its session is fine. This is a bit odd as the client attempting to re-use the session ID/password is told the session is expired, while the existing connected client is fine (seemingly invalidating the claim that the session is expired). This appears in a 3-server setup when having the new client go to a separate server in the cluster from the first one, and when they're both on the same Zookeeper server. When they go to the same server, I get rather confusing information by watching the clients global watcher debug info: ZooKeeper: INFO: log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.3 ZooKeeper: INFO: log_env@716: Client environment:host.name=Lars.local ZooKeeper: INFO: log_env@723: Client environment:os.name=Darwin ZooKeeper: INFO: log_env@724: Client environment:os.arch=12.0.0 ZooKeeper: INFO: log_env@725: Client environment:os.version=Darwin Kernel Version 12.0.0: Sun Jun 24 23:00:16 PDT 2012; root:xnu-2050.7.9~1/RELEASE_X86_64 ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20000/kazootests6859289a29b041e9bf7c4f69afde7e6c sessionTimeout=10000 watcher=0x1016c2420 sessionId=0 sessionPasswd=<null> context=0x102d12f00 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: INFO: check_events@1748: session establishment complete on server [127.0.0.1:20000], sessionId=0x1393db197700bff, negotiated timeout=10000 kazoo.client: DEBUG: Client Instance: 4319500112, Handle: 2, Type: SESSION_EVENT, State: CONNECTED_STATE, Path: ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20000 sessionTimeout=10000 watcher=0x1016c2420 sessionId=0x1393db197700bff sessionPasswd=<hidden> context=0x102d14050 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: ERROR: handle_socket_error_msg@1719: Socket [127.0.0.1:20000] zk retcode=-4, errno=64(Host is down): failed while receiving a server response ZooKeeper: ERROR: handle_socket_error_msg@1737: Socket [127.0.0.1:20000] zk retcode=-112, errno=70(Stale NFS file handle): sessionId=0x1393db197700bff has expired. kazoo.client: DEBUG: Client Instance: 4319532240, Handle: 3, Type: SESSION_EVENT, State: EXPIRED_SESSION_STATE, Path: kazoo.client: DEBUG: Client Instance: 4319500112, Handle: 2, Type: SESSION_EVENT, State: CONNECTING_STATE, Path: ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20000 sessionTimeout=10000 watcher=0x1016c2420 sessionId=0 sessionPasswd=<null> context=0x102d08cc0 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: INFO: check_events@1748: session establishment complete on server [127.0.0.1:20000], sessionId=0x1393db197700c00, negotiated timeout=10000 kazoo.client: DEBUG: Client Instance: 4319532240, Handle: 4, Type: SESSION_EVENT, State: CONNECTED_STATE, Path: ZooKeeper: INFO: zookeeper_close@2506: Closing zookeeper sessionId=0x1393db197700c00 to [127.0.0.1:20000] ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: INFO: check_events@1748: session establishment complete on server [127.0.0.1:20000], sessionId=0x1393db197700bff, negotiated timeout=10000 kazoo.client: DEBUG: Client Instance: 4319500112, Handle: 2, Type: SESSION_EVENT, State: CONNECTED_STATE, Path: The second client gets an expired session state upon connecting, the first client gets disconnect (but NOT expired). The debug clearly indicates the server just said that sessionId 0x1393db197700bff has expired. Yet the second line from the bottom, we can see the first client successfully reconnect with its session that supposedly is expired. When the second client uses a different server, it just gets kicked off and the first client never gets disconnected: ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20000/kazootests0209e8a645674d4d865d22960fb579bb sessionTimeout=10000 watcher=0x1016c2420 sessionId=0 sessionPasswd=<null> context=0x10189bf10 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20000] ZooKeeper: INFO: check_events@1748: session establishment complete on server [127.0.0.1:20000], sessionId=0x1393db197700cc7, negotiated timeout=10000 kazoo.client: DEBUG: Client Instance: 4319500688, Handle: 2, Type: SESSION_EVENT, State: CONNECTED_STATE, Path: ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host=127.0.0.1:20010 sessionTimeout=10000 watcher=0x1016c2420 sessionId=0x1393db197700cc7 sessionPasswd=<hidden> context=0x101835030 flags=0 ZooKeeper: INFO: check_events@1701: initiated connection to server [127.0.0.1:20010] ZooKeeper: ERROR: handle_socket_error_msg@1737: Socket [127.0.0.1:20010] zk retcode=-112, errno=70(Stale NFS file handle): sessionId=0x1393db197700cc7 has expired. kazoo.client: DE
-
Re: Testing expired sessions bug?
Ben Bangert 2012-08-21, 20:13
On Aug 21, 2012, at 1:02 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote: > FYI - Curator has a utility for expiring a session: > > https://github.com/Netflix/curator/blob/master/curator-test/src/main/java/com/netflix/curator/test/KillSession.javaSo the question is, given that it apparently operates identically to the kazoo test, if you run it repeatedly a few hundred times, will you get this same error? Kapil verified this same error occurs using txzookeeper, so its in two Python clients. If it never happens for Curator, perhaps its a bug in the C binding. Cheers, Ben
-
Re: Testing expired sessions bug?
Jordan Zimmerman 2012-08-21, 20:40
Curator's unit tests don't usually fail due to KillSession not working AFAIK. On Aug 21, 2012, at 1:13 PM, Ben Bangert <[EMAIL PROTECTED]> wrote: > On Aug 21, 2012, at 1:02 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote: > >> FYI - Curator has a utility for expiring a session: >> >> https://github.com/Netflix/curator/blob/master/curator-test/src/main/java/com/netflix/curator/test/KillSession.java> > So the question is, given that it apparently operates identically to the kazoo test, if you run it repeatedly a few hundred times, will you get this same error? > > Kapil verified this same error occurs using txzookeeper, so its in two Python clients. If it never happens for Curator, perhaps its a bug in the C binding. > > Cheers, > Ben
-
Re: Testing expired sessions bug?
Jordan Zimmerman 2012-08-21, 20:41
Note, though, that Curator does more than what the FAQ describes for killing the session. On Aug 21, 2012, at 1:13 PM, Ben Bangert <[EMAIL PROTECTED]> wrote: > On Aug 21, 2012, at 1:02 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote: > >> FYI - Curator has a utility for expiring a session: >> >> https://github.com/Netflix/curator/blob/master/curator-test/src/main/java/com/netflix/curator/test/KillSession.java> > So the question is, given that it apparently operates identically to the kazoo test, if you run it repeatedly a few hundred times, will you get this same error? > > Kapil verified this same error occurs using txzookeeper, so its in two Python clients. If it never happens for Curator, perhaps its a bug in the C binding. > > Cheers, > Ben
-
Re: Testing expired sessions bug?
Ben Bangert 2012-08-21, 21:30
On Aug 21, 2012, at 1:41 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
> Note, though, that Curator does more than what the FAQ describes for killing the session.
Ah, perhaps I'm misreading the Java code. It looks like it operates in the FAQ manner except it adds a watch for a non-existent node so that it can register a watcher for when it gets the next event... which is *assumed* to be a session being lost. The sessionLostWatch will get triggered for any session event, including a connection loss. I don't see anywhere that the original client is watched specifically to see that it has gotten a EXPIRED_SESSION_STATE event.
In the error I described originally, the first client *will* get its connection dropped, but will not actually be expired. If I was only watching for any event, and not specifically an expired session state, then my tests would appear to work.
I mainly program in Python, so its likely I'm missing something in how the Java ZK client or Curator works. Can you explain the technique here a little more?
Cheers, Ben
-
Re: Testing expired sessions bug?
Jordan Zimmerman 2012-08-21, 21:43
Yes, it watches the bogus node but it also blocks until the second ZK instance successfully connects. I found that both of these things are necessary for reliable session killing.
-JZ
On Aug 21, 2012, at 2:30 PM, Ben Bangert <[EMAIL PROTECTED]> wrote:
> Ah, perhaps I'm misreading the Java code. It looks like it operates in the FAQ manner except it adds a watch for a non-existent node so that it can register a watcher for when it gets the next event... which is *assumed* to be a session being lost.
-
Re: Testing expired sessions bug?
Ben Bangert 2012-08-21, 22:18
On Aug 21, 2012, at 2:43 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
> Yes, it watches the bogus node but it also blocks until the second ZK instance successfully connects. I found that both of these things are necessary for reliable session killing.
Unfortunately that doesn't seem to help in Python. I've set the second ZK instance to wait up to 30 seconds for a connection, and it never gets anything other than a EXPIRED_SESSION_STATE during its connection attempts. The second ZK instance is completely unable to connect using the sessionid/password of the first client. The first client never gets booted during these attempts, thus my original note on a possible bug either in: 1. Zookeeper 2. Zookeeper C binding 3. Zookeeper Python C binding
If you can run your session killing code in a repeating test hundreds of times without the second ZK instance ever timing out on its connection attempts, it seems its probably a bug in 2 or 3.
Cheers, Ben
-
Re: Testing expired sessions bug?
Michi Mutsuzaki 2012-08-21, 22:39
Hi Ben,
The C binding currently does not support creating multiple connections with the same session id. Sorry this was not stated in the documentation. Please open a jira for this, and patches are always welcome :)
--Michi
On Tue, Aug 21, 2012 at 3:18 PM, Ben Bangert <[EMAIL PROTECTED]> wrote: > On Aug 21, 2012, at 2:43 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote: > >> Yes, it watches the bogus node but it also blocks until the second ZK instance successfully connects. I found that both of these things are necessary for reliable session killing. > > Unfortunately that doesn't seem to help in Python. I've set the second ZK instance to wait up to 30 seconds for a connection, and it never gets anything other than a EXPIRED_SESSION_STATE during its connection attempts. The second ZK instance is completely unable to connect using the sessionid/password of the first client. The first client never gets booted during these attempts, thus my original note on a possible bug either in: > 1. Zookeeper > 2. Zookeeper C binding > 3. Zookeeper Python C binding > > If you can run your session killing code in a repeating test hundreds of times without the second ZK instance ever timing out on its connection attempts, it seems its probably a bug in 2 or 3. > > Cheers, > Ben
-
Re: Testing expired sessions bug?
Ben Bangert 2012-08-21, 23:08
On Aug 21, 2012, at 3:39 PM, Michi Mutsuzaki <[EMAIL PROTECTED]> wrote:
> The C binding currently does not support creating multiple connections > with the same session id. Sorry this was not stated in the > documentation. Please open a jira for this, and patches are always > welcome :)
Ah! That explains a lot. I assume if I spawn a separate process entirely that loads the existing session id then it'll be ok?
- Ben
-
Re: Testing expired sessions bug?
Michi Mutsuzaki 2012-08-21, 23:31
On Tue, Aug 21, 2012 at 4:08 PM, Ben Bangert <[EMAIL PROTECTED]> wrote: > On Aug 21, 2012, at 3:39 PM, Michi Mutsuzaki <[EMAIL PROTECTED]> wrote: > >> The C binding currently does not support creating multiple connections >> with the same session id. Sorry this was not stated in the >> documentation. Please open a jira for this, and patches are always >> welcome :) > > Ah! That explains a lot. I assume if I spawn a separate process entirely that loads the existing session id then it'll be ok?
That might work, but I couldn't say for sure since I haven't tried that myself. Please give it a shot and let us know!
--Michi
-
Re: Testing expired sessions bug?
Ben Bangert 2012-08-22, 00:56
On Aug 21, 2012, at 4:31 PM, Michi Mutsuzaki <[EMAIL PROTECTED]> wrote:
> That might work, but I couldn't say for sure since I haven't tried > that myself. Please give it a shot and let us know!
So the C binding doesn't work when loading a session ID if another process on the same computer is already using that? How come the tests pass most of the time?
It'd help a lot for us to look into patching this if we had some overview of why the C binding is managing state in an apparently global way such that two clients with the same session ID can't connect from the same computer... some of the time.
I switched my unit tests to spinning up a separate process that connects and closes the file handle in a separate process, which dropped how often this occurs a little more, but not completely.
Cheers, Ben
-
Re: Testing expired sessions bug?
Hanno Schlichting 2012-08-22, 18:54
On Wed, Aug 22, 2012 at 2:56 AM, Ben Bangert <[EMAIL PROTECTED]> wrote: > So the C binding doesn't work when loading a session ID if another process on the same computer is already using that? How come the tests pass most of the time?
As a bit more info, here's the trace log of such a failed session reuse using ZK 3.4.3 with a single server:
2012-08-22 20:20:38,548 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:20000:NIOServerCnxn$Factory@251][] - Accepted socket connection from /127.0.0.1:61933 2012-08-22 20:20:38,548 [myid:] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:20000:NIOServerCnxn@735][] - Session establishment request from client /127.0.0.1:61933 client's lastZxid is 0x0 2012-08-22 20:20:38,549 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:20000:NIOServerCnxn@770][] - Client attempting to renew session 0x1394f908ca00000 at /127.0.0.1:61933 2012-08-22 20:20:38,549 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:20000:NIOServerCnxn@1435][] - Closed socket connection for client /127.0.0.1:61932 which had sessionid 0x1394f908ca00000 2012-08-22 20:20:38,552 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:20000:NIOServerCnxn@1573][] - Invalid session 0x1394f908ca00000 for client /127.0.0.1:61933, probably expired 2012-08-22 20:20:38,552 [myid:] - TRACE [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:20000:NIOServerCnxn@435][] - Add a buffer to outgoingBuffers, sk sun.nio.ch.SelectionKeyImpl@6a5f6303 is valid: true 2012-08-22 20:20:38,554 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:20000:NIOServerCnxn@1435][] - Closed socket connection for client /127.0.0.1:61933 which had sessionid 0x1394f908ca00000 2012-08-22 20:20:38,555 [myid:] - DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:20000:NIOServerCnxn@1459][] - ignoring exception during input shutdown java.net.SocketException: Socket is not connected at sun.nio.ch.SocketChannelImpl.shutdown(Native Method) at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:633) at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360) at org.apache.zookeeper.server.NIOServerCnxn.closeSock(NIOServerCnxn.java:1455) at org.apache.zookeeper.server.NIOServerCnxn.close(NIOServerCnxn.java:1412) at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:632) at org.apache.zookeeper.server.NIOServerCnxn$Factory.run(NIOServerCnxn.java:262)
Note how there's no ZooTrace entries in here and no "Session 0x... is valid..." message as logged in ZooKeeperServer.revalidateSession. Reading the server code it looks like the checkPasswd check fails in reopenSession. But I have no idea why that would suddenly happen in some of the cases.
Hanno
-
Re: Testing expired sessions bug?
Hanno Schlichting 2012-08-22, 20:09
On Wed, Aug 22, 2012 at 8:54 PM, Hanno Schlichting <[EMAIL PROTECTED]> wrote: > Note how there's no ZooTrace entries in here and no "Session 0x... is > valid..." message as logged in ZooKeeperServer.revalidateSession. > Reading the server code it looks like the checkPasswd check fails in > reopenSession. But I have no idea why that would suddenly happen in > some of the cases.
Apparently it's the session password as memorized in the C binding.
This is the output of doing essentially:
handle = zookeeper.init(...) print 'Client id %s' % repr(zookeeper.client_id(handle))
Run number: 9 test_bad_session_expire (kazoo.tests.test_client.TestConnection) ... Client id: (0L, '') Client id: (88189511064027136L, '\x7fIFM\xa0\x96\x9e\xbc\xf5\xadZ\x0e\xe9wd\xfe')
Run number: 10 test_bad_session_expire (kazoo.tests.test_client.TestConnection) ... Client id: (0L, '') Client id: (88189511508623360L, '\xff')
In all good cases we see proper passwords of 16 bytes. In the failure case we just get '\xff'.
Hanno
-
Re: Testing expired sessions bug?
Marshall McMullen 2012-08-22, 21:59
There is also a proposed patch to zookeeper ( https://issues.apache.org/jira/browse/ZOOKEEPER-829) that creates a znode at /zookeeper/sessions that you could use to test session loss and recovery. On Tue, Aug 21, 2012 at 2:00 PM, Ben Bangert <[EMAIL PROTECTED]> wrote: > The Zookeeper FAQ documents the following to create a session expiration > for testing: > > "In the case of testing we want to cause a problem, so to explicitly > expire a session an application connects to ZooKeeper, saves the session id > and password, creates another ZooKeeper handle with that id and password, > and then closes the new handle. Since both handles reference the same > session, the close on second handle will invalidate the session causing a > SESSION_EXPIRED on the first handle." > > > This apparently works.... some of the time. Our tests show that it seems > to work as this describes about 95% of the time. The other times, the new > client being connected with the existing client id immediately gets an > EXPIRED_SESSION_STATE upon connecting. The existing client does *not* get > the session expiration as the docs seem to think, its session is fine. This > is a bit odd as the client attempting to re-use the session ID/password is > told the session is expired, while the existing connected client is fine > (seemingly invalidating the claim that the session is expired). > > This appears in a 3-server setup when having the new client go to a > separate server in the cluster from the first one, and when they're both on > the same Zookeeper server. > > When they go to the same server, I get rather confusing information by > watching the clients global watcher debug info: > > ZooKeeper: INFO: log_env@712: Client > environment:zookeeper.version=zookeeper C client 3.4.3 > ZooKeeper: INFO: log_env@716: Client environment:host.name=Lars.local > ZooKeeper: INFO: log_env@723: Client environment:os.name=Darwin > ZooKeeper: INFO: log_env@724: Client environment:os.arch=12.0.0 > ZooKeeper: INFO: log_env@725: Client environment:os.version=Darwin Kernel > Version 12.0.0: Sun Jun 24 23:00:16 PDT 2012; > root:xnu-2050.7.9~1/RELEASE_X86_64 > ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host> 127.0.0.1:20000/kazootests6859289a29b041e9bf7c4f69afde7e6csessionTimeout=10000 watcher=0x1016c2420 sessionId=0 sessionPasswd=<null> > context=0x102d12f00 flags=0 > ZooKeeper: INFO: check_events@1701: initiated connection to server [ > 127.0.0.1:20000] > ZooKeeper: INFO: check_events@1748: session establishment complete on > server [127.0.0.1:20000], sessionId=0x1393db197700bff, negotiated > timeout=10000 > kazoo.client: DEBUG: Client Instance: 4319500112, Handle: 2, Type: > SESSION_EVENT, State: CONNECTED_STATE, Path: > ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host> 127.0.0.1:20000 sessionTimeout=10000 watcher=0x1016c2420 > sessionId=0x1393db197700bff sessionPasswd=<hidden> context=0x102d14050 > flags=0 > ZooKeeper: INFO: check_events@1701: initiated connection to server [ > 127.0.0.1:20000] > ZooKeeper: ERROR: handle_socket_error_msg@1719: Socket [127.0.0.1:20000] > zk retcode=-4, errno=64(Host is down): failed while receiving a server > response > ZooKeeper: ERROR: handle_socket_error_msg@1737: Socket [127.0.0.1:20000] > zk retcode=-112, errno=70(Stale NFS file handle): > sessionId=0x1393db197700bff has expired. > kazoo.client: DEBUG: Client Instance: 4319532240, Handle: 3, Type: > SESSION_EVENT, State: EXPIRED_SESSION_STATE, Path: > kazoo.client: DEBUG: Client Instance: 4319500112, Handle: 2, Type: > SESSION_EVENT, State: CONNECTING_STATE, Path: > ZooKeeper: INFO: zookeeper_init@786: Initiating client connection, host> 127.0.0.1:20000 sessionTimeout=10000 watcher=0x1016c2420 sessionId=0 > sessionPasswd=<null> context=0x102d08cc0 flags=0 > ZooKeeper: INFO: check_events@1701: initiated connection to server [ > 127.0.0.1:20000] > ZooKeeper: INFO: check_events@1748: session establishment complete on > server [127.0.0.1:20000], sessionId=0x1393db197700c00, negotiated
|
|