Zookeeper >> mail # dev >> Re: [jira] [Commented] (ZOOKEEPER-1367) Data inconsistencies and unexpired ephemeral nodes after cluster restart


Re: [jira] [Commented] (ZOOKEEPER-1367) Data inconsistencies and unexpired ephemeral nodes after cluster restart
I debugged this on the plane. I think it is due to the transaction log not
saving the create-session entry. Will give more details later.

C

From my phone
On Jan 27, 2012 2:58 PM, "Patrick Hunt (Commented) (JIRA)" <[EMAIL PROTECTED]>
wrote:

>
>    [
> https://issues.apache.org/jira/browse/ZOOKEEPER-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195053#comment-13195053]
>
> Patrick Hunt commented on ZOOKEEPER-1367:
> -----------------------------------------
>
> Jeremy - can the cleanup scripts be modified to gzip the old logs rather
> than rm'ing them? That might allow you to provide us with the full logs for
> reproducing the issue.
>
> > Data inconsistencies and unexpired ephemeral nodes after cluster restart
> > ------------------------------------------------------------------------
> >
> >                 Key: ZOOKEEPER-1367
> >                 URL:
> https://issues.apache.org/jira/browse/ZOOKEEPER-1367
> >             Project: ZooKeeper
> >          Issue Type: Bug
> >          Components: server
> >    Affects Versions: 3.4.2
> >         Environment: Debian Squeeze, 64-bit
> >            Reporter: Jeremy Stribling
> >            Priority: Blocker
> >             Fix For: 3.4.3
> >
> >         Attachments: ZOOKEEPER-1367.tgz
> >
> >
> > In one of our tests, we have a cluster of three ZooKeeper servers.  We
> kill all three, and then restart just two of them.  Sometimes we notice
> that on one of the restarted servers, ephemeral nodes from previous
> sessions do not get deleted, while on the other server they do.  We are
> effectively running 3.4.2, though technically we are running 3.4.1 with the
> patch manually applied for ZOOKEEPER-1333 and a C client for 3.4.1 with the
> patches for ZOOKEEPER-1163.
> > I noticed that when I connected using zkCli.sh to the first node
> (90.0.0.221, zkid 84), I saw only one znode in a particular path:
> > {quote}
> > [zk: 90.0.0.221:2888(CONNECTED) 0] ls /election/zkrsm
> > [nominee0000000011]
> > [zk: 90.0.0.221:2888(CONNECTED) 1] get /election/zkrsm/nominee0000000011
> > 90.0.0.222:7777
> > cZxid = 0x400000027
> > ctime = Thu Jan 19 08:18:24 UTC 2012
> > mZxid = 0x400000027
> > mtime = Thu Jan 19 08:18:24 UTC 2012
> > pZxid = 0x400000027
> > cversion = 0
> > dataVersion = 0
> > aclVersion = 0
> > ephemeralOwner = 0xa234f4f3bc220001
> > dataLength = 16
> > numChildren = 0
> > {quote}
> > However, when I connected zkCli.sh to the second server (90.0.0.222, zkid
> 251), I saw three znodes under that same path:
> > {quote}
> > [zk: 90.0.0.222:2888(CONNECTED) 2] ls /election/zkrsm
> > nominee0000000006   nominee0000000010   nominee0000000011
> > [zk: 90.0.0.222:2888(CONNECTED) 2] get /election/zkrsm/nominee0000000011
> > 90.0.0.222:7777
> > cZxid = 0x400000027
> > ctime = Thu Jan 19 08:18:24 UTC 2012
> > mZxid = 0x400000027
> > mtime = Thu Jan 19 08:18:24 UTC 2012
> > pZxid = 0x400000027
> > cversion = 0
> > dataVersion = 0
> > aclVersion = 0
> > ephemeralOwner = 0xa234f4f3bc220001
> > dataLength = 16
> > numChildren = 0
> > [zk: 90.0.0.222:2888(CONNECTED) 3] get /election/zkrsm/nominee0000000010
> > 90.0.0.221:7777
> > cZxid = 0x30000014c
> > ctime = Thu Jan 19 07:53:42 UTC 2012
> > mZxid = 0x30000014c
> > mtime = Thu Jan 19 07:53:42 UTC 2012
> > pZxid = 0x30000014c
> > cversion = 0
> > dataVersion = 0
> > aclVersion = 0
> > ephemeralOwner = 0xa234f4f3bc220000
> > dataLength = 16
> > numChildren = 0
> > [zk: 90.0.0.222:2888(CONNECTED) 4] get /election/zkrsm/nominee0000000006
> > 90.0.0.223:7777
> > cZxid = 0x200000cab
> > ctime = Thu Jan 19 08:00:30 UTC 2012
> > mZxid = 0x200000cab
> > mtime = Thu Jan 19 08:00:30 UTC 2012
> > pZxid = 0x200000cab
> > cversion = 0
> > dataVersion = 0
> > aclVersion = 0
> > ephemeralOwner = 0x5434f5074e040002
> > dataLength = 16
> > numChildren = 0
> > {quote}
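
One way to read the stat output quoted above: a ZooKeeper zxid is a 64-bit number whose high 32 bits are the leader epoch and whose low 32 bits are a counter within that epoch. The sketch below only decodes the cZxid values copied from the listing. The helper names (split_zxid, epochs_by_node, czxids) are made up for illustration and are not part of ZooKeeper or the original report.

```python
def split_zxid(zxid):
    """Split a 64-bit zxid into (epoch, counter).

    High 32 bits: leader epoch. Low 32 bits: counter within that epoch.
    """
    epoch = zxid >> 32
    counter = zxid & 0xFFFFFFFF
    return epoch, counter


# cZxid values copied from the zkCli.sh output on 90.0.0.222.
# The dict name and layout are illustrative assumptions.
czxids = {
    "nominee0000000006": 0x200000cab,
    "nominee0000000010": 0x30000014c,
    "nominee0000000011": 0x400000027,
}


def epochs_by_node(zxids):
    """Map each znode name to the epoch in which it was created."""
    return {name: split_zxid(z)[0] for name, z in zxids.items()}


if __name__ == "__main__":
    for name, zxid in sorted(czxids.items()):
        epoch, counter = split_zxid(zxid)
        print(f"{name}: epoch={epoch} counter={counter}")
```

Decoded this way, the two znodes that appear only on the second server were created in epochs 2 and 3, while the znode visible on both servers was created in epoch 4. That is just a reading of the numbers in the report, not a diagnosis.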
> > These never went away for the lifetime of the server, for any clients
> connected directly to that server.  Note that this cluster is configured to
> have all three servers still, the third one being down (90.0.0.223, zkid