Zookeeper >> mail # user >> puzzling BadVersionException


Re: puzzling BadVersionException
Ishaaq,
 Two ZK servers is definitely not the right number for running a ZK
service, but that is no reason to get a BadVersionException. For more
information on sizing a ZK ensemble, take a look at:

http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html
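To make the ensemble-size point concrete, here is a small sketch of the majority arithmetic. It assumes only that a ZK ensemble needs a strict majority of its servers to be up; the class and method names are made up for illustration and are not part of ZooKeeper.

```java
// Illustrative only: quorum arithmetic for an ensemble that needs a strict majority.
final class EnsembleSizing {
    // Smallest number of servers that forms a strict majority.
    static int majority(int servers) {
        return servers / 2 + 1;
    }

    // How many servers can be lost while a majority is still available.
    static int toleratedFailures(int servers) {
        return servers - majority(servers);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.printf("servers=%d majority=%d toleratedFailures=%d%n",
                    n, majority(n), toleratedFailures(n));
        }
    }
}
```

With 2 servers the majority is 2, so losing either one stops the service. That is no more fault tolerant than a single server, which is why 3 is the usual minimum.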

As for the version on the znode, can you try reading the znode's
current version when setData throws the BadVersionException?

Also, is there any chance that something deletes the znode and then
creates it again at the same path?
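To show why a delete followed by a create would produce exactly this error, here is a rough sketch. It is a simplified in-memory model, not ZooKeeper code: the class, its methods, and the "/job" path are all hypothetical. The only behaviour it borrows from ZooKeeper is that a new znode starts at data version 0, each successful setData adds 1, and a recreated znode gets a new creation zxid.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified in-memory model of znode versioning; NOT ZooKeeper code.
final class ZnodeVersionModel {
    static final class BadVersion extends Exception {
        BadVersion(int expected, int actual) {
            super("expected version " + expected + " but node is at " + actual);
        }
    }

    private static final class Node {
        byte[] data;
        int version;  // data version: 0 on create, +1 per successful setData
        long czxid;   // id of the transaction that created this node
    }

    private final Map<String, Node> nodes = new HashMap<>();
    private long nextZxid = 1;

    void create(String path, byte[] data) {
        if (nodes.containsKey(path)) throw new IllegalStateException("node exists: " + path);
        Node n = new Node();
        n.data = data;
        n.version = 0;
        n.czxid = nextZxid++;
        nodes.put(path, n);
    }

    void delete(String path) {
        if (nodes.remove(path) == null) throw new IllegalStateException("no node: " + path);
        nextZxid++;
    }

    // Conditional update: succeeds only if the caller's version matches the node's.
    void setData(String path, byte[] data, int expectedVersion) throws BadVersion {
        Node n = node(path);
        if (n.version != expectedVersion) throw new BadVersion(expectedVersion, n.version);
        n.data = data;
        n.version++;
        nextZxid++;
    }

    int version(String path) { return node(path).version; }

    long czxid(String path) { return node(path).czxid; }

    private Node node(String path) {
        Node n = nodes.get(path);
        if (n == null) throw new IllegalStateException("no node: " + path);
        return n;
    }

    // True when a delete + create between the read and the write makes the
    // remembered version stale and changes the creation zxid.
    static boolean recreateInvalidatesVersion() {
        ZnodeVersionModel model = new ZnodeVersionModel();
        byte[] value = {1};
        model.create("/job", value);          // version 0
        try {
            model.setData("/job", value, 0);  // version 1
            model.setData("/job", value, 1);  // version 2
        } catch (BadVersion e) {
            throw new AssertionError(e);
        }
        int seenVersion = model.version("/job");
        long seenCzxid = model.czxid("/job");
        model.delete("/job");
        model.create("/job", value);          // version is back to 0
        try {
            model.setData("/job", value, seenVersion);
            return false;
        } catch (BadVersion e) {
            return model.version("/job") == 0 && model.czxid("/job") != seenCzxid;
        }
    }

    public static void main(String[] args) {
        System.out.println("stale after recreate: " + recreateInvalidatesVersion());
    }
}
```

With the real client, the Stat returned by exists() exposes getVersion() and getCzxid(). Logging both before the setData and again after the failure should tell the two cases apart: a higher version with the same czxid points to another writer, while a different czxid points to a delete and recreate.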

I don't think we have seen this version issue in the releases, so I'd
be inclined to say that there could be something in the code that's
making some changes to the znode before you set the data.

Hope that helps
thanks
mahadev

On Fri, Oct 7, 2011 at 6:47 PM, Ishaaq Chandy <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> We're seeing a puzzling error. Here's the scenario:
>
> 1. We have a single thread that wakes up every two seconds (give or take)
> and does some work
> 2. As part of that work it updates a node on ZK. When it does this it first
> gets the Stat of the existing node and uses the version retrieved from it to
> update the value.
> 3. There are no other processes updating the node
>
> The code goes something like this:
>  final Stat stat = zooKeeper.exists(path, false);
>  // do some other work here to create the path if it does not exist;
>  // this code only ever gets called once
>  zooKeeper.setData(path, value, stat.getVersion());
>
> What we're seeing is that every so often (once every 5 minutes or so?)
> that setData() call fails with a BadVersionException. This is very
> unexpected because, as I mentioned previously, this thread is the sole
> updater of that node.
>
> One possibility I am considering is that we are using the wrong number of
> ZKs in our cluster - i.e. 2 nodes. I am wondering if 2 is the worst number of
> nodes possible for ZK as there is no way to resolve a disagreement.
>
> Another possibility is that we are using an old version of ZK (3.2.2),
> perhaps there is a known bug with it? Though I see nothing related to this
> in the release logs for subsequent versions.
>
> Thoughts/suggestions?
>
> Thanks,
> Ishaaq
>