Re: What should I do with SyncDisconnected
Jordan Zimmerman 2013-03-13, 21:21
SyncDisconnected can occur for a variety of reasons. It belongs to the class of recoverable errors. Your app needs to go into a waiting state until it receives either SyncConnected again or SessionExpired. Have you read http://wiki.apache.org/hadoop/ZooKeeper/ErrorHandling ?
You should consider using one of the high-level ZooKeeper frameworks (such as Curator, which I wrote).
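To make the waiting state concrete, and to cover the worry about waiting forever, here is a minimal sketch of one way to bound it. It is not ZooKeeper or Curator code: ConnectionGate, its State and Decision enums, and the grace period are hypothetical names invented for illustration. The assumption is that your watcher callback translates the connection events it receives into onStateChange calls, and that the grace period is set shorter than the session timeout, since a partitioned client cannot see SessionExpired until it reconnects.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of ZooKeeper or Curator): decides whether a
 * server may keep serving its partitions, based on the last connection event.
 */
final class ConnectionGate {
    /** Local mirror of the three events discussed in this thread. */
    enum State { CONNECTED, DISCONNECTED, EXPIRED }

    /** What the server should do with an incoming client request. */
    enum Decision { SERVE, WAIT, SHUT_DOWN }

    private final long graceNanos;
    private State state = State.DISCONNECTED; // nothing is owned until connected
    private long disconnectedSinceNanos;

    /** graceMillis is an assumed tuning knob; keep it below the session timeout. */
    ConnectionGate(long graceMillis, long nowNanos) {
        this.graceNanos = TimeUnit.MILLISECONDS.toNanos(graceMillis);
        this.disconnectedSinceNanos = nowNanos;
    }

    /** Call from the watcher callback after mapping its event to a State. */
    synchronized void onStateChange(State newState, long nowNanos) {
        if (state == State.EXPIRED) {
            return; // expiry is terminal for this session
        }
        if (newState == State.DISCONNECTED && state != State.DISCONNECTED) {
            disconnectedSinceNanos = nowNanos; // start the grace clock only once
        }
        state = newState;
    }

    /** Call before handling a request; nowNanos would be System.nanoTime(). */
    synchronized Decision decide(long nowNanos) {
        switch (state) {
            case CONNECTED:
                return Decision.SERVE;
            case EXPIRED:
                return Decision.SHUT_DOWN;
            default:
                // Disconnected: wait, but give up once the grace period has run out.
                boolean graceOver = nowNanos - disconnectedSinceNanos >= graceNanos;
                return graceOver ? Decision.SHUT_DOWN : Decision.WAIT;
        }
    }
}
```

Request handlers would call decide(System.nanoTime()) before touching a partition: serve on SERVE, hold or reject the request on WAIT, and stop on SHUT_DOWN. Whether to queue or reject while waiting is an application choice.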
On Mar 13, 2013, at 2:01 PM, Ivan Kelly <[EMAIL PROTECTED]> wrote:
> Hi guys,
> We have a use case here where ZooKeeper is used to coordinate ownership
> of partitions of a resource. When one server dies, the partition
> should be moved to another server, etc. The action we need to take on
> SessionExpired is very clear. We just kill the server.
> However, it is unclear what we should do on a SyncDisconnected. We
> can't just kill our server, as it may have just been one ZooKeeper
> server failing. If we block all client requests to our server while we
> wait for SyncConnected, we may block forever in the case that our
> server is partitioned away from the zk cluster. If we continue to
> serve requests, we risk split brain.
> This is a risk anyhow without proper fencing, but a limited amount
> is OK in our application.
> What have people done in the past to resolve issues like this?