Re: question about data replication
You specify the MINIMUM number of copies when you define the number of
nodes in your ZK cluster.

The idea is that ZK requires strong consistency and provides guarantees to
that effect. The only way to provide those guarantees is if a majority of
the ZK cluster agrees to and persists all changes. That is in strong
contrast to Cassandra, which tries to provide availability instead of
consistency.

Since ZK requires a majority for every commit, a cluster defined with N
nodes requires ceiling((N+1)/2) nodes to commit every change. For the same
reason, N cannot be changed freely; changing it takes some care to make
sure that these guarantees are maintained.
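
As a rough illustration of the arithmetic above (this is only a sketch, not
anything taken from the ZK codebase; the function names quorum_size and
tolerated_failures are made up for the example), the majority size and the
number of node failures a cluster can survive work out like this:

```python
import math


def quorum_size(n: int) -> int:
    """Smallest number of nodes that is a strict majority of n nodes."""
    if n < 1:
        raise ValueError("a cluster needs at least one node")
    # ceiling((N+1)/2), as in the text; the same value as n // 2 + 1
    return math.ceil((n + 1) / 2)


def tolerated_failures(n: int) -> int:
    """Nodes that can be down while a majority can still commit."""
    return n - quorum_size(n)


if __name__ == "__main__":
    print("N  quorum  tolerated failures")
    for n in range(1, 8):
        print(f"{n}  {quorum_size(n)}       {tolerated_failures(n)}")
    # N  quorum  tolerated failures
    # 1  1       0
    # 2  2       0
    # 3  2       1
    # 4  3       1
    # 5  3       2
    # 6  4       2
    # 7  4       3
```

Note that N=4 tolerates no more failures than N=3, which is why odd-sized
clusters are the usual choice.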

On Tue, Nov 6, 2012 at 6:50 AM, Brian Tarbox <[EMAIL PROTECTED]> wrote:

> I'm working with both Cassandra and Zookeeper so please excuse me if this
> is a dumb question but does Zookeeper allow/require me to specify the
> number of copies of data (like Cassandra does) or is it simply the case
> that if a majority of nodes are up then ALL of my data is available?
> Thanks.  I'm guessing this should be obvious to me but searching the
> various docs didn't yield a clear answer.
> Thanks again.
> Brian Tarbox
> --
> http://about.me/BrianTarbox