I am working on a project and would be glad to have your thoughts on our
requirements and how well ZooKeeper meets them.
The facts:
* We need to share configuration items between 10 data centers.
Configuration must be kept synchronized between data centers (in practice we
can tolerate a few seconds of inconsistency)
* Configuration items will be serialized as JSON, and together they fit
into 256MB of heap
* The R/W ratio is 90% reads to 10% writes, and the number of clients should
be low (50 to 100 in each data center)
* A client running in one DC can freely communicate with a host in another DC
* Latency between data centers is 20 to 60 ms
* Only 1 host (machine) per data center can be dedicated to a ZooKeeper
process: the machines are big IBM AIX boxes, and only one is dedicated to
this project in each DC
* The project must survive a data center crash
Since the configuration items are small, must be kept synchronized, and we
need a fail-over mechanism, ZooKeeper appears to be a good candidate. However,
I'm not sure how to deploy it, mainly because we can start only one
ZooKeeper process in each data center.
My idea is to deploy 1 ZooKeeper server in only 5 of the 10 DCs. This way
there are 5 servers spread across the country and we can lose 2 of those
DCs. Of course, all the clients in all the data centers must know where the
5 ZooKeeper servers are.
Do you see any downside to doing this?
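To make the arithmetic behind "5 servers, lose 2 DCs" concrete, here is a
minimal sketch of the majority-quorum calculation, plus the comma-separated
host:port list that every client would be given. The host names are
hypothetical placeholders, and 2181 is only the usual default client port,
not something we have settled on.

```python
# Majority-quorum arithmetic for a ZooKeeper ensemble.
# Host names below are hypothetical placeholders, not real machines.

def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


def connect_string(hosts, port=2181):
    """Build the comma-separated host:port list given to each client."""
    return ",".join(f"{host}:{port}" for host in hosts)


# One server in each of the 5 chosen data centers (assumed names).
SERVERS = [
    "zk-dc1.example.com",
    "zk-dc2.example.com",
    "zk-dc3.example.com",
    "zk-dc4.example.com",
    "zk-dc5.example.com",
]

if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, "servers: quorum", quorum_size(n),
              "tolerates", tolerated_failures(n), "failures")
    print(connect_string(SERVERS))
```

Note that an even ensemble size does not buy extra fault tolerance: 6
servers tolerate the same 2 failures as 5, which is why I am thinking of an
odd number of servers.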
I know that ZooKeeper was designed to run on a LAN and on "commodity
hardware", but given the R/W ratio and the latency, do you think it is a
good idea to deploy it this way?
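For what it is worth, here is the rough back-of-the-envelope model I have in
mind for write latency over the WAN. It is only a sketch under several
assumptions: the 20 to 60 ms figures are treated as round-trip times, disk
sync and processing time are ignored, and the function name and sample
numbers are made up for illustration. It relies on the fact that a write is
forwarded to the leader and is committed once a majority of the ensemble
(the leader included) has acknowledged it, while a read is answered by the
server the client is connected to.

```python
# Rough estimate of ZooKeeper write latency across data centers.
# Assumptions: all inputs are round-trip times in milliseconds;
# disk fsync and CPU time are ignored; the sample values are invented.

def estimate_write_latency_ms(client_to_server_rtt,
                              server_to_leader_rtt,
                              leader_to_follower_rtts):
    """Client round trip + forwarding to the leader + quorum acknowledgement.

    server_to_leader_rtt is 0 when the client is connected to the leader.
    leader_to_follower_rtts holds one RTT per follower.
    """
    ensemble_size = len(leader_to_follower_rtts) + 1  # followers + leader
    # The leader counts toward the majority, so it needs this many follower acks.
    acks_needed = ensemble_size // 2
    if acks_needed == 0:
        quorum_rtt = 0  # single-server case: no follower to wait for
    else:
        # The commit waits for the acks_needed-th fastest follower.
        quorum_rtt = sorted(leader_to_follower_rtts)[acks_needed - 1]
    return client_to_server_rtt + server_to_leader_rtt + quorum_rtt


if __name__ == "__main__":
    # Client 20 ms from its server, server 40 ms from the leader,
    # leader 20/30/50/60 ms from its four followers.
    print(estimate_write_latency_ms(20, 40, [20, 30, 50, 60]))
```

With these sample numbers a write costs on the order of 90 ms, whereas a
read only costs the round trip between the client and the server it is
connected to. That is why I hope the 90% read ratio works in our favour.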
Thanks for your comments