Re: DR policies/HA setup in production - best practices
Sergei Babovich 2011-01-03, 20:58
With multiple IPs, using the host name is definitely an option.
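To make the host name idea concrete, here is a minimal client-side sketch in Python. It is not ZooKeeper code: it only assumes that one DNS name maps to both interface addresses of a blade, and that the name is resolved again on every connection attempt. The host name, the addresses, and the helper names `resolve_all` and `connect_any` are hypothetical.

```python
import socket


def resolve_all(host, port, resolver=socket.getaddrinfo):
    """Return every distinct (ip, port) the host name currently maps to."""
    seen = []
    for _family, _type, _proto, _canon, sockaddr in resolver(
            host, port, type=socket.SOCK_STREAM):
        addr = (sockaddr[0], sockaddr[1])
        if addr not in seen:
            seen.append(addr)
    return seen


def connect_any(host, port, timeout=2.0, resolver=socket.getaddrinfo,
                connector=socket.create_connection):
    """Resolve the name again on each call and return the first
    connection that succeeds, trying the addresses in resolver order."""
    last_error = None
    for addr in resolve_all(host, port, resolver):
        try:
            return connector(addr, timeout)
        except OSError as exc:
            # This interface (or its switch) may be down; try the next one.
            last_error = exc
    raise ConnectionError(
        f"no address for {host}:{port} is reachable") from last_error
```

Calling `connect_any("zk1.example.com", 2181)` after a dropped connection would pick up whichever address still answers. The resolver and connector are parameters only so the logic can be exercised without a network.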
The DR strategy is also understood. But what mechanism does ZK use to
resolve conflicts in that case? Let's say we have a primitive backup
strategy of shipping logs every hour. In theory it means (assuming the
worst case) that on the DR site every server will have a snapshot of the
data taken at a different point in time. When I bring the DR cluster up,
what is the protocol for resolving the inconsistencies? That was the
reason for my question: it felt (maybe naively) that recovering by
replicating from a single node's data (snapshot + log) would be a safer
and more consistent approach, since it is easier to make guarantees
about the result.
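One way to reason about backups taken at different times is to compare the transaction ids (zxids) they contain. The sketch below is a heuristic, not an official tool. It assumes the backup of each server holds files named `snapshot.<hex zxid>` and `log.<hex zxid>`, and that a zxid is a 64-bit number with the epoch in the high 32 bits and a counter in the low 32 bits. The name of a log file carries the first zxid written to it, so the value found this way is only a lower bound on what that server has seen. The server labels and function names are hypothetical.

```python
import re

# Matches ZooKeeper-style data file names such as snapshot.100000005
_ZXID_FILE = re.compile(r"^(snapshot|log)\.([0-9a-fA-F]+)$")


def highest_zxid(file_names):
    """Largest zxid encoded in snapshot.<hex>/log.<hex> names, or -1."""
    best = -1
    for name in file_names:
        match = _ZXID_FILE.match(name)
        if match:
            best = max(best, int(match.group(2), 16))
    return best


def split_zxid(zxid):
    """Split a zxid into (epoch, counter): high and low 32 bits."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def freshest_backup(backups):
    """backups maps a server label to its list of file names.
    Return the label whose files reach the highest zxid."""
    return max(backups, key=lambda label: highest_zxid(backups[label]))


backups = {
    "zk1": ["snapshot.100000005", "log.100000006"],
    "zk2": ["snapshot.100000002", "log.100000003"],
    "zk3": ["snapshot.200000001", "log.200000002"],
}
print(freshest_backup(backups))
```

My understanding, to be verified against the documentation, is that a restarted ensemble prefers as leader the server with the most recent zxid and the other servers then synchronize to it, which is why the comparison above is by zxid rather than by file timestamp.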
On 01/03/2011 03:05 PM, Mahadev Konar wrote:
> Hi Sergei,
> Responses in line:
> On 12/22/10 8:20 AM, "Sergei Babovich"<[EMAIL PROTECTED]> wrote:
>> Hi all,
>> We are currently looking at the best ways of deploying a ZK ensemble to
>> our production pod, and I have two things I'd like to clarify (sorry if
>> they have already been answered, but I did not find exact confirmation
>> in the admin guide).
>> 1. To provide redundancy, our pod has two network switches connected to
>> each blade through different interfaces, so if one switch fails the
>> blades will still be connected to the network. In practice this means
>> that each blade will have two IPs, and at least one of them should be
>> available. So my question is how to reflect this in the ZK
>> configuration. Is there any way to provide multiple addresses for a
>> single server? Is it just multiple records in the config file? Any
>> catch here? What is the best practice?
> We currently don't have any way of specifying two IP addresses for a single
> server. What we should do is use the hostname as the server address and
> resolve it to an IP address whenever a connection breaks, in either case
> (server to server or client to server).
> The server should be able to bind to all of its IP addresses using 0.0.0.0.
> Feel free to open a JIRA for the 3.4 release. This would be nice to have.
>> 2. The second question is about the best way of organizing DR policies.
>> Basically we want to periodically back up the ZK state so that we can
>> restore it remotely. For a single-node ensemble, backing up the last
>> data snapshot + log should be enough. But it is not clear to me what
>> the best practice would be for a cluster. Should I maintain a backup of
>> all nodes and try to restore them as a cluster? In that case, how will
>> the cluster resolve a possible time difference between the snapshots?
>> It feels sufficient to back up only one node and then bring the whole
>> cluster up from it, but how do I know that the node I am planning to
>> back up is the best one? Is it correct to say that it is safe to back
>> up any currently healthy node? What are the common practices here?
>> Sorry if the answers are well known, but I am just starting...
> The usual policy for a distributed setup is to back up all the nodes in
> your cluster. To restore the cluster, you should just use the same setup
> as the production one.
> Hope that helps.
>> This e-mail message and all attachments transmitted with it may contain
>> privileged and/or confidential information intended solely for the use of the
>> addressee(s). If the reader of this message is not the intended recipient, you
>> are hereby notified that any reading, dissemination, distribution, copying,
>> forwarding or other use of this message or its attachments is strictly
>> prohibited. If you have received this message in error, please notify the
>> sender immediately and delete this message, all attachments and all copies and
>> backups thereof.