Zookeeper, mail # user - Running Zookeeper in 2 machines


Re: Running Zookeeper in 2 machines
Cameron McKenzie 2013-11-05, 22:29
Yes, I guess in this situation there's no guarantee that A has the latest
data. I think that this is just an inherent limitation of quorum-based
writes, though. Unless you have three separate machines at geographically
redundant sites, I don't think that you have true redundancy.
cheers
Cam
On Wed, Nov 6, 2013 at 9:17 AM, Alexander Shraer <[EMAIL PROTECTED]> wrote:

> I don't think reconfiguration will help you here as it requires a
> quorum of the old and a quorum of the new ensembles, and here you're
> missing a quorum of the old one.
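
A minimal sketch of the condition described above, assuming a simple majority quorum and using the instance layout from later in the thread (instance 1 on machine A, instances 2 and 3 on machine B). The function names are hypothetical and this is not ZooKeeper's reconfiguration code:

```python
def has_majority(ensemble, alive):
    """True if the alive servers include a strict majority of `ensemble`."""
    members = set(ensemble)
    return len(members & set(alive)) > len(members) // 2


def can_reconfigure(old, new, alive):
    """Reconfiguration needs a quorum of the old AND of the new ensemble."""
    return has_majority(old, alive) and has_majority(new, alive)


old = {1, 2, 3}  # instance 1 on machine A, instances 2 and 3 on machine B
new = {1}        # the single-instance ensemble we would like to switch to

# Machine B is down: only 1 of the 3 old members is reachable.
print(can_reconfigure(old, new, alive={1}))     # False

# With instance 2 still reachable, both quorums exist.
print(can_reconfigure(old, new, alive={1, 2}))  # True
```

With machine B down, the new ensemble has its quorum but the old one does not, which is why reconfiguration cannot be used to get out of this state.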
>
> The problem is that you may have some committed operations on the B
> servers that A doesn't know about (writes are done to a quorum).
> Moreover, B may just be slow and may be still operational.
>
> To solve the problem here I think you need either a tie breaker, a
> reliable failure detection mechanism (such as when you're manually
> doing this because you're sure that B is down), or some kind of
> stronger synchrony assumption (e.g., if A didn't hear from B for 3
> sec, it means that B has crashed). ZK doesn't make such assumptions,
> so that it is more robust to network delays.
>
> Since this scenario seems very common, it may be interesting to
> implement some kind of tie-breaker quorum system in ZooKeeper.
>
> Alex
>
> On Tue, Nov 5, 2013 at 12:44 PM, Cameron McKenzie
> <[EMAIL PROTECTED]> wrote:
> > I have a similar problem to you. I have more than 2 machines, but only 2
> > geographically redundant sites.
> >
> > In your situation, you could get some redundancy by running 2 instances on
> > one host, and 1 instance on the other host. This would protect you from
> > temporary network glitches (because the machine with 2 instances can still
> > form a quorum), and will protect you from failure of the machine with the
> > single instance. It will not help you if the machine with 2 instances
> > crashes.
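
The arithmetic behind this layout can be sketched as follows, assuming a plain majority quorum over three instances. The machine names "A" and "B" and the helper `surviving_quorum` are illustrative only:

```python
def surviving_quorum(placement, failed_machine):
    """True if the instances left after `failed_machine` dies are a majority."""
    total = sum(len(ids) for ids in placement.values())
    alive = sum(
        len(ids) for machine, ids in placement.items() if machine != failed_machine
    )
    return alive > total // 2


# One instance on machine A, two instances on machine B.
placement = {"A": [1], "B": [2, 3]}

for machine in placement:
    status = "quorum survives" if surviving_quorum(placement, machine) else "quorum lost"
    print(f"machine {machine} fails: {status}")
```

Losing the single-instance machine leaves 2 of 3 instances, which is still a majority. Losing the two-instance machine leaves 1 of 3, which is not.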
> >
> > In this situation, where the 2 instance machine dies, you can temporarily
> > configure the 1 instance machine to be a single instance cluster, and then
> > when the 2 instance machine is recovered, you can reconfigure the single
> > instance machine to be part of the 3 instance cluster again. This process
> > is manual, and slightly dangerous, because if you restart nodes in the
> > wrong order, you have the potential to lose data. This is the approach that
> > I have tested and seems to work, but I'd recommend testing it also.
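
As a rough illustration of the two configurations involved, the sketch below renders the `server.<id>=<host>:<peer port>:<election port>` lines that go into zoo.cfg. The hostnames are hypothetical, the port numbers are only the values commonly seen in examples, and the second instance on machine B is given different ports on the assumption that two instances on one host cannot share them:

```python
def server_lines(servers):
    """Render zoo.cfg `server.N` lines from {id: (host, peer_port, election_port)}."""
    return "\n".join(
        f"server.{sid}={host}:{peer}:{election}"
        for sid, (host, peer, election) in sorted(servers.items())
    )


# Hypothetical hosts; instances 2 and 3 share machine B, so their ports differ.
FULL = {
    1: ("machine-a.example", 2888, 3888),
    2: ("machine-b.example", 2888, 3888),
    3: ("machine-b.example", 2889, 3889),
}

# Temporary configuration while machine B is down: instance 1 on its own.
DEGRADED = {1: FULL[1]}

print("# normal operation")
print(server_lines(FULL))
print("# while machine B is down")
print(server_lines(DEGRADED))
```

How a given ZooKeeper version treats a configuration that lists only one server is not covered here, so the degraded configuration should be tested before relying on it.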
> >
> > Machine A has ZK instance 1
> > Machine B has ZK instances 2 and 3
> >
> > 1. Machine B dies.
> > 2. Reconfigure ZK instance 1 so that it only has itself in the cluster.
> >    This means that there is no redundancy at this point, but it can form
> >    a quorum as it's the only instance in the cluster.
> > 3. Restart ZK instance 1 to pick up the config changes.
> > 4. Fix up Machine B.
> > 5. Reconfigure ZK instance 1 to have ZK instances 2 and 3 in its
> >    configuration.
> > 6. Restart ZK instance 1 to pick up the config changes.
> > 7. Start ZK instance 2 on Machine B.
> > 8. Wait for ZK instance 1 on Machine A and ZK instance 2 on Machine B to
> >    form a quorum. This is vitally important. If you start instance 3
> >    before a quorum is formed, it is possible that instances 2 and 3 will
> >    form a quorum. This will cause any updates that have occurred via
> >    instance 1 during the outage of Machine B to be lost.
> > 9. Start ZK instance 3 on Machine B.
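
The reason the start order matters can be shown with a toy model. It assumes that whichever majority forms first continues from the most recent state held by its own members, and it uses made-up transaction counters. It is a deliberate simplification and not ZooKeeper's actual election or synchronization protocol:

```python
def form_quorum(started, last_zxid, ensemble_size=3):
    """Return the state the ensemble continues from, or None without a majority.

    Toy model: the first majority to form adopts the newest state among
    its own members; instances that join later follow that state.
    """
    if len(started) <= ensemble_size // 2:
        return None
    return max(last_zxid[i] for i in started)


# Hypothetical counters: instance 1 kept accepting writes while machine B was down.
last_zxid = {1: 150, 2: 100, 3: 100}

safe = form_quorum({1, 2}, last_zxid)    # instance 2 started first, joins instance 1
unsafe = form_quorum({2, 3}, last_zxid)  # instances 2 and 3 find each other first

print("instances 1 and 2 form the quorum: continue from", safe)
print("instances 2 and 3 form the quorum: continue from", unsafe)
print("updates at risk in the second case:", last_zxid[1] - unsafe)
```

In this model the writes made through instance 1 survive only if instance 1 is part of the first quorum, which is what waiting before starting instance 3 is meant to guarantee.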
> >
> > This process should become easier once dynamic reconfiguration is
> > implemented (in ZK 3.5 I believe?) because restarts won't be required.
> > cheers
> > Cam
> >
> > On Tue, Nov 5, 2013 at 6:05 PM, erolagnab <[EMAIL PROTECTED]> wrote:
> >
> >> Thanks, I got the idea now. So is it fair to say that it is not possible to
> >> create a ZK cluster providing some redundancy with 2 physical machines? If
> >> so, is there a way to make it happen?
> >>
> >>
> >>
> >> --
> >> View this message in context:
> >> http://zookeeper-user.578899.n2.nabble.com/Running-Zookeeper-in-2-machines-tp7579232p7579237.html