Zookeeper, mail # user - avoiding the risk of starting ZooKeeper servers in the same ensemble with different transaction logs


Re: avoiding the risk of starting ZooKeeper servers in the same ensemble with different transaction logs
German Blanco 2013-09-28, 06:33
Thank you, I will check. I must admit I haven't searched for this enough :-(
I have also noticed that there is a development option in 3.5.0 to force
synchronisation via snapshot. That should avoid the problem.
There is still something strange that I have noticed, though. Two nodes
showed up in one of the followers of the ensemble (3.4.5 with 3 servers)
that were not present in the others, and they were ephemeral nodes. I don't
think it is easy for ephemeral nodes to be carried over in data files from
another ensemble, since normally the session that created them wouldn't
exist there and they would expire.
Unfortunately, I no longer have the logs from when this happened. The servers
were running with DEBUG logging and the logs rotated.
Any ideas?
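
For anyone who wants to check for this kind of divergence, here is a minimal
sketch in Python. It assumes you have already collected the set of znode paths
seen through each server (for example with a client connected to one server at
a time); the function name and the server names are made up for illustration
and are not part of ZooKeeper.

```python
def diff_tree_views(views):
    """Report znode paths that are not visible on every server.

    `views` maps a server name to the set of znode paths seen through it.
    Returns a dict mapping each divergent path to the sorted list of
    servers on which that path is missing.
    """
    all_paths = set().union(*views.values())
    divergent = {}
    for path in sorted(all_paths):
        missing = sorted(name for name, paths in views.items() if path not in paths)
        if missing:
            divergent[path] = missing
    return divergent


# Hypothetical views: one follower shows an extra node the others lack.
views = {
    "zk1": {"/app", "/app/lock-1"},
    "zk2": {"/app"},
    "zk3": {"/app"},
}
print(diff_tree_views(views))
```

An empty result means all servers show the same set of paths. It says nothing
about node data or versions, which would need a separate comparison.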
On Fri, Sep 27, 2013 at 2:00 AM, Alexander Shraer <[EMAIL PROTECTED]> wrote:

> Some time in the past we were discussing adding a unique identifier
> for each ensemble in the config files and checking it, for example
> when a server tries to connect to the leader. I'm not sure if there is a Jira
> for this.
>
>
> On Wed, Sep 25, 2013 at 7:44 AM, German Blanco <
> [EMAIL PROTECTED]> wrote:
>
> > Exactly.
> > I know it is silly, but I think this is what happened, and I would feel
> > better if there was a way to prevent it from happening again.
> >
> >
> > On Wed, Sep 25, 2013 at 4:37 AM, Benjamin Reed <[EMAIL PROTECTED]> wrote:
> >
> > > when you say inconsistent transaction log, are you talking about a
> > > transaction log from a different ensemble instance?
> > >
> > > for example, you ran zookeeper and did some things. then you reset all
> > > the servers but one and restarted everything.
> > >
> > > ben
> > >
> > >
> > > On Tue, Sep 24, 2013 at 5:45 AM, German Blanco <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > Hello,
> > > > I have run into this situation a couple of times.
> > > > Because of an error, one ZooKeeper server in the ensemble is started
> > > > with an inconsistent transaction log. This leads to serious and
> > > > difficult to trace problems, until you notice that clients connected
> > > > to one of the servers see a different data tree than the others.
> > > > I would really like to avoid this, and it happens that the amount of
> > > > data in my data tree is not that much (around 40 kBytes). So I would
> > > > like to propose a new option to force synchronization via snapshot in
> > > > the ZooKeeper Leader.
> > > > Any opinions?
> > > > Any other options?
> > > > Regards,
> > > >
> > > > Germán Blanco.
> > > >
> > >
> >
>
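
The ensemble identifier check discussed in the quoted messages could look
roughly like the sketch below. This is only an illustration of the idea, not
existing ZooKeeper behaviour: the exception, the function and the identifier
values are all hypothetical, and the sketch assumes each server reads its
identifier from its own config file.

```python
class EnsembleMismatchError(Exception):
    """Raised when a server presents an identifier from a different ensemble."""


def check_ensemble_id(leader_ensemble_id: str, learner_ensemble_id: str) -> None:
    # Hypothetical check the leader could run when a server connects:
    # refuse to sync with a server whose identifier does not match.
    if learner_ensemble_id != leader_ensemble_id:
        raise EnsembleMismatchError(
            f"learner belongs to ensemble {learner_ensemble_id!r}, "
            f"leader belongs to {leader_ensemble_id!r}"
        )


# Matching identifiers pass silently; a mismatch is rejected.
check_ensemble_id("ens-1", "ens-1")
try:
    check_ensemble_id("ens-1", "ens-2")
except EnsembleMismatchError as err:
    print("rejected:", err)
```

Such a check would only help if the identifier changes whenever the data of
the ensemble is reset. If the servers are wiped but keep the same config, the
identifiers would still match.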