Zookeeper >> mail # user >> Re: Efficient backup and a reasonable restore of an ensemble


RE: Efficient backup and a reasonable restore of an ensemble
The snapshot might have some of those transactions; it depends on when it
reads the znode affected by each transaction. Say you have txn T that sets
the data of /a. When generating the snapshot, if it serializes /a before T
is committed, then the snapshot will not include T. Otherwise, it includes T.

-Flavio
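
[Illustration] The behaviour described above can be modelled in a few lines. This is only a toy sketch, not ZooKeeper code: the dict standing in for the DataTree, the function names, and the rule that exactly one write commits after each znode is serialized are all assumptions made for the example. It shows a snapshot that contains one in-flight transaction but not the other, and that replaying the same transactions on top of it still converges, because setting a znode's data is idempotent.

```python
def fuzzy_snapshot(tree, pending_txns, serialize_order):
    """Copy znodes one at a time while writes keep committing (toy model)."""
    snapshot = {}
    txns = list(pending_txns)
    for path in serialize_order:
        snapshot[path] = tree[path]   # serialize this znode as it is right now
        if txns:                      # assumed: one write commits after each copy
            txn_path, data = txns.pop(0)
            tree[txn_path] = data
    return snapshot


def replay(snapshot, txns):
    """Apply set-data transactions on top of a snapshot; re-applying is harmless."""
    state = dict(snapshot)
    for path, data in txns:
        state[path] = data
    return state


tree = {"/a": "a0", "/b": "b0"}
txns = [("/b", "b1"), ("/a", "a1")]   # both commit while the snapshot is running

snap = fuzzy_snapshot(tree, txns, ["/a", "/b"])
# /a was serialized before its write committed, /b after its write committed,
# so snap holds the old /a and the new /b: "some" of the transactions.
restored = replay(snap, txns)
```

Here `snap` matches no single point in time, yet `restored` equals the live `tree` once the logged transactions are replayed.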

-----Original Message-----
From: Sergey Maslyakov [mailto:[EMAIL PROTECTED]]
Sent: 09 July 2013 18:03
To: [EMAIL PROTECTED]
Subject: Re: Efficient backup and a reasonable restore of an ensemble

I think I am having difficulties understanding the "fuzzy" concept. Let's
say I started to serialize DataTree into a snapshot file and it took 30
seconds. During these 30 seconds, the server saw 5 transactions that updated
the data. Does this mean that the snapshot that I get on disk at the end of
the 30-second interval will have some of these 5 transactions?
Or will it have none? Or will it have all of them? Or will it be
inconsistent and unreadable by Zookeeper?

Please help me better understand the behavior behind the "fuzzy" term.

For my use case, I am perfectly fine if I get a snapshot with none of these
5 transactions, considering that I will pick them up next time I take a
snapshot.
/Sergey
On Tue, Jul 9, 2013 at 12:08 AM, kishore g <[EMAIL PROTECTED]> wrote:

> It's not really elaborate; it is very similar to what ZooKeeper does
> when it starts up. It first reads the latest snapshot file and then
> the transaction logs and applies each and every transaction. What I am
> suggesting is that instead of applying all transactions, stop at a
> transaction I provide.
>
> Having this tool will actually simplify your task; you can go back to
> any point in time. Think of something like this.
>
> checkpoint A // this can store the last zxid or timestamp from the leader.
> make changes to zk
> // if things fail
> stop zks
> rollback A // run this on each zk; brings the cluster back to its
> // previous state.
> start zks // any order should be fine.
>
>
> Also keep in mind that a snapshot is fuzzy only if there are writes
> happening while it is being taken. If you are sure no writes will happen
> while you are taking the snapshot, then you are good. Experts, please
> correct me if this is incorrect.
>
> thanks,
> Kishore G
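
[Illustration] The restore idea in the message above (load a snapshot, replay the transaction log, stop at a caller-supplied zxid) can be sketched as follows. This is a simplified model under stated assumptions, not the tool Kishore mentions and not ZooKeeper's startup code: the log is a plain list of hypothetical `(zxid, path, data)` tuples, only set-data transactions exist, and `restore_to` is an invented name.

```python
def restore_to(snapshot, snapshot_zxid, log, target_zxid):
    """Rebuild state from a snapshot plus log entries up to target_zxid.

    snapshot      -- dict of path -> data (assumed in-memory form)
    snapshot_zxid -- zxid the snapshot is labelled with
    log           -- iterable of (zxid, path, data), ordered by zxid
    target_zxid   -- last transaction that should be applied
    """
    state = dict(snapshot)
    for zxid, path, data in log:
        if zxid <= snapshot_zxid:
            continue          # already reflected in the snapshot
        if zxid > target_zxid:
            break             # stop at the checkpoint instead of replaying all
        state[path] = data
    return state


log = [(1, "/a", "a1"), (2, "/b", "b1"), (3, "/a", "a2"), (4, "/b", "b2")]
checkpoint = 3                # zxid noted before making risky changes
state = restore_to({"/a": "a1"}, 1, log, checkpoint)
```

In this run the transaction with zxid 4 is left out, so `state` is what the data looked like at the checkpoint.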
>
>
> On Mon, Jul 8, 2013 at 9:42 PM, Sergey Maslyakov <[EMAIL PROTECTED]>
> wrote:
>
> > Kishore,
> >
> > This sounds like a very elaborate tool. I was trying to find a
> > simplistic approach but what Thawan said about "fuzzy snapshots"
> > makes me a little afraid that there is no simple solution.
> >
> >
> > On Mon, Jul 8, 2013 at 11:05 PM, kishore g <[EMAIL PROTECTED]> wrote:
> >
> > > Agree, we already have such a tool. In fact we use it to
> > > reconstruct the sequence of events that led to a failure and
> > > actually restore the system to a previous stable point and replay
> > > the events. Unfortunately this is tied closely with Helix but it
> > > should be easy to make this a generic tool.
> > >
> > > Sergey, is this something that will be useful in your case?
> > >
> > > Thanks,
> > > Kishore G
> > >
> > >
> > > On Mon, Jul 8, 2013 at 8:09 PM, Thawan Kooburat <[EMAIL PROTECTED]> wrote:
> > >
> > > > > On the restore part, I think having a separate utility to manipulate
> > > > > the data/snap dir (by truncating the log/removing snapshots to a
> > > > > given zxid) would be easier than modifying the server.
> > > >
> > > >
> > > > --
> > > > Thawan Kooburat
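
[Illustration] A separate utility of the kind suggested above would first have to decide which files to touch. The sketch below only plans that step and deletes nothing. It assumes the files are named `snapshot.<hex zxid>` and `log.<hex zxid>`, with the log suffix being the first zxid written to that file; `plan_rollback` and the sample file names are hypothetical.

```python
import re

# Assumed naming convention: "snapshot.<hex zxid>" and "log.<hex zxid>".
_NAME = re.compile(r"^(snapshot|log)\.([0-9a-fA-F]+)$")


def plan_rollback(filenames, target_zxid):
    """Split file names into (keep, remove) for a rollback to target_zxid."""
    keep, remove = [], []
    for name in sorted(filenames):
        match = _NAME.match(name)
        if match is None:
            keep.append(name)             # unknown file: leave it alone
            continue
        zxid = int(match.group(2), 16)
        if zxid > target_zxid:
            remove.append(name)           # starts after the target
        else:
            keep.append(name)             # may still need truncating (logs)
    return keep, remove


files = ["snapshot.0", "log.1", "snapshot.a", "log.b", "snapshot.14", "log.15"]
keep, remove = plan_rollback(files, 0x10)
```

Two things this plan does not do: the newest kept log (`log.b` here) can still hold transactions past the target and would need truncating, and a kept snapshot that was taken during writes may already contain changes later than the zxid in its name.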
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 7/8/13 6:34 PM, "kishore g" <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > >I think what we are looking at is a point in time restore
> > > > > >functionality. How about adding a feature that says go back to
> > > > > >a specific zxid/timestamp.
> > > > > >This way before doing any change to zookeeper simply note down
> > > > > >the timestamp/zxid on leader. If things go wrong after making
> > > > > >changes, bring down zookeepers and provide additional parameter of a