Zookeeper >> mail # user >> Re: Efficient backup and a reasonable restore of an ensemble


Re: Efficient backup and a reasonable restore of an ensemble
It's not really elaborate; it is very similar to what ZooKeeper does when it
starts up. It first reads the latest snapshot file and then the transaction
logs, and applies each and every transaction. What I am suggesting is that,
instead of applying all transactions, it stops at a transaction I provide.
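
A minimal sketch of that idea, as a toy model only: the snapshot is a plain dict and each transaction is a (zxid, op, path, data) tuple. This is not ZooKeeper's on-disk format, and restore_to_zxid is a hypothetical name, not an existing ZooKeeper API.

```python
from typing import Dict, Iterable, Optional, Tuple

# Toy transaction record: (zxid, op, path, data). Illustrative only.
Txn = Tuple[int, str, str, Optional[bytes]]


def restore_to_zxid(
    snapshot: Dict[str, Optional[bytes]],
    snapshot_zxid: int,
    txns: Iterable[Txn],
    target_zxid: int,
) -> Tuple[Dict[str, Optional[bytes]], int]:
    """Load a snapshot, then replay logged transactions up to target_zxid."""
    state = dict(snapshot)
    last_applied = snapshot_zxid
    for zxid, op, path, data in sorted(txns, key=lambda t: t[0]):
        if zxid <= snapshot_zxid:
            continue  # older than the snapshot, nothing to replay
        if zxid > target_zxid:
            break  # stop at the requested point instead of replaying everything
        if op in ("create", "set"):
            state[path] = data
        elif op == "delete":
            state.pop(path, None)
        last_applied = zxid
    return state, last_applied


log = [
    (1, "create", "/a", b"1"),
    (2, "set", "/a", b"2"),
    (3, "create", "/b", b"x"),
    (4, "delete", "/a", None),
]
state, last = restore_to_zxid({"/a": b"1"}, 1, log, target_zxid=3)
print(state, last)
```

Passing a target_zxid at or beyond the last logged transaction gives the normal startup behaviour of replaying everything.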

Having this tool will actually simplify your task: you can go back to any
point in time. Think of something like this.

checkpoint A // this can store the last zxid or timestamp from the leader
make changes to zk
// if things fail
stop zks
rollback A // run this on each zk; brings the cluster back to its previous state
start zks // any order should be fine
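
As a rough sketch of what "rollback A" would need to locate on each server, assuming the data dir uses ZooKeeper's snapshot.<hex zxid> and log.<hex zxid> file names (a log's suffix being the zxid of its first transaction). The helper names are hypothetical, and the actual truncation of the last log is not shown.

```python
import re
from typing import Iterable, List, Optional, Tuple

_NAME = re.compile(r"^(snapshot|log)\.([0-9a-fA-F]+)$")


def parse_name(name: str) -> Optional[Tuple[str, int]]:
    """Return (kind, zxid) for a snapshot/log file name, else None."""
    m = _NAME.match(name)
    return (m.group(1), int(m.group(2), 16)) if m else None


def files_for_rollback(names: Iterable[str], target_zxid: int) -> Tuple[str, List[str]]:
    """Pick the snapshot to load and the logs to replay to reach target_zxid."""
    parsed = [p for p in map(parse_name, names) if p is not None]
    snaps = sorted(z for kind, z in parsed if kind == "snapshot" and z <= target_zxid)
    if not snaps:
        raise ValueError("no snapshot at or before the checkpoint zxid")
    snap_zxid = snaps[-1]
    logs = sorted(z for kind, z in parsed if kind == "log" and z <= target_zxid)
    # The log holding the first transaction after the snapshot is the last
    # one that starts at or before snap_zxid + 1.
    older = [z for z in logs if z <= snap_zxid + 1]
    first = older[-1] if older else (logs[0] if logs else None)
    keep = [z for z in logs if first is not None and z >= first]
    # Note: the last kept log may still contain transactions past target_zxid;
    # those must be skipped or truncated during replay.
    return f"snapshot.{snap_zxid:x}", [f"log.{z:x}" for z in keep]


names = ["snapshot.0", "log.1", "snapshot.64", "log.65", "snapshot.c8", "log.c9", "notes.txt"]
print(files_for_rollback(names, 0x96))
```
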
Also keep in mind that a snapshot is fuzzy only if there are writes happening
while the snapshot is being taken. If you are sure no writes will happen while
you are taking the snapshot, then you are good. Experts, please correct me if
this is incorrect.
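
To illustrate what I mean by fuzzy, here is a toy model (plain dicts, hypothetical function names, nothing from the ZooKeeper code base). It assumes the writes that land during the copy are idempotent "set" operations that are also recorded in the log.

```python
from typing import Dict, List, Tuple

Write = Tuple[str, int]


def take_fuzzy_snapshot(live: Dict[str, int], writes_during: List[Write]) -> Dict[str, int]:
    """Copy `live` path by path; all of `writes_during` land after the first copy."""
    snapshot: Dict[str, int] = {}
    for i, path in enumerate(sorted(live)):
        snapshot[path] = live[path]
        if i == 0:
            for p, v in writes_during:
                live[p] = v  # the server keeps serving writes mid-snapshot
    return snapshot


def replay(snapshot: Dict[str, int], writes: List[Write]) -> Dict[str, int]:
    """Re-apply logged writes on top of a snapshot; sets are idempotent."""
    state = dict(snapshot)
    for p, v in writes:
        state[p] = v
    return state


live = {"/a": 1, "/b": 1}
writes = [("/a", 2), ("/b", 2)]
fuzzy = take_fuzzy_snapshot(live, writes)

print(fuzzy)                  # mixes old /a with new /b: a state that never existed
print(replay(fuzzy, writes))  # replaying the log converges to the live state
```

With an empty writes_during list, the snapshot is simply an exact copy of the live state, which is the quiet case described above.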

thanks,
Kishore G
On Mon, Jul 8, 2013 at 9:42 PM, Sergey Maslyakov <[EMAIL PROTECTED]> wrote:

> Kishore,
>
> This sounds like a very elaborate tool. I was trying to find a simplistic
> approach but what Thawan said about "fuzzy snapshots" makes me a little
> afraid that there is no simple solution.
>
>
> On Mon, Jul 8, 2013 at 11:05 PM, kishore g <[EMAIL PROTECTED]> wrote:
>
> > Agree, we already have such a tool. In fact we use it to reconstruct the
> > sequence of events that led to a failure and actually restore the system to
> > a previous stable point and replay the events. Unfortunately this is tied
> > closely to Helix, but it should be easy to make this a generic tool.
> >
> > Sergey, is this something that would be useful in your case?
> >
> > Thanks,
> > Kishore G
> >
> >
> > On Mon, Jul 8, 2013 at 8:09 PM, Thawan Kooburat <[EMAIL PROTECTED]> wrote:
> >
> > > On restore part, I think having a separate utility to manipulate the
> > > data/snap dir (by truncating the log/removing snapshot to a given zxid)
> > > would be easier than modifying the server.
> > >
> > >
> > > --
> > > Thawan Kooburat
> > >
> > >
> > >
> > >
> > >
> > > On 7/8/13 6:34 PM, "kishore g" <[EMAIL PROTECTED]> wrote:
> > >
> > > >I think what we are looking at is point-in-time restore functionality.
> > > >How about adding a feature that says go back to a specific zxid/timestamp?
> > > >This way, before making any change to ZooKeeper, simply note down the
> > > >timestamp/zxid on the leader. If things go wrong after making changes, bring
> > > >down the ZooKeeper servers and provide an additional zxid/timestamp parameter
> > > >while restarting. The server can go to the exact point and make it current.
> > > >The followers can be started blank.
> > > >
> > > >
> > > >
> > > >On Mon, Jul 8, 2013 at 5:53 PM, Thawan Kooburat <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> Just saw that this is the corresponding use case to the question posted
> > > >> on the dev list.
> > > >>
> > > >> In order to restore the data to a given point in time correctly, you need
> > > >> both the snapshot and the txnlog. This is because a ZooKeeper snapshot is
> > > >> fuzzy, and a snapshot alone may not represent a valid state of the server
> > > >> if there are in-flight requests.
> > > >>
> > > >> The 4lw command should cause the server to roll the log and take a
> > > >> snapshot, similar to the periodic snapshotting operation. Your backup
> > > >> script needs to grab the snapshot and the corresponding txnlog file from
> > > >> the data dir.
> > > >>
> > > >> To restore, just shut down all hosts, clear the data dir, copy over the
> > > >> snapshot and txnlog, and restart them.
> > > >>
> > > >>
> > > >> --
> > > >> Thawan Kooburat
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On 7/8/13 3:28 PM, "Sergey Maslyakov" <[EMAIL PROTECTED]> wrote:
> > > >>
> > > >> >Thank you for your response, Flavio. I apologize, I did not provide a
> > > >> >clear explanation of the use case.
> > > >> >
> > > >> >This backup/restore is not intended to be tied to any write event,
> > > >> >instead,