Zookeeper, mail # user - Backup/restore of an emsemble


Re: Backup/restore of an emsemble
Edward Ribeiro 2013-07-05, 21:59
Yes! :-) Please CC me.

Edward
On 05/07/2013 17:53, "Sergey Maslyakov" <[EMAIL PROTECTED]> wrote:

> Thank you for the reference, Edward! It looks like Guano mostly
> accomplishes what I want. I spent a little time today digging in
> the guts of ZK and found a few ways to make taking a backup more efficient
> by delegating all the work to the server. Describing them is beyond the
> scope of a user mailing list, so I intend to send a short
> write-up to the "dev" mailing list. Please let me know if you are
> interested in the subject and I can CC you.
>
>
> On Fri, Jul 5, 2013 at 12:43 PM, Edward Ribeiro <[EMAIL PROTECTED]> wrote:
>
> > Sergey,
> >
> > Have you looked into the Guano project (https://github.com/d2fn/guano)?
> > It seems to do exactly what you want, or at least comes close enough to a
> > dump/restore that you can start from there.
> >
> > Edward
> >
> >
> > On Fri, Jul 5, 2013 at 1:40 PM, Sergey Maslyakov <[EMAIL PROTECTED]>
> > wrote:
> >
> > > The nature of the system that I am working with is mostly read-heavy.
> > > Writes happen rarely. In the event of disaster recovery to some point in
> > > time in the past, the system should tolerate minor inconsistencies.
> > >
> > > I am also considering having the ZooKeeper server create a copy of the
> > > DataTree and then serialize it into a file, which will later be picked
> > > up by the import client. This could be the most efficient way of taking a
> > > backup.
> > >
> > >
> > >
> > > On Fri, Jul 5, 2013 at 11:33 AM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
> > >
> > > > In your approach, would you "lock" the ZooKeeper state and read the data
> > > > tree using getData/getChildren? If you have concurrent updates, then you
> > > > may end up with an inconsistent snapshot.
> > > >
> > > > -Flavio
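
The concern about concurrent updates can be illustrated with a small simulation. This is only a sketch: a plain dict stands in for the ensemble, and the paths, values, and the function name are made up for illustration. A walker reads two znodes one at a time while another client updates both between the two reads.

```python
def walk_with_interleaved_write(tree, paths, write_after, updates):
    """Read `paths` one by one, like a getData loop over the namespace.

    Right after reading `write_after`, apply `updates` to the tree to
    simulate another client's writes landing in the middle of the walk.
    """
    snapshot = {}
    for path in paths:
        snapshot[path] = tree[path]
        if path == write_after:
            tree.update(updates)  # concurrent writes arrive mid-walk
    return snapshot


# Hypothetical data: two settings that are always changed together.
tree = {"/config/host": "db1", "/config/port": "5432"}
snap = walk_with_interleaved_write(
    tree,
    ["/config/host", "/config/port"],
    write_after="/config/host",
    updates={"/config/host": "db2", "/config/port": "6432"},
)
# `snap` now pairs the old host with the new port,
# a combination the tree never held at any single moment.
```

Whether such a mixed snapshot matters depends on how strongly the znodes relate to each other and on how often writes happen.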
> > > >
> > > > -----Original Message-----
> > > > From: Sergey Maslyakov [mailto:[EMAIL PROTECTED]]
> > > > Sent: 05 July 2013 17:12
> > > > To: [EMAIL PROTECTED]
> > > > Subject: Re: Backup/restore of an emsemble
> > > >
> > > > Yes, Flavio, I looked at Exhibitor, but I need fairly granular control
> > > > over a cluster of ZK servers. This is why I'm inclined to build something
> > > > by hand. So far, a pair of external export and import clients seems like a
> > > > promising approach.
> > > >
> > > > Export would connect to the ensemble and dump the data into a file on
> > > > disk. Import would connect, wipe out the namespace, and then reload the
> > > > data from the file that was earlier created by the export client.
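
A minimal sketch of such an export/import pair, under several assumptions not stated in the thread: the client object exposes `get`, `get_children`, `set`, `create`, and `delete(path, recursive=True)` in the style of the kazoo Python library; the backup covers a single application subtree (the hypothetical `/app`) rather than `/`; and the dump is a JSON file with base64-encoded znode data. ACLs and ephemeral nodes are not handled, and nothing here guards against concurrent writers.

```python
import base64
import json


def export_tree(client, root):
    """Walk the subtree at `root` and return {path: base64-encoded data}."""
    dump = {}
    stack = [root]
    while stack:
        path = stack.pop()
        data, _stat = client.get(path)
        dump[path] = base64.b64encode(data or b"").decode("ascii")
        for child in client.get_children(path):
            stack.append(path.rstrip("/") + "/" + child)
    return dump


def import_tree(client, dump, root):
    """Wipe everything below `root`, then reload the nodes parents-first."""
    for child in client.get_children(root):
        client.delete(root.rstrip("/") + "/" + child, recursive=True)
    # Fewer slashes means closer to the root, so parents are created first.
    for path in sorted(dump, key=lambda p: p.count("/")):
        data = base64.b64decode(dump[path])
        if path == root:
            client.set(path, data)  # the root already exists; only reset its data
        else:
            client.create(path, data)


def write_dump(dump, filename):
    """Store the exported tree on disk as JSON."""
    with open(filename, "w") as f:
        json.dump(dump, f, indent=2, sort_keys=True)


def read_dump(filename):
    """Load a tree previously stored by write_dump."""
    with open(filename) as f:
        return json.load(f)
```

With a connected client `zk`, the export side would be `write_dump(export_tree(zk, "/app"), "backup.json")` and the import side `import_tree(zk, read_dump("backup.json"), "/app")`. The file name and root path are placeholders.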
> > > >
> > > >
> > > >
> > > > On Fri, Jul 5, 2013 at 8:31 AM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Sergey,
> > > > >
> > > > > Have you had a look at Exhibitor?
> > > > >
> > > > > https://github.com/Netflix/exhibitor
> > > > >
> > > > > -Flavio
> > > > >
> > > > > -----Original Message-----
> > > > > From: Sergey Maslyakov [mailto:[EMAIL PROTECTED]]
> > > > > Sent: 05 July 2013 04:39
> > > > > To: [EMAIL PROTECTED]
> > > > > Subject: Backup/restore of an emsemble
> > > > >
> > > > > A while ago, Jack Ma asked this question:
> > > > >
> > > > >
> > > > > http://mail-archives.apache.org/mod_mbox/zookeeper-user/201306.mbox/%3CCAB%2BcfdyPDpbUh5FyDT%3D9mU%3DFCHEA1AZpkF6X0nN1t4mjwqu2tA%40mail.gmail.com%3E
> > > > >
> > > > > I wonder if there were any helpful suggestions that did not make it
> > > > > onto the mailing list.
> > > > >
> > > > > I am mostly concerned about restoring data in a Zookeeper ensemble.
> > > > >
> > > > > There is no document on the project website that explains the
> > > > > restore procedure for a distributed deployment. The home-grown
> > > > > solution that involves stopping the whole cluster, wiping out
> > > > > databases on all but one server, restoring the database on one server,
> > > > > and then bringing up the cluster and praying that the populated server