Zookeeper, mail # user - Backup/restore of an ensemble


Re: Backup/restore of an ensemble
Edward Ribeiro 2013-07-05, 17:43
Sergey,

Have you looked into the Guano project (https://github.com/d2fn/guano)? It
seems to do exactly what you want, or at least it provides a dump/restore
that is close enough for you to start from.

Edward
On Fri, Jul 5, 2013 at 1:40 PM, Sergey Maslyakov <[EMAIL PROTECTED]> wrote:

> The nature of the system that I am working with is mostly read-heavy.
> Writes happen rarely. In the event of disaster recovery to some point in
> time in the past, the system should be able to tolerate minor inconsistencies.
>
> I am also considering having the ZooKeeper server create a copy of the
> DataTree and serialize it into a file, which would later be picked up
> by the import client. This could be the most efficient way of taking a
> backup.
>
>
>
> On Fri, Jul 5, 2013 at 11:33 AM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
>
> > In your approach, would you "lock" the ZooKeeper state and read the data
> > tree using getData/getChildren? If you have concurrent updates, then you
> > may end up with an inconsistent snapshot.
> >
> > -Flavio
> >
> > -----Original Message-----
> > From: Sergey Maslyakov [mailto:[EMAIL PROTECTED]]
> > Sent: 05 July 2013 17:12
> > To: [EMAIL PROTECTED]
> > Subject: Re: Backup/restore of an ensemble
> >
> > Yes, Flavio, I looked at Exhibitor, but I need pretty granular control
> > over a cluster of ZK servers. This is why I'm inclined to build something
> > by hand. So far, a pair of external export and import clients seems like a
> > promising approach.
> >
> > Export would connect to the ensemble and dump the data into a file on
> > disk. Import would connect, wipe out the namespace, and then reload the
> > data from the file that was created earlier by the export client.
> >
> >
> >
> > On Fri, Jul 5, 2013 at 8:31 AM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
> >
> > > Sergey,
> > >
> > > Have you had a look at Exhibitor?
> > >
> > > https://github.com/Netflix/exhibitor
> > >
> > > -Flavio
> > >
> > > -----Original Message-----
> > > From: Sergey Maslyakov [mailto:[EMAIL PROTECTED]]
> > > Sent: 05 July 2013 04:39
> > > To: [EMAIL PROTECTED]
> > > > Subject: Backup/restore of an ensemble
> > >
> > > A while ago, Jack Ma asked this question:
> > >
> > > > http://mail-archives.apache.org/mod_mbox/zookeeper-user/201306.mbox/%3CCAB%2BcfdyPDpbUh5FyDT%3D9mU%3DFCHEA1AZpkF6X0nN1t4mjwqu2tA%40mail.gmail.com%3E
> > >
> > > I wonder if there were any helpful suggestions that did not go into
> > > the mailing list.
> > >
> > > > I am mostly concerned about restoring data in a ZooKeeper ensemble.
> > >
> > > > There is no document on the project web site that explains the
> > > > restore procedure for a distributed deployment. The home-grown
> > > > solution involves stopping the whole cluster, wiping out the
> > > > databases on all but one server, restoring the database on that
> > > > server, and then bringing up the cluster and praying that the
> > > > populated server becomes the leader and populates the cluster. Such
> > > > a solution seems too error-prone.
> > >
> > > Does anyone have recommendations on how to make it robust?
> > >
> > > Maybe there is a way to force-populate the ensemble remotely?
> > >
> > >
> > > Thanks,
> > > /Sergey
> > >
> > >
> >
> >
>
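
For reference, the export/import pair of clients that Sergey describes in the
thread could be sketched roughly as below. This is only an illustration under
stated assumptions, not code from any of the projects mentioned above.
InMemoryTree is a hypothetical stand-in for a client session so the sketch
runs without an ensemble; its method names (get_children, get, create, delete)
and the JSON/base64 dump format are choices made for this example. The sketch
ignores ACLs, ephemeral nodes, and the reserved /zookeeper subtree of a real
ensemble, and it reads node by node, so concurrent writes during export can
still produce the inconsistent snapshot that Flavio warns about.

```python
import base64
import json


class InMemoryTree:
    """Hypothetical stand-in for a client session (not a real ZooKeeper API)."""

    def __init__(self):
        self.nodes = {"/": b""}

    def get_children(self, path):
        prefix = path if path.endswith("/") else path + "/"
        return sorted(
            p[len(prefix):]
            for p in self.nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )

    def get(self, path):
        return self.nodes[path]

    def create(self, path, data):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError("missing parent: " + parent)
        self.nodes[path] = data

    def delete(self, path):
        if self.get_children(path):
            raise ValueError("node has children: " + path)
        del self.nodes[path]


def child_path(parent, name):
    return "/" + name if parent == "/" else parent + "/" + name


def walk(client, path="/"):
    """Yield (path, data) pairs, parents before their children."""
    yield path, client.get(path)
    for name in client.get_children(path):
        yield from walk(client, child_path(path, name))


def export_tree(client):
    """Dump every node except the root into a JSON string."""
    records = [
        {"path": p, "data": base64.b64encode(d).decode("ascii")}
        for p, d in walk(client)
        if p != "/"
    ]
    return json.dumps(records)


def wipe(client, path="/"):
    """Delete everything below the root, children before parents."""
    for name in client.get_children(path):
        wipe(client, child_path(path, name))
    if path != "/":
        client.delete(path)


def import_tree(client, dump):
    """Wipe the namespace, then recreate nodes in the exported order."""
    wipe(client)
    for rec in json.loads(dump):
        client.create(rec["path"], base64.b64decode(rec["data"]))
```

Because the export is written parent-first, the import can recreate nodes in
file order without sorting. Against a real ensemble the same two functions
would need a real client object in place of InMemoryTree, plus handling for
the caveats listed above.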