Zookeeper >> mail # user >> Zookeeper ensemble backup questions?


Re: Zookeeper ensemble backup questions?
I can share this patch based on 3.4.5, which does the trick.

It adds a "snps" 4lw command that accepts one mandatory argument: the
absolute path of the directory where the snapshot file will be dropped.
The "absoluteness" of the path is verified by UNIX rules. Not sure how it
would work on Windows, though. The target directory must exist and be
writable by the effective UID of the Zookeeper server.
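
Before sending the command, a backup script could run the same kind of checks on its own side. The following is only a sketch, assuming the script runs on the same host as the server: check_snapshot_dir is a hypothetical helper, not part of the patch, and os.access tests the permissions of the calling process, which may differ from the effective UID of the Zookeeper server.

```python
import os
import posixpath


def check_snapshot_dir(path):
    """Return a list of problems with `path` as a snapshot target (empty if none).

    Hypothetical client-side pre-check; the server applies its own checks.
    """
    # UNIX rule for absoluteness: the path must start with '/'.
    if not posixpath.isabs(path):
        return ["path must be absolute (start with '/')"]
    problems = []
    if not os.path.isdir(path):
        problems.append("directory does not exist")
    # W_OK | X_OK: creating a file in a directory needs write and search access.
    # Note: this is checked for the current process, not for the server's UID.
    elif not os.access(path, os.W_OK | os.X_OK):
        problems.append("directory is not writable by this process")
    return problems
```

An empty list means the directory looks usable from the script's point of view; the server's response remains the authority.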

If the operation succeeds, the Zookeeper server responds with the
absolute path of the snapshot file. You can watch for the '/' character to
trigger your reaction to the response.
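
For scripted use, the same exchange can be done over a plain socket instead of telnet. This is a minimal sketch, not part of the patch: request_snapshot and interpret_snps_response are hypothetical names, the newline terminator after the command is an assumption about what the patched server accepts, and treating a reply that starts with '/' as success is my reading of the rule above (the error replies also contain '/' further in).

```python
import socket


def interpret_snps_response(response):
    """Classify a reply: ('ok', path) if it starts with '/', else ('error', text)."""
    text = response.strip()
    if text.startswith("/"):
        # On success the reply is the snapshot file's absolute path.
        return ("ok", text.splitlines()[0])
    return ("error", text)


def request_snapshot(host, port, target_dir, timeout=120.0):
    """Send 'snps <target_dir>' and read until the server closes the connection.

    The generous default timeout allows for large snapshots that take a
    while to write out.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Assumption: command and argument on one line, newline-terminated.
        sock.sendall(f"snps {target_dir}\n".encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection, reply is complete
                break
            chunks.append(data)
    reply = b"".join(chunks).decode("utf-8", "replace")
    return interpret_snps_response(reply)
```

A call such as request_snapshot("localhost", 12181, "/tmp/snapshot-test") would mirror the telnet sessions in this message.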

In my case, a 700MB snapshot takes about 30 seconds to write out.

Please see several examples below:

~ $ mkdir /tmp/snapshot-test

~ $ telnet localhost 12181
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
snps /tmp/snapshot-test
/tmp/snapshot-test/snapshot.316c8
Connection to localhost closed by foreign host.

~ $ ls -al /tmp/snapshot-test/snapshot.316c8
-rw-r--r--   1 srvr     srvr     719602373 Jul 19 14:09 /tmp/snapshot-test/snapshot.316c8

~ $ telnet localhost 12181
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
snps blah
Snapshot directory path must be absoulte, i.e., it must start with '/'.
Path "blah" does not meet the criteria.
Connection to localhost closed by foreign host.

~ $ telnet localhost 12181
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
snps /tmp/blah
Error while serializing snapshot into /tmp/blah/snapshot.316c8.
/tmp/blah/snapshot.316c8 (No such file or directory)
Connection to localhost closed by foreign host.

~ $ telnet localhost 12181
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
snps
Snapshot directory path must be absoulte, i.e., it must start with '/'.
Path "" does not meet the criteria.
Connection to localhost closed by foreign host.

~ $
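
Once the server has returned the snapshot path, the file can be compressed before it is archived, since snapshot files tend to compress well. A minimal sketch, assuming gzip is acceptable as the archive format; compress_snapshot is a hypothetical helper and the ".gz" naming is my own choice:

```python
import gzip
import shutil


def compress_snapshot(snapshot_path):
    """Write a gzip copy next to the snapshot and return the archive's path."""
    archive_path = snapshot_path + ".gz"
    # Stream the copy so a multi-hundred-megabyte snapshot is never held in memory.
    with open(snapshot_path, "rb") as src, gzip.open(archive_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return archive_path
```

The original snapshot is left in place; removing it after a verified upload would be part of the disk cleanup step.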
On Fri, Jul 19, 2013 at 1:42 PM, jack ma <[EMAIL PROTECTED]> wrote:

> Thanks Sergei.
>
> That is a great improvement idea for Zookeeper. I think that Zookeeper is
> planning to add a new 4lw command "snap", but it is not ready yet.
>
> My original questions are based on the current version of Zookeeper (3.4.5).
> Do you have any answers for them?
>
> I appreciate the help.
>
> thanks
> Jack
>
>
>
>
> On Fri, Jul 19, 2013 at 11:19 AM, Sergey Maslyakov <[EMAIL PROTECTED]> wrote:
>
> > Jack,
> >
> > Here is how I see the backup process happening.
> >
> > 1. Zookeeper server can be changed to support a new 4lw that will write
> > out the current state of the DataTree into a snapshot file with the path and
> > name provided as an argument to this new command (barring all the
> > permissions, disk space, and other system-level restrictions). Probably, I
> > would ask Zookeeper to save the snapshot in a directory outside of the
> > standard "dataLog" for the sake of cleanliness.
> >
> > 2. When Zookeeper server responds to the new "snapshot" command with a
> > success indication, the requesting process knows that the file has been
> > written out and it can go and process it. It can add some metadata and
> > create an archive to store it somewhere, for example. Alternatively,
> > Zookeeper server could stream the data it would have written into a
> > snapshot as the response to the new "snapshot" command. This way, the
> > client becomes responsible for persistence and this lifts a number of
> > permission-related issues (but raises some other issues too). Oh, and by
> > the way, it looks like snapshot files are rather compressible. I did see
> > a factor of 20 and more on the data that I have.
> >
> > 3. Disk cleanups are performed.
> >
> > With this backup procedure the restore would turn into:
> >
> > 1. Stopping all ensemble members
> >
> > 2. Wiping out dataDir/version-2 and dataLogDir/version-2
> >
> > 3. Restoring the snapshot taken by the above backup procedure on one of the
> > servers into dataDir/version-2
> >
> > 4. Bringing this server online
> >
> > 5. Allowing some time for it to load the snapshot. You could send "isro"