Zookeeper >> mail # user >> Re: Efficient backup and a reasonable restore of an ensemble

Re: Efficient backup and a reasonable restore of an ensemble

It isn't that bad. The deal is that a snapshot takes time to write to disk. During this time, updates to the contents of memory are still allowed. All such updates are logged, however, so if you have the transaction log from the moment before the snapshot starts until some moment after it completes, you can load the snapshot and then replay the log to get a point-in-time snapshot as of the final transaction that you have applied.

This works because all of the logged transactions are idempotent. If they are applied to a part of the snapshot that already recorded their effect, there is no problem.
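
To make the argument concrete, here is a toy sketch in Python. It is not ZooKeeper code: a plain dict stands in for the data tree, the transaction format and all function names are made up for illustration, and the only property it relies on is the one stated above, that each logged transaction is idempotent.

```python
# Toy model only: a dict stands in for the in-memory data tree, and each
# logged transaction is a hypothetical (zxid, op, path, value) tuple.

def apply(tree, txn):
    """Apply one logged transaction. Both ops are idempotent."""
    _zxid, op, path, value = txn
    if op == "set":
        tree[path] = value       # stores an absolute value, not a delta
    elif op == "delete":
        tree.pop(path, None)     # deleting a missing node is a no-op


def fuzzy_snapshot(tree, paths, txns_during_snapshot):
    """Copy the tree one path at a time while updates keep arriving.

    One pending transaction hits the live tree after each path is copied,
    so the copy can mix old and new values.
    """
    pending = list(txns_during_snapshot)
    snap = {}
    for path in paths:
        if path in tree:
            snap[path] = tree[path]
        if pending:
            apply(tree, pending.pop(0))
    for txn in pending:          # anything left over still reaches the live tree
        apply(tree, txn)
    return snap


def restore(snapshot, log):
    """Load the snapshot, then replay the whole log in zxid order."""
    tree = dict(snapshot)
    for txn in sorted(log, key=lambda t: t[0]):
        apply(tree, txn)
    return tree


live = {"/a": 1, "/b": 1, "/c": 1}
log = [(101, "set", "/a", 2), (102, "set", "/c", 2), (103, "delete", "/b", None)]

snap = fuzzy_snapshot(live, ["/a", "/b", "/c"], log)
restored = restore(snap, log)
```

In this run the snapshot holds the old value of /a next to the new value of /c, a combination the live tree never had. Replaying the log over it still ends in the same state as the live tree, and replaying it a second time changes nothing.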

If you want you can even do the replay in a side process after the snapshot is complete so that you don't have to carry around the transaction log.  
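
A backup script or side process first has to decide which log files go with a given snapshot. The sketch below is one conservative way to do that in Python, assuming the usual on-disk naming where a snapshot file is called snapshot.<hex zxid> and a log file is called log.<hex zxid of its first transaction>. The helper names and the sample file names are made up for illustration; check the naming against your own data dir before relying on it.

```python
import re

# Assumption: files are named "snapshot.<hex zxid>" and "log.<hex zxid>",
# where a log's zxid is that of the first transaction it contains.
_NAME = re.compile(r"^(snapshot|log)\.([0-9a-fA-F]+)$")


def parse(names):
    """Split file names into sorted (zxid, name) lists of snapshots and logs."""
    snaps, logs = [], []
    for name in names:
        m = _NAME.match(name)
        if not m:
            continue                      # ignore unrelated files
        zxid = int(m.group(2), 16)
        (snaps if m.group(1) == "snapshot" else logs).append((zxid, name))
    return sorted(snaps), sorted(logs)


def files_to_back_up(names):
    """Return the newest snapshot and every log needed to replay past it."""
    snaps, logs = parse(names)
    if not snaps:
        raise ValueError("no snapshot file found")
    snap_zxid, snap_name = snaps[-1]
    # Conservative rule: start from the last log that began at or before the
    # snapshot's zxid, since it may still have been open when the snapshot
    # started, and keep every later log as well.
    older = [zxid for zxid, _ in logs if zxid <= snap_zxid]
    start = max(older) if older else 0
    needed = [name for zxid, name in logs if zxid >= start]
    return snap_name, needed


# Hypothetical directory listing.
names = ["log.1", "snapshot.0", "log.100000001",
         "snapshot.100000005", "log.100000006", "notes.tmp"]
snap_name, needed_logs = files_to_back_up(names)
```

Including one log too many is harmless for the reason given above: replaying transactions the snapshot already reflects does not change the result.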


On Jul 8, 2013, at 21:42, Sergey Maslyakov <[EMAIL PROTECTED]> wrote:

> Kishore,
> This sounds like a very elaborate tool. I was trying to find a simplistic
> approach but what Thawan said about "fuzzy snapshots" makes me a little
> afraid that there is no simple solution.
> On Mon, Jul 8, 2013 at 11:05 PM, kishore g <[EMAIL PROTECTED]> wrote:
>> Agree, we already have such a tool. In fact we use it to reconstruct the
>> sequence of events that led to a failure and actually restore the system to
>> a previous stable point and replay the events. Unfortunately this is tied
>> closely with Helix but it should be easy to make this a generic tool.
>> Sergey, is this something that would be useful in your case?
>> Thanks,
>> Kishore G
>> On Mon, Jul 8, 2013 at 8:09 PM, Thawan Kooburat <[EMAIL PROTECTED]> wrote:
>>> On the restore part, I think having a separate utility to manipulate the
>>> data/snap dir (by truncating the log/removing snapshot to a given zxid)
>>> would be easier than modifying the server.
>>> --
>>> Thawan Kooburat
>>> On 7/8/13 6:34 PM, "kishore g" <[EMAIL PROTECTED]> wrote:
>>>> I think what we are looking at is point-in-time restore functionality.
>>>> How about adding a feature that says go back to a specific zxid/timestamp?
>>>> This way, before making any change to ZooKeeper, simply note down the
>>>> timestamp/zxid on the leader. If things go wrong after making changes, bring
>>>> down the ZooKeeper servers and provide an additional zxid/timestamp parameter
>>>> while restarting. The server can go to that exact point and make it current.
>>>> The followers can be started blank.
>>>> On Mon, Jul 8, 2013 at 5:53 PM, Thawan Kooburat <[EMAIL PROTECTED]> wrote:
>>>>> Just saw that this is the corresponding use case to the question posted
>>>>> in the dev list.
>>>>> In order to restore the data to a given point in time correctly, you need
>>>>> both the snapshot and the txnlog. This is because a ZooKeeper snapshot is
>>>>> fuzzy, and a snapshot alone may not represent a valid state of the server
>>>>> if there are in-flight requests.
>>>>> The 4lw command should cause the server to roll the log and take a
>>>>> snapshot, similar to the periodic snapshotting operation. Your backup
>>>>> script needs to grab the snapshot and the corresponding txnlog file from
>>>>> the data dir.
>>>>> To restore, just shut down all hosts, clear the data dir, copy over the
>>>>> snapshot and txnlog, and restart them.
>>>>> --
>>>>> Thawan Kooburat
>>>>> On 7/8/13 3:28 PM, "Sergey Maslyakov" <[EMAIL PROTECTED]> wrote:
>>>>>> Thank you for your response, Flavio. I apologize, I did not provide a
>>>>>> clear explanation of the use case.
>>>>>> This backup/restore is not intended to be tied to any write event;
>>>>>> instead, it is expected to run as a periodic (daily?) cron job on one of
>>>>>> the servers, which is not guaranteed to be the leader of the ensemble.
>>>>>> There is no expectation that all recent changes are committed and
>>>>>> persisted to disk.
>>>>>> The system can sustain the loss of several hours worth of recent