Zookeeper, mail # user - Backups


Re: Backups
Ted Dunning 2012-01-19, 19:41
Flavio,

Take as a use case the one where I am keeping configuration files in ZK.
These will be manually installed and thus subject to manual error.

Backups would be invaluable.
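
As a rough illustration of the snapshot-plus-log style of backup discussed in the quoted messages below, here is a minimal sketch of choosing which files to copy. It assumes the usual on-disk layout, where the data directory's version-2 folder holds files named snapshot.<hex zxid> and log.<hex zxid>; the function name files_to_back_up is hypothetical, and this is not Jordan's implementation.

```python
import re

# Assumed naming convention: "snapshot.<hex zxid>" and "log.<hex zxid>".
_NAME = re.compile(r"^(snapshot|log)\.([0-9a-fA-F]+)$")


def files_to_back_up(names):
    """Pick the newest snapshot plus the logs needed to replay past it.

    `names` is an iterable of file names from a version-2 directory.
    Hypothetical helper for illustration only.
    """
    snaps, logs = [], []
    for name in names:
        m = _NAME.match(name)
        if not m:
            continue  # ignore anything that is not a snapshot or a log
        zxid = int(m.group(2), 16)
        (snaps if m.group(1) == "snapshot" else logs).append((zxid, name))
    if not snaps:
        return []
    snap_zxid, snap_name = max(snaps)
    logs.sort()
    # The last log that starts at or before the snapshot may still hold
    # transactions newer than the snapshot, so keep it as well.
    older = [entry for entry in logs if entry[0] <= snap_zxid]
    newer = [entry for entry in logs if entry[0] > snap_zxid]
    keep = older[-1:] + newer
    return [snap_name] + [name for _, name in keep]
```

The selected files could then be copied elsewhere with something like shutil.copy2. A copy taken from a live server can miss the most recent transactions, which is the durability trade-off raised in the thread.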

On Thu, Jan 19, 2012 at 6:39 PM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:

> Hi Ted, Znodes for leader election, group membership, etc, can all be
> recreated, so why should I back them up instead of recreating the znodes?
> In fact, one might bring back a previous snapshot of the system that
> reflects an incorrect system state.
>
> In the case that one stores data that can't be recovered by other means, I
> understand the need, but then we have the durability problem that I
> mentioned and that you apparently agreed with. Also, ZooKeeper is a replicated
> service, so why can't you simply rely upon the replication strategy that
> ZooKeeper provides to you already? Again, I'm trying to understand the use
> cases here.
>
> Thanks,
> -Flavio
>
> On Jan 19, 2012, at 7:11 PM, Ted Dunning wrote:
>
>> A backup can still be useful.  It is a common property that a database
>> backup is known to be slightly out of date.
>>
>> Such a backup can still be very useful.  In many systems, the most common
>> cause of error is simple human intervention.  This especially applies to
>> file systems and databases, but can still apply to ZK if an admin
>> carelessly tries to clean up part of the namespace and accidentally cleans
>> up all of it.  This should be much less common with ZK because manual
>> adjustments are so much less a part of standard operation, but they can
>> still occur.  In these cases, an out-of-date backup may be enormously
>> valuable.
>>
>> If somebody wants a precise backup from a particular moment in time, the
>> best option is to use the snapshot capabilities exposed by various file
>> systems.  Traditional NAS vendors all support this.  At a lower cost and
>> complexity point, you can get this from MapR clusters exposed as NFS or by
>> a ZFS file system.  This option also allows you to keep multiple snapshots
>> from points in the past.
>>
>>
>> What Jordan is doing would allow backups without special storage devices
>> and, with good backup of the log, would allow nearly current recovery in
>> the event of catastrophic loss.  Yes, this loses some durability, but it is
>> still very desirable.
>>
>> On Thu, Jan 19, 2012 at 11:07 AM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
>>
>>> Since you started this thread, I've been thinking about the idea of
>>> backing up, and I'm not sure I understand the motivation and if it is ok to
>>> violate safety properties.
>>>
>>> Given that ZooKeeper is used for coordination, I would think that in many
>>> cases all its state can be reconstructed in an algorithmic manner. Perhaps
>>> the use case for a backup would be the one in which it is being used as a
>>> database, for example, to keep the metadata of a file system. Periodic
>>> backups or even keeping an observer, however, won't guarantee that if you
>>> bring the system up using that backup you'll have all committed operations.
>>> The state of the leader reflects all committed operations, but one needs to
>>> have the latest state of the transaction log to not miss an update.
>>>
>>> But, it is true that I'm assuming that you can't miss updates. If you can
>>> miss updates, then that's a different story. By missing updates we'll be
>>> violating durability, which is a property that ZooKeeper is supposed to
>>> provide, so I'm trying to understand in which cases violating durability
>>> would be acceptable. If it is not acceptable and you still want to have a
>>> backup, then I don't see a way other than shutting down the clients before
>>> you take a backup, which doesn't seem to be what is being proposed here.
>>>
>>> -Flavio
>>>
>>>
>>> On Jan 18, 2012, at 1:38 AM, Jordan Zimmerman wrote:
>>>
>>>> Neha - can you send me your email address? Send it to:
>>>>
>>>> [EMAIL PROTECTED]
>>>>
>>>> On 1/17/12 10:10 AM, "Neha Narkhede" <[EMAIL PROTECTED]> wrote: