Zookeeper >> mail # user >> Backups


Jordan Zimmerman 2012-01-13, 22:24
Camille Fournier 2012-01-13, 22:28
Jordan Zimmerman 2012-01-13, 22:29
Patrick Hunt 2012-01-17, 01:39
Jordan Zimmerman 2012-01-17, 01:42
Camille Fournier 2012-01-17, 01:49
Neha Narkhede 2012-01-17, 18:10
Jordan Zimmerman 2012-01-17, 19:06
Jordan Zimmerman 2012-01-18, 00:38
Flavio Junqueira 2012-01-19, 11:07
Jordan Zimmerman 2012-01-19, 17:32
Ted Dunning 2012-01-19, 18:11
Jordan Zimmerman 2012-01-19, 18:16
Ted Dunning 2012-01-19, 18:23
Patrick Hunt 2012-01-19, 18:24
Ted Dunning 2012-01-19, 19:40
Flavio Junqueira 2012-01-19, 18:39
Jordan Zimmerman 2012-01-19, 19:07
Flavio Junqueira 2012-01-19, 19:30
Jordan Zimmerman 2012-01-19, 19:32
Ted Dunning 2012-01-19, 19:42
kishore g 2012-01-20, 07:42
Patrick Hunt 2012-01-20, 17:01

Flavio,

Take as a use case the one where I am keeping configuration files in ZK.
These will be manually installed and thus subject to manual error.

Backups would be invaluable.
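
To make the configuration-file case concrete, here is a minimal sketch of a logical dump of a znode subtree. It is only an illustration under stated assumptions, not what Jordan is building: `dump_tree` and `FakeClient` are hypothetical names, and the client is assumed to expose `get(path)` returning a `(data, stat)` pair and `get_children(path)` returning child names, which is the shape of kazoo's `KazooClient`. The walk is not atomic, so the result is the kind of slightly out-of-date copy discussed in this thread, not a consistent snapshot.

```python
import base64
import json


def dump_tree(client, root="/"):
    """Walk the znode tree under `root` and return {path: base64(data)}.

    `client` is any object with get(path) -> (data, stat) and
    get_children(path) -> list of child names (an assumption).
    """
    backup = {}
    stack = [root]
    while stack:
        path = stack.pop()
        data, _stat = client.get(path)
        # Treat a missing payload as empty bytes so the dump is JSON-safe.
        backup[path] = base64.b64encode(data or b"").decode("ascii")
        for child in client.get_children(path):
            stack.append(path.rstrip("/") + "/" + child)
    return backup


class FakeClient:
    """In-memory stand-in so the sketch runs without a ZooKeeper server."""

    def __init__(self, nodes):
        self.nodes = nodes

    def get(self, path):
        return self.nodes[path], None

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        children = []
        for p in self.nodes:
            rest = p[len(prefix):]
            # Keep only direct children: non-empty remainder, no further "/".
            if p.startswith(prefix) and rest and "/" not in rest:
                children.append(rest)
        return sorted(children)


if __name__ == "__main__":
    client = FakeClient({
        "/": b"",
        "/config": b"",
        "/config/app.properties": b"timeout=30\n",
    })
    print(json.dumps(dump_tree(client, "/config"), indent=2, sort_keys=True))
```

Writing the resulting dictionary to a dated file on a schedule would give the kind of "oops, I deleted the wrong subtree" recovery point that matters for manually edited configuration.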

On Thu, Jan 19, 2012 at 6:39 PM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:

> Hi Ted, Znodes for leader election, group membership, etc, can all be
> recreated, so why should I back them up instead of recreating the znodes?
> In fact, one might bring back a previous snapshot of the system that
> reflects an incorrect system state.
>
> In the case that one stores data that can't be recovered by other means, I
> understand the need, but then we have the durability problem that I
> mentioned and you apparently agreed. Also, ZooKeeper is a replicated
> service, so why can't you simply rely upon the replication strategy that
> ZooKeeper provides to you already? Again, I'm trying to understand the use
> cases here.
>
> Thanks,
> -Flavio
>
> On Jan 19, 2012, at 7:11 PM, Ted Dunning wrote:
>
>> A backup can still be useful.  It is a common property that a database
>> backup is known to be slightly out of date.
>>
>> Such a backup can still be very useful.  In many systems, the most common
>> cause of error is simple human intervention.  This especially applies to
>> file systems and databases, but can still apply to ZK if an admin
>> carelessly tries to clean up part of the namespace and accidentally cleans
>> up all of it.  This should be much less common with ZK because manual
>> adjustments are so much less a part of standard operation, but they can
>> still occur.  In these cases, an out-of-date backup may be enormously
>> valuable.
>>
>> If somebody wants a precise backup from a particular moment in time, the
>> best option is to use the snapshot capabilities exposed by various file
>> systems.  Traditional NAS vendors all support this.  At a lower cost and
>> complexity point, you can get this from MapR clusters exposed as NFS or by
>> a ZFS file system.  This option also allows you to keep multiple snapshots
>> from points in the past.
>>
>>
>> What Jordan is doing would allow backups without special storage devices
>> and, with good backup of the log, would allow nearly current recovery in
>> the event of catastrophic loss.  Yes, this loses some durability, but it is
>> still very desirable.
>>
>> On Thu, Jan 19, 2012 at 11:07 AM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
>>
>>> Since you started this thread, I've been thinking about the idea of
>>> backing up, and I'm not sure I understand the motivation and if it is ok to
>>> violate safety properties.
>>>
>>> Given that ZooKeeper is used for coordination, I would think that in many
>>> cases all its state can be reconstructed in an algorithmic manner. Perhaps
>>> the use case for a backup would be the one in which it is being used as a
>>> database, for example, to keep the metadata of a file system. Periodic
>>> backups or even keeping an observer, however, won't guarantee that if you
>>> bring the system up using that backup you'll have all committed operations.
>>> The state of the leader reflects all committed operations, but one needs to
>>> have the latest state of the transaction log to not miss an update.
>>>
>>> But, it is true that I'm assuming that you can't miss updates. If you can
>>> miss updates, then that's a different story. By missing updates we'll be
>>> violating durability, which is a property that ZooKeeper is supposed to
>>> provide, so I'm trying to understand in which cases violating durability
>>> would be acceptable. If it is not acceptable and you still want to have a
>>> backup, then I don't see a way other than shutting down the clients before
>>> you take a backup, which doesn't seem to be what is being proposed here.
>>>
>>> -Flavio
>>>
>>>
>>> On Jan 18, 2012, at 1:38 AM, Jordan Zimmerman wrote:
>>>
>>>> Neha - can you send me your email address. Send it to:
>>>>
>>>> [EMAIL PROTECTED]
>>>>
>>>> On 1/17/12 10:10 AM, "Neha Narkhede" <[EMAIL PROTECTED]> wrote: