Zookeeper, mail # dev - Storing znode on disks


Abhishek .E.S 2013-03-14, 16:49
Edward Ribeiro 2013-03-14, 17:52
Abhishek .E.S 2013-03-14, 18:13
Thawan Kooburat 2013-03-14, 20:09
RE: Storing znode on disks
Rakesh R 2013-03-15, 05:35
Hi Abhishek,

Could you describe the data set and the use case in more detail?

ZooKeeper is designed to manage coordination data, not to be a general-purpose database or large object store. Coordination data is usually small, measured in kilobytes. If your data is very large, I suggest either splitting it across multiple znodes (though this can cause lots of problems with watches and atomicity) or storing it in HDFS/NFS instead.
But it depends on your use case and requirements.

The ZooKeeper client and server implementations have sanity checks to ensure that znode data stays small. The maximum znode data size can be configured with 'jute.maxbuffer'; the default is 1 MB.
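
As a rough illustration of the splitting approach, here is a minimal Python sketch that cuts a payload into chunks that each stay under the size limit. The helper names `split_for_znodes` and `reassemble`, the `part-NNNNNN` naming scheme, and the 512 KiB chunk size are assumptions made up for this example; none of them is part of ZooKeeper. The constant 0xfffff is the default the ZooKeeper documentation gives for jute.maxbuffer. The code only prepares the chunks and does not talk to a server.

```python
# Default jute.maxbuffer per the ZooKeeper docs: just under 1 MB.
DEFAULT_JUTE_MAXBUFFER = 0xFFFFF


def split_for_znodes(parent, payload, chunk_size=512 * 1024):
    """Return (path, chunk) pairs, one per hypothetical child znode.

    chunk_size is kept well below the limit because the limit applies to
    the whole serialized request, not only to the data bytes.
    """
    if chunk_size <= 0 or chunk_size >= DEFAULT_JUTE_MAXBUFFER:
        raise ValueError("chunk_size must be positive and below the limit")
    chunks = []
    for index, start in enumerate(range(0, len(payload), chunk_size)):
        # Zero-padded index so the paths sort in write order.
        path = f"{parent}/part-{index:06d}"
        chunks.append((path, payload[start:start + chunk_size]))
    return chunks


def reassemble(chunks):
    """Join chunks back together, ordering them by path."""
    ordered = sorted(chunks, key=lambda pair: pair[0])
    return b"".join(chunk for _, chunk in ordered)


payload = bytes(range(256)) * 5000  # 1,280,000 bytes, over the default limit
parts = split_for_znodes("/app/blob", payload)
print(len(parts), [len(chunk) for _, chunk in parts])
```

Each chunk would be written as a separate znode, so a reader can see only some of the parts while a write is in progress. That is the atomicity problem mentioned above.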

-Rakesh
________________________________________
From: Thawan Kooburat [[EMAIL PROTECTED]]
Sent: Friday, March 15, 2013 1:39 AM
To: [EMAIL PROTECTED]
Subject: Re: Storing znode on disks

This depends on the data size and availability requirements of your use
case.

In principle, the size of RAM limits the total data size for ZooKeeper.
However, if you store several gigabytes of data in ZooKeeper, the server
load time will be quite long (minutes), depending on your disk bandwidth.
When there is a leader election, every server needs to reload the data
from disk into memory, so the quorum is considered unavailable during
this period.
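
To get a feel for the load time, a back-of-the-envelope estimate is data size divided by disk bandwidth. The sketch below assumes 8 GiB of data and 100 MiB/s of sequential read bandwidth; both numbers and the function name `estimate_load_seconds` are made up for illustration. It ignores deserialization and transaction log replay, so treat the result as a lower bound.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def estimate_load_seconds(data_bytes, disk_bytes_per_second):
    """Lower bound on the time to read data_bytes from disk, in seconds."""
    if disk_bytes_per_second <= 0:
        raise ValueError("disk bandwidth must be positive")
    return data_bytes / disk_bytes_per_second


# Assumed figures: 8 GiB of data, 100 MiB/s disk bandwidth.
seconds = estimate_load_seconds(8 * GIB, 100 * MIB)
print(f"about {seconds:.0f} s per server")  # roughly 82 seconds
```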

--
Thawan Kooburat

On 3/14/13 11:13 AM, "Abhishek .E.S" <[EMAIL PROTECTED]> wrote:

>Could I build a large-scale data store using ZooKeeper, though?
>
>On Thu, Mar 14, 2013 at 12:52 PM, Edward Ribeiro
><[EMAIL PROTECTED]> wrote:
>
>> >> For me, latency is acceptable but I require the znodes to be on disk.
>>
>> Why would you need to do that?
>>
>> ZooKeeper stores the DataTree in memory, but it takes periodic snapshots
>> to disk and also syncs a commit log to disk, so that a node can
>> recover in case of failures. If you are asking to store znodes *only* on
>> disk, then the answer is no (afaik!).
>>
>> Last but not least, you should be aware that znodes are not intended to
>> store large quantities of data. ZooKeeper is not meant to be a database,
>> but a coordination system.
>>
>> Edward
>>
>> On Thu, Mar 14, 2013 at 1:49 PM, Abhishek .E.S <[EMAIL PROTECTED]> wrote:
>>
>> > Hi,
>> >
>> > I am new to ZooKeeper.
>> > I have a question. ZooKeeper keeps znodes in memory to optimize data
>> > access.
>> > I am working on an experiment for which I intend to use ZooKeeper.
>> > For me, latency is acceptable, but I require the znodes to be on disk.
>> >
>> > Can this be achieved?
>> > If so, could someone please give me some pointers?
>> >
>> > Thanks and Regards,
>> > Abhishek
>> >
>>
>>
>>
>> --
>> *"Killing a lion a day is easy. The hard part is dodging the tapirs.",
>>anonymous*
>>
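
The recovery scheme Edward describes in the quoted message (data held in memory, periodic snapshots, and a commit log synced to disk) can be sketched with a toy key-value store. This is only an illustration of the general snapshot-plus-log idea: the class `ToyStore`, the JSON file formats, and the file names are assumptions for this example and are not ZooKeeper's actual implementation or on-disk format.

```python
import json
import os
import tempfile


class ToyStore:
    """In-memory dict backed by a snapshot file and an append-only log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "txn.log")
        self.tree = {}
        self._recover()

    def _recover(self):
        # Load the last snapshot, then replay the writes logged after it.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.tree = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    path, data = json.loads(line)
                    self.tree[path] = data

    def set(self, path, data):
        # Log the write and force it to disk before applying it in memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps([path, data]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.tree[path] = data

    def snapshot(self):
        # Write the full tree to a temp file, swap it in, then clear the log.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.tree, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)
        open(self.log_path, "w").close()


with tempfile.TemporaryDirectory() as directory:
    store = ToyStore(directory)
    store.set("/config/a", "1")
    store.snapshot()
    store.set("/config/b", "2")      # only in the log, not in the snapshot
    recovered = ToyStore(directory)  # simulates a restart
    print(recovered.tree)
```

Reads are served from `tree` in memory. The disk copy exists only so that a restart can rebuild the same state, which is why the data set still has to fit in RAM.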
Edward Ribeiro 2013-03-15, 22:11
Abhishek .E.S 2013-03-16, 23:44