Zookeeper, mail # user - Optimal ZooKeeper configuration for a cluster of size, N

mallikharjun.vemana@... 2012-10-04, 09:41
Patrick Hunt 2012-10-06, 14:28
RE: Optimal ZooKeeper configuration for a cluster of size, N
mallikharjun.vemana@... 2012-10-09, 06:02
Thanks for the response.

Session timeout is our major concern. I understand that we can set a higher session timeout to be on the safer side. But in our environment, with a bigger session timeout our system wastes time in retries before giving up the 'task'. As a result, we end up syncing a huge amount of data (stored in our system, not in ZooKeeper) when the system reconnects and resumes its 'task'. Hence we want to keep the session timeout as small as possible to minimize our syncing overhead. In other words, we would like the timeout to be just long enough to retry a majority of the servers (or maybe a little more as a buffer). I would like to know this value for 3- and 5-server ZooKeeper deployments.
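
As a rough illustration of the constraints on a small session timeout, the sketch below computes the majority size for an N-server ensemble and clamps a requested session timeout to the server-side bounds. It assumes the documented defaults (minSessionTimeout of 2 ticks, maxSessionTimeout of 20 ticks) and a tickTime of 2000 ms; the function names are hypothetical and are not part of any ZooKeeper API. It does not estimate how long client retries actually take.

```python
def quorum_size(n_servers):
    """Smallest majority of an ensemble with n_servers members."""
    return n_servers // 2 + 1


def session_timeout_bounds_ms(tick_time_ms, min_ticks=2, max_ticks=20):
    """Server-side session timeout bounds, assuming the default
    minSessionTimeout (2 ticks) and maxSessionTimeout (20 ticks)."""
    return min_ticks * tick_time_ms, max_ticks * tick_time_ms


def negotiated_timeout_ms(requested_ms, tick_time_ms):
    """Clamp the timeout a client requests into the server's bounds."""
    low, high = session_timeout_bounds_ms(tick_time_ms)
    return max(low, min(requested_ms, high))


# tickTime=2000 is an assumed value for illustration.
for n in (3, 5):
    print(n, "servers -> majority of", quorum_size(n))
print("requested 1000 ms ->", negotiated_timeout_ms(1000, 2000), "ms")
```

Under these assumptions, a request below two ticks is raised to the minimum, so lowering the session timeout further would also require a smaller tickTime or an explicit minSessionTimeout.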

Thanks in advance.

-----Original Message-----
From: Patrick Hunt [mailto:[EMAIL PROTECTED]]
Sent: Saturday, October 06, 2012 7:59 PM
Subject: Re: Optimal ZooKeeper configuration for a cluster of size, N

The defaults should be fine for what you described. The initLimit would need to be increased if the data being stored is very large, the syncLimit increased if you were running across, say, a WAN link (between servers), and leaderServes is only really useful if you have a large number of writes/clients and you want the leader to focus on coordination (rather than that plus serving client requests).
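
Since initLimit and syncLimit are expressed in ticks, it can help to see what they mean in wall-clock time. The sketch below is only an illustration: the values tickTime=2000, initLimit=10 and syncLimit=5 are assumed example settings, not figures from this thread, and the helper name is hypothetical.

```python
def limits_in_ms(tick_time_ms, init_limit_ticks, sync_limit_ticks):
    """Convert tick-based zoo.cfg limits into milliseconds."""
    return {
        # Time followers are allowed for connecting and syncing to the leader.
        "initLimit_ms": tick_time_ms * init_limit_ticks,
        # How far a follower may fall behind the leader before it is dropped.
        "syncLimit_ms": tick_time_ms * sync_limit_ticks,
    }


# Assumed example values, for illustration only.
example = limits_in_ms(tick_time_ms=2000, init_limit_ticks=10, sync_limit_ticks=5)
print(example)
```

Neither limit is a function of the number of servers; per the advice above, they scale with data size and inter-server latency.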


On Thu, Oct 4, 2012 at 2:41 AM,  <[EMAIL PROTECTED]> wrote:
> We are using ZooKeeper 3.4.4 for our deployment. I would like to know the optimum settings for various configuration parameters, viz. initLimit, syncLimit, leaderServes, etc. As of now we are using the default values for these parameters. But I would like to understand how each one of them depends on the size of the ZooKeeper cluster.
> Say I have N servers in the cluster, I would like to know how each of these values can be configured (in terms of N) for optimum results. Also, I would like to understand how session timeout would affect the performance of the cluster.
> In our deployment,
> - we are basically using the zookeeper service to detect failures in our system.
> - we are storing very minimal data in the cluster.
> - we are planning to have 3 or 5 servers in the cluster.
> Can anyone please throw some light on these config settings? I did go through the admin guide. But, I would like to understand each one of them from the perspective of zookeeper cluster size. Please help.
> Thanks,
> Arjun.