Zookeeper >> mail # user >> Use cases for ZooKeeper


Re: Use cases for ZooKeeper
Sure - give me what you have and I'll port it to Curator.

On 1/12/12 6:18 PM, "Ted Dunning" <[EMAIL PROTECTED]> wrote:

>I think I have a bit of it written already.
>
>It doesn't use Curator and I think you could simplify it substantially if
>you were to use it.  Would that help?
>
>On Thu, Jan 12, 2012 at 12:52 PM, Jordan Zimmerman
><[EMAIL PROTECTED]>wrote:
>
>> Ted - are you interested in writing this on top of Curator? If not, I'll
>> give it a whack.
>>
>> -JZ
>>
>> On 1/5/12 12:50 AM, "Ted Dunning" <[EMAIL PROTECTED]> wrote:
>>
>> >Jordan, I don't think that leader election does what Josh wants.
>> >
>> >I don't think that consistent hashing is particularly good for that
>>either
>> >because the loss of one node causes the sequential state for lots of
>> >entities to move even among nodes that did not fail.
>> >
>> >What I would recommend is a variant of micro-sharding.  The key space
>>is
>> >divided into many micro-shards.  Then nodes that are alive claim the
>> >micro-shards using ephemerals and proceed as Josh described.  On loss
>>of a
>> >node, the shards that node was handling should be claimed by the
>>remaining
>> >nodes.  When a new node appears or new work appears, it is helpful to
>> >direct nodes to effect a hand-off of traffic.
>> >
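[A minimal sketch of the claim step Ted describes. In real ZooKeeper each claim would be a create() of an EPHEMERAL znode under a shards path, failing if it already exists; here a ConcurrentHashMap's putIfAbsent stands in for that create-or-fail semantic so the assignment logic can be shown in isolation. All class and method names are illustrative, not from any actual codebase:]

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for ephemeral-znode shard claims.
public class ShardClaims {
    private final ConcurrentHashMap<Integer, String> claims = new ConcurrentHashMap<>();
    private final int numShards;

    public ShardClaims(int numShards) { this.numShards = numShards; }

    // A live node tries to claim every unowned micro-shard; putIfAbsent
    // mirrors ZooKeeper's create-ephemeral-or-fail semantics.
    public Set<Integer> claimAll(String node) {
        Set<Integer> won = new TreeSet<>();
        for (int shard = 0; shard < numShards; shard++) {
            if (claims.putIfAbsent(shard, node) == null) won.add(shard);
        }
        return won;
    }

    // When a node's session dies its ephemerals vanish; survivors then
    // re-claim only the orphaned shards, leaving other claims untouched.
    public void nodeDied(String node) {
        claims.values().removeIf(owner -> owner.equals(node));
    }

    public String ownerOf(int shard) { return claims.get(shard); }
}
```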
>> >In my experience, the best way to implement shard balancing is with an
>> >external master instance, much in the style of HBase or Katta.  This
>> >external master can be exceedingly simple and only needs to wake up on
>> >various events like loss of a node or change in the set of live shards.
>> >It
>> >can also wake up at intervals if desired to backstop the normal
>> >notifications or to allow small changes for certain kinds of balancing.
>> > Typically, this only requires a few hundred lines of code.
>> >
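[The master Ted describes really can be this small. A hypothetical in-memory sketch of its event handler, where "loss of a node" would in practice be a watch firing on that node's ephemeral znode: on node loss, only the orphaned shards are reassigned (to the least-loaded survivors), so assignments on surviving nodes are never disturbed. All names are illustrative:]

```java
import java.util.*;

// Sketch of a tiny external master holding the shard->node assignment.
public class ShardMaster {
    private final Map<Integer, String> assignment = new HashMap<>();

    // Initial assignment: round-robin shards over the live nodes.
    public void assignAll(int numShards, List<String> nodes) {
        for (int s = 0; s < numShards; s++)
            assignment.put(s, nodes.get(s % nodes.size()));
    }

    // Wake-up event: a node is gone. Move only its shards.
    public void onNodeLost(String dead, List<String> survivors) {
        for (Map.Entry<Integer, String> e : assignment.entrySet()) {
            if (e.getValue().equals(dead))
                e.setValue(leastLoaded(survivors));
        }
    }

    private String leastLoaded(List<String> nodes) {
        String best = nodes.get(0);
        long bestLoad = Long.MAX_VALUE;
        for (String n : nodes) {
            long load = assignment.values().stream().filter(n::equals).count();
            if (load < bestLoad) { bestLoad = load; best = n; }
        }
        return best;
    }

    public String ownerOf(int shard) { return assignment.get(shard); }
}
```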
>> >This external master can, of course, be run on multiple nodes and which
>> >master is in current control can be adjudicated with yet another leader
>> >election.
>> >
>> >You can view this as a package of many leader elections.  Or as
>> >discretized
>> >consistent hashing.  The distinctions are a bit subtle but are very
>> >important.  These include:
>> >
>> >- there is a clean division of control between the master which
>>determines
>> >who serves what and the nodes that do the serving
>> >
>> >- there is no herd effect because the master drives the assignments
>> >
>> >- node loss causes the minimum amount of change of assignments since no
>> >assignments to surviving nodes are disturbed.  This is a major win.
>> >
>> >- balancing is pretty good because there are many shards compared to
>>the
>> >number of nodes.
>> >
>> >- the balancing strategy is highly pluggable.
>> >
>> >This pattern would make a nice addition to Curator, actually.  It
>>comes up
>> >repeatedly in different contexts.
>> >
>> >On Thu, Jan 5, 2012 at 12:11 AM, Jordan Zimmerman
>> ><[EMAIL PROTECTED]>wrote:
>> >
>> >> OK - so this is two options for doing the same thing. You use a
>>Leader
>> >> Election algorithm to make sure that only one node in the cluster is
>> >> operating on a work unit. Curator has an implementation (it's really
>> >>just
>> >> a distributed lock with a slightly different API).
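[For reference, the recipe behind such a leader election (the one Curator wraps) is: each candidate creates an EPHEMERAL_SEQUENTIAL znode under an election path, and whoever holds the lowest sequence number is leader; when the leader's session dies, leadership passes to the next-lowest. A hedged sketch with the sequence counter and "znodes" simulated in memory, illustrative names only:]

```java
import java.util.*;

// In-memory simulation of the lowest-sequence-wins election recipe.
public class Election {
    private long nextSeq = 0;
    private final SortedMap<Long, String> candidates = new TreeMap<>();

    // Corresponds to create("/election/n_", ..., EPHEMERAL_SEQUENTIAL).
    public long enter(String node) {
        long seq = nextSeq++;
        candidates.put(seq, node);
        return seq;
    }

    // Session expiry removes the ephemeral; if it was the leader's,
    // leadership silently moves to the next-lowest sequence.
    public void leave(long seq) { candidates.remove(seq); }

    public String leader() {
        return candidates.isEmpty() ? null : candidates.get(candidates.firstKey());
    }
}
```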
>> >>
>> >> -JZ
>> >>
>> >> On 1/5/12 12:04 AM, "Josh Stone" <[EMAIL PROTECTED]> wrote:
>> >>
>> >> >Thanks for the response. Comments below:
>> >> >
>> >> >On Wed, Jan 4, 2012 at 10:46 PM, Jordan Zimmerman
>> >> ><[EMAIL PROTECTED]>wrote:
>> >> >
>> >> >> Hi Josh,
>> >> >>
>> >> >> >Second use case: Distributed locking
>> >> >> This is one of the most common uses of ZooKeeper. There are many
>> >> >> implementations - one included with the ZK distro. Also, there is
>> >> >>Curator:
>> >> >> https://github.com/Netflix/curator
>> >> >>
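[The lock recipe in question works like the election above: each contender creates an EPHEMERAL_SEQUENTIAL znode under the lock path, holds the lock when it owns the lowest sequence, and otherwise watches only its immediate predecessor so a release wakes one waiter rather than the whole herd. An in-memory sketch of that ordering logic, names illustrative:]

```java
import java.util.*;

// Simulation of the ZooKeeper distributed-lock recipe's queue ordering.
public class ZkStyleLock {
    private long nextSeq = 0;
    private final TreeMap<Long, String> waiters = new TreeMap<>();

    // Corresponds to create("/lock/n_", ..., EPHEMERAL_SEQUENTIAL).
    public long contend(String client) {
        long seq = nextSeq++;
        waiters.put(seq, client);
        return seq;
    }

    // The lowest outstanding sequence holds the lock.
    public boolean holdsLock(long seq) {
        return !waiters.isEmpty() && waiters.firstKey() == seq;
    }

    // What a waiter should watch: its immediate predecessor, not the
    // holder, so only one contender wakes per release (no herd effect).
    public Long watchTarget(long seq) {
        SortedMap<Long, String> lower = waiters.headMap(seq);
        return lower.isEmpty() ? null : lower.lastKey();
    }

    public void release(long seq) { waiters.remove(seq); }
}
```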
>> >> >> >First use case: Distributing work to a cluster of nodes
>> >> >> This sounds feasible. If you give more details I and others on
>>this
>> >>list
>> >> >> can help more.
>> >> >>
>> >> >
>> >> >Sure. I basically want to handle race conditions where two commands