Kafka user mailing list: Strategies for auto generating broker ID


Re: Strategies for auto generating broker ID
The one concern with this metadata approach is that it seems like a
pretty low-level thing to have to manage. If I have a broker failure and
I want to bring in a new node to replace it, what is the procedure for
having the new broker assume the identity of the previously failed one?
Manually setting it in the config for the broker is perhaps error prone,
but it is pretty straightforward. I'm not sure it's cleaner to have to
manually edit replicated meta files in the data log dirs for the broker.
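
For concreteness, a minimal sketch in Java of the meta-file approach under
discussion. The file name meta.properties, the broker.id key, and the
random-int generation are illustrative assumptions, not a settled format
(the actual format would be decided in KAFKA-1070):

import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;
import java.util.concurrent.ThreadLocalRandom;

// Sketch: persist a generated broker id in a meta file under the data log
// dir and reuse it on restart. Replacing a failed broker would then mean
// copying this one file (or the whole log dir) to the new node.
public class BrokerIdMetaFile {
    private static final String META_FILE = "meta.properties"; // hypothetical name

    public static int getOrCreateBrokerId(Path logDir) throws Exception {
        Path metaPath = logDir.resolve(META_FILE);
        Properties props = new Properties();
        if (Files.exists(metaPath)) {
            try (InputStream in = Files.newInputStream(metaPath)) {
                props.load(in);
            }
            return Integer.parseInt(props.getProperty("broker.id"));
        }
        // First start: generate an id and persist it so the broker keeps
        // the same identity across restarts.
        int id = ThreadLocalRandom.current().nextInt(Integer.MAX_VALUE);
        props.setProperty("broker.id", Integer.toString(id));
        try (OutputStream out = Files.newOutputStream(metaPath)) {
            props.store(out, "broker metadata");
        }
        return id;
    }
}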
On Wed, Oct 2, 2013 at 2:20 PM, Sriram Subramanian <
[EMAIL PROTECTED]> wrote:

> Jason - You should be able to solve that with Jay's proposal below. If you
> just persist the id in a meta file, you can copy the meta file over to the
> new broker, and the broker will not regenerate another id.
>
> On 10/2/13 11:10 AM, "Jason Rosenberg" <[EMAIL PROTECTED]> wrote:
>
> >I recently moved away from generating a unique brokerId for each node, in
> >favor of assigning ids in configuration. The reason for this is that in
> >0.8 there isn't a convenient way yet to reassign partitions to a new
> >brokerId, should one broker have a failure. So it seems the only
> >work-around at the moment is to bring up a replacement broker and assign
> >it the same brokerId as the one that has failed and is no longer running.
> >The cluster will then automatically replicate to the new broker all the
> >partitions that were assigned to the failed broker.
> >
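For reference, a minimal illustration of that fixed-assignment work-around;
broker.id, log.dirs, and zookeeper.connect are real 0.8 broker settings, but
the values here are made up:

# server.properties on the replacement node
broker.id=3          # must match the id of the failed broker
log.dirs=/data/kafka-logs
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
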
> >This appears to be the only operational way to deal with failed brokers
> >at the moment.
> >
> >Longer term, it would be great if the cluster were self-healing: if a
> >broker went down, we could mark it as no longer available somehow, and the
> >cluster would then reassign and re-replicate to new brokers the partitions
> >that were previously assigned to the failed broker. I expect something
> >like this will be available in future versions, but that doesn't appear
> >to be the case at present.
> >
> >Relatedly, in the interest of horizontal scalability, it would be nice to
> >have an easy way for the cluster to dynamically rebalance load when new
> >nodes are added to the cluster (or at least to prefer assigning new
> >partitions to brokers which have more space available). I expect this
> >will be something to prioritize in future versions as well.
> >
> >Jason
> >
> >
> >On Wed, Oct 2, 2013 at 1:00 PM, Sriram Subramanian <
> >[EMAIL PROTECTED]> wrote:
> >
> >> I agree that we need a unique id that is independent of the machine. I
> >> am not sure you want a dependency on ZK to generate the unique id,
> >> though. There are other ways to generate a unique id (for example, a
> >> UUID). In case there is a collision (highly unlikely), the node creation
> >> in ZK will fail anyway and the broker can regenerate another id.
> >>
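As a minimal sketch of that generate-and-retry idea, assuming the plain
ZooKeeper Java client and Kafka's /brokers/ids registration path (0.8 broker
ids are integers, so a random int stands in for a UUID here; the
retry-on-collision loop is the point):

import java.util.concurrent.ThreadLocalRandom;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Sketch: pick a random id locally and let the ZooKeeper create() act as
// the uniqueness check, regenerating on the (highly unlikely) collision.
public class BrokerIdRegistration {
    public static int registerWithRetry(ZooKeeper zk, byte[] brokerInfo)
            throws KeeperException, InterruptedException {
        while (true) {
            int id = ThreadLocalRandom.current().nextInt(Integer.MAX_VALUE);
            try {
                // Ephemeral node: it vanishes if the broker dies, freeing the id.
                zk.create("/brokers/ids/" + id, brokerInfo,
                          ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                return id; // create succeeded, so the id is unique cluster-wide
            } catch (KeeperException.NodeExistsException e) {
                // Another broker already holds this id; try a different one.
            }
        }
    }
}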
> >> On 10/2/13 9:52 AM, "Jay Kreps" <[EMAIL PROTECTED]> wrote:
> >>
> >> >There are scenarios in which you want a hostname to change, or you
> >> >want to move the stored data off one machine onto another. This is why
> >> >systems add a layer of indirection between the location and the
> >> >identity of their nodes.
> >> >
> >> >-Jay
> >> >
> >> >
> >> >On Wed, Oct 2, 2013 at 9:23 AM, Guozhang Wang <[EMAIL PROTECTED]>
> >>wrote:
> >> >
> >> >> I'm wondering what the reason is behind decoupling the node id from
> >> >> its physical host (and port). If we find, for example, that node 1 is
> >> >> not owning any partitions, how would we know which physical machine
> >> >> that node is?
> >> >>
> >> >> Guozhang
> >> >>
> >> >>
> >> >> On Wed, Oct 2, 2013 at 9:07 AM, Jay Kreps <[EMAIL PROTECTED]>
> >>wrote:
> >> >>
> >> >> > I'm in favor of doing this if someone is willing to work on it! I
> >> >> > agree it would really help with easy provisioning.
> >> >> >
> >> >> > I filed a bug to discuss and track:
> >> >> > https://issues.apache.org/jira/browse/KAFKA-1070
> >> >> >
> >> >> > Some comments:
> >> >> > 1. I'm not in favor of having a pluggable strategy, unless we are

 