Kafka >> mail # user >> Replacing brokers in a cluster (0.8)

Re: Replacing brokers in a cluster (0.8)
Is the kafka-reassign-partitions tool something I can experiment with now
(this will only be staging data, in the first go-round)?  How does it work?
Do I have to manually specify each replica I want to move?  That would be
cumbersome, as I have on the order of hundreds of topics.  Or does the tool
have the ability to specify all replicas on a particular broker?  How can I
easily check whether a partition has all its replicas in the ISR?
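
To make the last question concrete: a partition is fully replicated when every assigned replica also appears in its ISR. Below is a minimal sketch of that check, assuming the per-partition metadata has already been collected from whatever source of cluster state is available. The function name and the dict layout (topic, partition, replicas, isr) are hypothetical, chosen only for illustration.

```python
def under_replicated(partitions):
    """Return (topic, partition, missing_brokers) for every partition
    whose ISR does not contain all of its assigned replicas."""
    lagging = []
    for p in partitions:
        # Replicas that are assigned but not currently in sync.
        missing = set(p["replicas"]) - set(p["isr"])
        if missing:
            lagging.append((p["topic"], p["partition"], sorted(missing)))
    return lagging


# Hypothetical metadata for a topic with replication factor 2.
sample = [
    {"topic": "events", "partition": 0, "replicas": [1, 2], "isr": [1, 2]},
    {"topic": "events", "partition": 1, "replicas": [2, 3], "isr": [2]},
]

print(under_replicated(sample))  # [('events', 1, [3])]
```

An empty result would mean every partition has all its replicas in the ISR.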

For some reason, I had thought there would be a default behavior, whereby a
replica could automatically be declared dead after a configurable timeout.

Re-assigning broker ids would not be ideal, since I currently have a scheme
whereby broker ids are auto-generated from a hostname/IP, etc.  I could
make it work, but I'd prefer not to override that!

On Mon, Jul 22, 2013 at 11:50 AM, Jun Rao <[EMAIL PROTECTED]> wrote:

> A replica's data won't be automatically moved to another broker when there
> are failures. This is because we don't know if the failure is transient or
> permanent. The right tool to use is the kafka-reassign-partitions tool. It
> hasn't been thoroughly tested though. We hope to harden it in the final
> 0.8.0 release.
> You can also replace a broker with a new server by keeping the same broker
> id. When the new server starts up, it will replicate data from the leader.
> You know the data is fully replicated when both replicas are in the ISR.
> Thanks,
> Jun
> On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
> > I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones
> > (better hardware).  I'm using a replication factor of 2.
> >
> > I'm thinking the plan should be to spin up the 3 new nodes, and operate as
> > a 5 node cluster for a while.  Then first remove 1 of the old nodes, and
> > wait for the partitions on the removed node to get replicated to the other
> > nodes.  Then, do the same for the other old node.
> >
> > Does this sound sensible?
> >
> > How does the cluster decide when to re-replicate partitions that are on a
> > node that is no longer available?  Does it only happen if/when new messages
> > arrive for that partition?  Is it on a partition by partition basis?
> >
> > Or is it a cluster-level decision that a broker is no longer valid, in
> > which case all affected partitions would immediately get replicated to new
> > brokers as needed?
> >
> > I'm just wondering how I will know when it will be safe to take down my
> > second old node, after the first one is removed, etc.
> >
> > Thanks,
> >
> > Jason
> >
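
Since replicas are not moved automatically, the migration described in the original message comes down to building an explicit reassignment for every partition that has a replica on the broker being retired. The following is a rough sketch of how such a plan could be generated, not a tested procedure. The function name, the input dict layout, and the broker ids are hypothetical, and the output JSON layout (a "partitions" list of topic/partition/replicas entries) is an assumption about what kafka-reassign-partitions accepts; it should be checked against the tool's help output for the version in use.

```python
import json


def plan_move_off(partitions, retiring, candidates):
    """Build a reassignment that replaces each replica hosted on a
    retiring broker with a candidate broker, chosen round-robin."""
    plan = []
    cursor = 0  # round-robin position in the candidate list
    for p in partitions:
        if not set(p["replicas"]) & set(retiring):
            continue  # partition does not touch a retiring broker
        new_replicas = []
        for broker in p["replicas"]:
            if broker in retiring:
                # Take the next candidate that does not already host
                # (or is not about to host) this partition.
                for _ in range(len(candidates)):
                    c = candidates[cursor % len(candidates)]
                    cursor += 1
                    if c not in new_replicas and c not in p["replicas"]:
                        broker = c
                        break
                else:
                    raise ValueError("no free candidate broker for %s-%d"
                                     % (p["topic"], p["partition"]))
            new_replicas.append(broker)
        plan.append({"topic": p["topic"],
                     "partition": p["partition"],
                     "replicas": new_replicas})
    # Assumed file layout; verify against your version of the tool.
    return {"version": 1, "partitions": plan}


# Hypothetical cluster: old brokers 1 and 2, new brokers 3, 4 and 5.
current = [
    {"topic": "events", "partition": 0, "replicas": [1, 2]},
    {"topic": "events", "partition": 1, "replicas": [2, 1]},
]

plan = plan_move_off(current, retiring=[1], candidates=[3, 4, 5])
print(json.dumps(plan, indent=2))
```

In this example partition 0 would move from [1, 2] to [3, 2] and partition 1 from [2, 1] to [2, 4]. Once the new replicas appear in the ISR for every moved partition, the same step could be repeated for the second old broker.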