Kafka user mailing list: Copy availability when broker goes down?


Neha Narkhede 2013-03-04, 16:33
Re: Copy availability when broker goes down?
Chris,

Thanks for reporting the issues and running those tests.

1. For problem 1, if this is the output of the topic metadata request after
shutting down a broker that leads no partitions, then that is a bug. Can you
please file a bug and describe a reproducible test case there? (The sketch
after this list shows one way to dump that metadata.)
2. For problem 2, we always try to make the preferred replica (the 1st replica
in the list of all replicas for a partition) the leader, if it is available. We
intend to spread the preferred replicas for all partitions of a topic evenly
across the brokers. If this is not happening, we need to look into it. Can you
please file a bug and describe your test case there? (The same sketch below
flags partitions whose leader is not the preferred replica.)
3. When a machine is down, whether for a short time or a long time, it is taken
out of the ISR. When it starts back up again, it has to bootstrap from the
current leader.
4. If you have a new machine that you want to add to the cluster, you might
want to reassign some replicas for partitions to the new broker. We have a
tool (that has not been thoroughly tested yet) that allows you to do that.
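
As a reference for points 1 and 2, here is a minimal sketch using the 0.8 Java
metadata API (SimpleConsumer / TopicMetadataRequest). The broker host, port,
client id, and topic name are placeholders; adjust them for your cluster.

import java.util.Collections;
import java.util.List;

import kafka.cluster.Broker;
import kafka.javaapi.PartitionMetadata;
import kafka.javaapi.TopicMetadata;
import kafka.javaapi.TopicMetadataRequest;
import kafka.javaapi.TopicMetadataResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class ClusterStateDump {
    public static void main(String[] args) {
        // Connect to any live broker; host, port, client id and topic are placeholders.
        SimpleConsumer consumer =
            new SimpleConsumer("vrd02.atlnp1", 9092, 100000, 64 * 1024, "state-dump");
        try {
            TopicMetadataRequest request =
                new TopicMetadataRequest(Collections.singletonList("my-topic"));
            TopicMetadataResponse response = consumer.send(request);

            for (TopicMetadata topic : response.topicsMetadata()) {
                for (PartitionMetadata part : topic.partitionsMetadata()) {
                    Broker leader = part.leader();
                    List<Broker> replicas = part.replicas();
                    List<Broker> isr = part.isr();

                    // Same columns as the output quoted below:
                    // partition id, leader, replica set, ISR.
                    System.out.printf("Java: %d:%s R:%s I:%s%n",
                        part.partitionId(),
                        leader == null ? "none" : leader.host(),
                        hosts(replicas), hosts(isr));

                    // Point 2: the preferred replica is the first entry in the
                    // replica list; flag partitions whose current leader differs.
                    if (leader != null && !replicas.isEmpty()
                            && !replicas.get(0).host().equals(leader.host())) {
                        System.out.printf("  partition %d: leader %s != preferred replica %s%n",
                            part.partitionId(), leader.host(), replicas.get(0).host());
                    }
                }
            }
        } finally {
            consumer.close();
        }
    }

    private static String hosts(List<Broker> brokers) {
        StringBuilder sb = new StringBuilder("[");
        for (Broker b : brokers) {
            sb.append(' ').append(b.host());
        }
        return sb.append(" ]").toString();
    }
}

With all brokers up, each partition's leader should normally be the first
broker in its replica list.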

Thanks,
Neha
On Mon, Mar 4, 2013 at 8:32 AM, Chris Curtin <[EMAIL PROTECTED]> wrote:

> Hi,
>
> (Hmm, take 2. Apache's spam filter doesn't like the word that describes the
> copy of the data, 'R - E - P - L - I - C - A', so it blocked the message from
> sending! Using 'copy' below to mean that concept.)
>
> I’m running 0.8.0 with HEAD from end of January (not the merge you guys did
> last night).
>
> I’m testing how the producer responds to the loss of brokers, what errors
> are produced, etc., and noticed some strange things as I shut down servers
> in my cluster.
>
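A minimal producer loop for this kind of failure test might look like the
following with the 0.8 Java producer API; the broker list, topic, acks setting,
and message count below are placeholders rather than the actual test code.

import java.util.Properties;

import kafka.common.FailedToSendMessageException;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class BrokerLossProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker list, topic and settings below are placeholders for the test.
        props.put("metadata.broker.list", "vrd01.atlnp1:9092,vrd02.atlnp1:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1"); // wait for the leader's ack

        Producer<String, String> producer =
            new Producer<String, String>(new ProducerConfig(props));
        try {
            for (int i = 0; i < 100000; i++) {
                try {
                    producer.send(new KeyedMessage<String, String>(
                        "my-topic", "key-" + i, "message-" + i));
                } catch (FailedToSendMessageException e) {
                    // Thrown once the producer exhausts its retries, e.g. while
                    // a new leader is being elected after a broker is shut down.
                    System.err.println("send " + i + " failed: " + e.getMessage());
                }
            }
        } finally {
            producer.close();
        }
    }
}

With request.required.acks=1 each send blocks until the leader acknowledges, so
leader re-election after a broker shutdown typically surfaces as retries and,
if those are exhausted, a FailedToSendMessageException.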
> Setup:
> 4 node cluster
> 1 topic, 3 copies in the set
> 10 partitions numbered 0-9
>
> State of the cluster is determined using TopicMetadataRequest.
>
> When I start with a full cluster (2nd column is the partition id, next is
> leader, then the copy set and ISR):
>
> Java: 0:vrd03.atlnp1 R:[  vrd03.atlnp1 vrd04.atlnp1 vrd01.atlnp1] I:[
> vrd03.atlnp1 vrd04.atlnp1 vrd01.atlnp1]
> Java: 1:vrd04.atlnp1 R:[  vrd04.atlnp1 vrd01.atlnp1 vrd02.atlnp1] I:[
> vrd04.atlnp1 vrd01.atlnp1 vrd02.atlnp1]
> Java: 2:vrd03.atlnp1 R:[  vrd01.atlnp1 vrd02.atlnp1 vrd03.atlnp1] I:[
> vrd03.atlnp1 vrd01.atlnp1 vrd02.atlnp1]
> Java: 3:vrd03.atlnp1 R:[  vrd02.atlnp1 vrd03.atlnp1 vrd04.atlnp1] I:[
> vrd03.atlnp1 vrd04.atlnp1 vrd02.atlnp1]
> Java: 4:vrd03.atlnp1 R:[  vrd03.atlnp1 vrd01.atlnp1 vrd02.atlnp1] I:[
> vrd03.atlnp1 vrd01.atlnp1 vrd02.atlnp1]
> Java: 5:vrd03.atlnp1 R:[  vrd04.atlnp1 vrd02.atlnp1 vrd03.atlnp1] I:[
> vrd03.atlnp1 vrd04.atlnp1 vrd02.atlnp1]
> Java: 6:vrd03.atlnp1 R:[  vrd01.atlnp1 vrd03.atlnp1 vrd04.atlnp1] I:[
> vrd03.atlnp1 vrd04.atlnp1 vrd01.atlnp1]
> Java: 7:vrd04.atlnp1 R:[  vrd02.atlnp1 vrd04.atlnp1 vrd01.atlnp1] I:[
> vrd04.atlnp1 vrd01.atlnp1 vrd02.atlnp1]
> Java: 8:vrd03.atlnp1 R:[  vrd03.atlnp1 vrd02.atlnp1 vrd04.atlnp1] I:[
> vrd03.atlnp1 vrd04.atlnp1 vrd02.atlnp1]
> Java: 9:vrd03.atlnp1 R:[  vrd04.atlnp1 vrd03.atlnp1 vrd01.atlnp1] I:[
> vrd03.atlnp1 vrd04.atlnp1 vrd01.atlnp1]
>
> When I stop vrd01, which isn’t the leader for any partition:
>
> Java: 0:vrd03.atlnp1 R:[ ] I:[]
> Java: 1:vrd04.atlnp1 R:[ ] I:[]
> Java: 2:vrd03.atlnp1 R:[ ] I:[]
> Java: 3:vrd03.atlnp1 R:[  vrd02.atlnp1 vrd03.atlnp1 vrd04.atlnp1] I:[
> vrd03.atlnp1 vrd04.atlnp1 vrd02.atlnp1]
> Java: 4:vrd03.atlnp1 R:[ ] I:[]
> Java: 5:vrd03.atlnp1 R:[  vrd04.atlnp1 vrd02.atlnp1 vrd03.atlnp1] I:[
> vrd03.atlnp1 vrd04.atlnp1 vrd02.atlnp1]
> Java: 6:vrd03.atlnp1 R:[ ] I:[]
> Java: 7:vrd04.atlnp1 R:[ ] I:[]
> Java: 8:vrd03.atlnp1 R:[  vrd03.atlnp1 vrd02.atlnp1 vrd04.atlnp1] I:[
> vrd03.atlnp1 vrd04.atlnp1 vrd02.atlnp1]
> Java: 9:vrd03.atlnp1 R:[ ] I:[]
>
> Does this mean that none of the partitions that used to have a copy on
> vrd01 are updating ANY of the copies?
>
> I ran another test, again starting with a full cluster where all partitions
> had a full set of copies. When I stopped the broker which was the leader for
> 9 of the 10 partitions, the leaders were all elected on one machine instead
> of being spread across the set of 3. Should the leaders have been better
> spread out? Also the

 
Further replies in this thread: Jun Rao (2013-03-04, 18:15), Chris Curtin (2013-03-04, 18:22).