I am wondering: if my leader broker crashes, how do I get it back into the ISR after restarting Kafka?
In the initial state, kafka-list-topic.sh shows:
  topic: failover-test partition: 0 leader: 0 replicas: 0,1 isr: 0,1
If I terminate the leader, kafka-list-topic.sh shows:
  topic: failover-test partition: 0 leader: 1 replicas: 0,1 isr: 1
Is there any document that explains the procedure for getting broker 0 back into the ISR? Thanks!
Once the broker is restarted, the controller broker will send it a list of partitions that it should follow. The broker starts fetching from the respective leaders and enters the ISR once it has caught up. Depending on how long it was shut down, the broker can take some time to enter the ISR.
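To watch for the restarted broker rejoining, one option is to compare the replicas and isr fields of the kafka-list-topic.sh output. The following is a minimal sketch, not part of Kafka: it assumes the output lines have exactly the format quoted earlier in this thread, and the function name out_of_sync_replicas is made up for illustration.

```python
import re

# Matches one output line in the format quoted in this thread, e.g.
# "topic: failover-test partition: 0 leader: 1 replicas: 0,1 isr: 1"
# (assumption: other Kafka versions may print a different layout).
LINE_RE = re.compile(
    r"topic:\s*(?P<topic>\S+)\s+partition:\s*(?P<partition>\d+)\s+"
    r"leader:\s*(?P<leader>-?\d+)\s+replicas:\s*(?P<replicas>[\d,]*)\s+"
    r"isr:\s*(?P<isr>[\d,]*)"
)


def parse_ids(csv):
    """Turn a comma-separated id list such as "0,1" into [0, 1]."""
    return [int(x) for x in csv.split(",") if x]


def out_of_sync_replicas(line):
    """Return the assigned replicas that are missing from the ISR."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError("unrecognized line: %r" % line)
    replicas = parse_ids(match.group("replicas"))
    isr = set(parse_ids(match.group("isr")))
    return [broker for broker in replicas if broker not in isr]


if __name__ == "__main__":
    before = "topic: failover-test partition: 0 leader: 0 replicas: 0,1 isr: 0,1"
    after = "topic: failover-test partition: 0 leader: 1 replicas: 0,1 isr: 1"
    print(out_of_sync_replicas(before))
    print(out_of_sync_replicas(after))
```

Run against the two lines from the original question, this reports no missing replicas for the initial state and broker 0 as missing after the leader was terminated. When the list becomes empty again, the broker is back in the ISR.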
Thanks,
Neha

On Aug 20, 2013 4:26 AM, "James Wu" <[EMAIL PROTECTED]> wrote:
We have 3 brokers in our Kafka cluster (1, 2, 3). Broker 2 is somehow not in the ISR. I restarted it and that did not help at all. We have also noticed that in many cases we have to restart the whole cluster to get it back. This is currently our top-priority concern.
Here is the log after the restart:
[2013-08-21 16:17:18,992] INFO Registered broker 2 at path /brokers/ids/2 with address xxxx:1234. (kafka.utils.ZkUtils$)
[2013-08-21 16:17:18,992] INFO [Kafka Server 2], Connecting to ZK: xxxx:1234, yyyy:1234, zzzz:1234 (kafka.server.KafkaServer)
[2013-08-21 16:17:19,061] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$)
[2013-08-21 16:17:19,072] INFO conflict in /controller data: 2 stored data: 3 (kafka.utils.ZkUtils$)
[2013-08-21 16:17:19,082] INFO [Kafka Server 2], started (kafka.server.KafkaServer)
[2013-08-21 16:17:49,774] INFO Closing socket connection to /123.456.789. (kafka.network.Processor)
......
The controller is the broker that has an ActiveControllerCount JMX value of 1. At any point in time, only one broker in a Kafka cluster should have a value of 1 for this JMX MBean.
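That rule can be turned into a small sanity check. The sketch below assumes the ActiveControllerCount value has already been read from each broker by whatever JMX tooling is at hand and collected into a dict keyed by broker id. The function name and the sample values are hypothetical, not taken from this cluster.

```python
def find_controller(active_controller_counts):
    """Return the id of the single broker reporting ActiveControllerCount == 1.

    active_controller_counts maps broker id -> the value read from that
    broker's ActiveControllerCount MBean (how it is collected is left open).
    """
    controllers = sorted(
        broker for broker, count in active_controller_counts.items() if count == 1
    )
    if len(controllers) != 1:
        # Zero or several controllers both violate the rule stated above.
        raise RuntimeError(
            "expected exactly one controller, found %d: %s"
            % (len(controllers), controllers)
        )
    return controllers[0]


if __name__ == "__main__":
    # Hypothetical readings for a three-broker cluster.
    print(find_controller({1: 0, 2: 0, 3: 1}))
```

Raising on anything other than exactly one controller makes both failure cases visible: no broker claiming the role, or more than one.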
I personally find it very complex to find the replica fetcher thread's lag for a particular partition that is under-replicated. I think we should have a tool that takes a topic, a partition and a ZooKeeper URL and gives the lag for all the replicas of that partition. I will file a JIRA for this.
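As a rough illustration of what such a tool would report, the sketch below computes per-replica lag as the leader's log end offset minus each follower's log end offset. This is not the proposed tool: the function name is made up, the offsets are hypothetical sample numbers, and how the offsets would be gathered from the brokers is deliberately left out.

```python
def replica_lags(leader_id, log_end_offsets):
    """Return {follower id: lag in messages} for one partition.

    log_end_offsets maps broker id -> log end offset of that broker's
    replica of the partition. The leader itself is excluded from the result.
    """
    leader_offset = log_end_offsets[leader_id]
    return {
        broker: max(leader_offset - offset, 0)  # clamp in case of stale readings
        for broker, offset in log_end_offsets.items()
        if broker != leader_id
    }


if __name__ == "__main__":
    # Hypothetical offsets: broker 1 leads, broker 0 is still catching up.
    print(replica_lags(1, {0: 950, 1: 1000}))
```

A follower whose lag stays large or keeps growing after a restart is the one to investigate.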
Thanks,
Neha

On Wed, Aug 21, 2013 at 1:41 PM, Yu, Libo <[EMAIL PROTECTED]> wrote:
Neha Narkhede 2013-08-22, 00:22