Hi, I have three questions regarding the Kafka broker setup.
1. Assuming I have a 4-broker and 2-ZooKeeper (running in quorum mode) setup, if topicA-partition0 has its leader set to broker4, can I change the leader to another broker without killing the current leader?
2. What is the latency of switching to a different leader when the current leader is down? Do we configure it using the consumer property refresh.leader.backoff.ms?
3. What is the best practice for dynamically adding a new node to a Kafka cluster? Should I bring up the node and then increase the replication factor for the existing topic(s)?
Thanks in advance,
Cal
1 - No, unless broker4 is not the preferred leader. (The preferred leader is the first broker in the assigned replica list.) If a non-preferred replica is the current leader, you can run the PreferredReplicaLeaderElection admin command to move the leader.
2 - The latency of the actual leader movement (on leader failover) is fairly low, probably on the order of tens of ms. However, clients (producers, consumers) may take longer to detect the change: a client needs to get back an error response, handle an exception, issue a metadata request, and get the response to find the new leader. All that can add up, but it should not be terribly high; I'm guessing on the order of a few hundred ms to a second or so.
3 - That should work, although the admin command for adding more partitions to a topic is currently being developed.
On Mon, Jul 8, 2013 at 11:02 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
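To make the preferred-leader rule from answer 1 concrete, here is a minimal Python sketch. It is a toy model, not Kafka code: the function names are hypothetical, and the rule that the move only happens when the preferred replica is in sync is an assumption added for illustration.

```python
from typing import List


def preferred_leader(assigned_replicas: List[int]) -> int:
    """The preferred leader is the first broker in the assigned replica list."""
    return assigned_replicas[0]


def leader_after_preferred_election(
    assigned_replicas: List[int], current_leader: int, in_sync: List[int]
) -> int:
    """Return the leader after a preferred-replica election in this toy model.

    Assumption: leadership only moves if the preferred replica differs from
    the current leader and is in sync; otherwise the current leader stays.
    """
    preferred = preferred_leader(assigned_replicas)
    if preferred != current_leader and preferred in in_sync:
        return preferred
    return current_leader


# broker 4 is already the preferred leader, so the election changes nothing
print(leader_after_preferred_election([4, 1, 2], 4, [4, 1, 2]))
# broker 4 leads but broker 1 is preferred, so leadership moves to broker 1
print(leader_after_preferred_election([1, 4, 2], 4, [1, 4, 2]))
```

In this model, the only way to move leadership off broker4 without stopping it is for broker4 not to be first in the assigned replica list, which matches the answer above.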
Thanks, you gave me enough pointers to dig deeper. I tested the fault tolerance by shutting down brokers randomly.
What I noticed is that if I shut down brokers while my producer and consumer are still running, they recover fine. However, if I shut down a lead broker without a running producer, I can't seem to start the producer afterwards without restarting the previous lead broker. Is this expected?
On Jul 9, 2013 10:28 AM, "Joel Koshy" <[EMAIL PROTECTED]> wrote:
Not really - if you shut down a leader broker (and assuming your replication factor is > 1), then one of the other assigned replicas will be elected as the new leader. The producer would then look up metadata, find the new leader, and send requests to it. What do you see in the logs?
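The client-side sequence described here (request fails, client refreshes metadata, client retries against the new leader) can be sketched in Python as follows. This is a simplified simulation, not the Kafka producer: `NotLeaderError`, `send_with_leader_refresh`, `fetch_leader`, `max_retries`, and `backoff_s` are all hypothetical names, and the brokers are stand-in callables.

```python
import time
from typing import Callable, Dict


class NotLeaderError(Exception):
    """Raised by the toy broker when a request reaches a non-leader."""


def send_with_leader_refresh(
    partition: str,
    cached_leaders: Dict[str, int],
    fetch_leader: Callable[[str], int],
    send: Callable[[int, str], None],
    max_retries: int = 3,
    backoff_s: float = 0.1,
) -> int:
    """Send to the cached leader; on failure refresh metadata and retry.

    Returns the id of the broker that accepted the request.
    """
    for attempt in range(max_retries + 1):
        leader = cached_leaders[partition]
        try:
            send(leader, partition)
            return leader
        except NotLeaderError:
            if attempt == max_retries:
                raise
            # Back off so the election has time to finish, then ask for
            # fresh metadata (the "metadata request" step in the reply).
            time.sleep(backoff_s)
            cached_leaders[partition] = fetch_leader(partition)
    raise AssertionError("unreachable")


# Toy cluster: the client still believes broker 4 leads, but broker 2 does.
actual_leader = {"topicA-0": 2}
client_cache = {"topicA-0": 4}


def toy_fetch_leader(partition: str) -> int:
    return actual_leader[partition]


def toy_send(broker: int, partition: str) -> None:
    if broker != actual_leader[partition]:
        raise NotLeaderError(partition)


accepted_by = send_with_leader_refresh(
    "topicA-0", client_cache, toy_fetch_leader, toy_send, backoff_s=0.0
)
```

Each failed attempt costs one error round trip, one backoff, and one metadata round trip, which is why client-visible failover takes longer than the election itself.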
On Tue, Jul 9, 2013 at 1:44 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
I will try to reproduce it; it was sporadic. My setup was a topic with 1 partition and replication factor = 3. If I kill the console producer and then shut down the leader broker, a new leader is elected. If I then kill the new leader, I don't see the last broker being elected as leader. When I then tried starting the console producer, I started seeing errors.
On Tue, Jul 9, 2013 at 6:14 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
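For reference, the behavior one would expect in the scenario above (1 partition, replication factor 3, leaders killed one after another) can be sketched with a small Python model. This is a deliberately simplified model, not Kafka's election code: it assumes the new leader is the first assigned replica that is both alive and in the in-sync replica set (ISR), and the function names are hypothetical.

```python
from typing import List, Optional, Set


def elect_leader(assigned: List[int], isr: Set[int], live: Set[int]) -> Optional[int]:
    """Pick the first assigned replica that is both alive and in the ISR."""
    for broker in assigned:
        if broker in live and broker in isr:
            return broker
    return None


def kill_broker(broker: int, isr: Set[int], live: Set[int]) -> None:
    """Simulate an uncontrolled shutdown: the broker leaves both sets."""
    live.discard(broker)
    isr.discard(broker)


assigned = [1, 2, 3]
isr, live = {1, 2, 3}, {1, 2, 3}
leaders = []
for _ in range(3):
    leader = elect_leader(assigned, isr, live)
    leaders.append(leader)
    if leader is not None:
        kill_broker(leader, isr, live)
# After all three replicas are gone, no leader can be elected.
leaders.append(elect_leader(assigned, isr, live))
print(leaders)  # [1, 2, 3, None]
```

In this model the last surviving replica does become leader as long as it is still in the ISR. A live replica that has dropped out of the ISR is not eligible, so `elect_leader([1, 2, 3], {1, 2}, {3})` returns `None`.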
Joel,
I was able to reproduce the issue that I experienced. Please see the steps below.
1. Set up a 3-ZooKeeper and 6-broker cluster. Create one topic with 2 partitions and a replication factor of 3.
2. Set up and run the console consumer, consuming messages from that topic.
3. Produce a few messages to confirm the consumer is working.
4. Stop the consumer.
5. Shut down (uncontrolled) the lead broker of one of the partitions.
6. Shut down one of the ZooKeeper nodes.
7. Run the list topic script to confirm a new leader has been elected.
8. Bring up the console consumer again.
9. The console consumer won't start because of an error in rebalancing (when fetching topic metadata).
Error: java.util.NoSuchElementException: Key Not Found (5). Trace: Client.Util.Scala:67
Where broker 5 was the lead broker I shut down. I am using 0.8 beta.
Thanks,
Cal
On Tue, Jul 9, 2013 at 11:20 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
I apologize for not being able to get to this sooner. I don't think I can reproduce the full scenario exactly as I don't have exclusive access to so many machines, but I tried it locally and couldn't reproduce it. Any chance you can reproduce it with a smaller deployment? Is step 6 required? Would you mind pasting the full stack trace that you saw?
Joel
On Wed, Jul 10, 2013 at 11:10 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
Thanks, Joel, for looking into it. I will try to reproduce it. I don't think shutting down the ZooKeeper node (step 6) is needed, because I ran into it the first time just by shutting down the topic leaders.
Cal
On Tue, Jul 16, 2013 at 2:38 AM, Joel Koshy <[EMAIL PROTECTED]> wrote: