Kafka, mail # user - Questions regarding broker


Re: Questions regarding broker
Calvin Lei 2013-07-11, 04:15
Joel,
   I was able to reproduce the issue that I experienced. Please see the
steps below.
1. Set up a cluster with 3 ZooKeeper nodes and 6 brokers. Create one topic
with 2 partitions and a replication factor of 3.
2. Set up and run the console consumer, consuming messages from that topic.
3. Produce a few messages to confirm the consumer is working.
4. Stop the consumer.
5. Shut down (uncontrolled) the leader broker for one of the partitions.
6. Shut down one of the ZooKeeper nodes.
7. Run the list topic script to confirm a new leader has been elected.
8. Bring up the console consumer again.
9. The console consumer won't start because of an error in rebalancing (when
fetching topic metadata).
     Error: Java.util.NoSuchElementException: Key Not Found (5).
     Trace: Client.Util.Scala:67

Broker 5 was the leader broker I shut down. I am using 0.8 beta.
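
For illustration only: a "key not found" error naming broker 5 is what one would expect if a client looks up a leader's broker id in a map of currently registered brokers and that id is no longer present. Below is a minimal Python sketch of that failure mode. It is not Kafka's code; the function names, host names, and data layout are all hypothetical. In Scala, a direct lookup of a missing map key raises NoSuchElementException; the Python analogue is KeyError.

```python
# Hypothetical model, not Kafka's implementation.
# Brokers currently registered as live (broker 5 has been shut down).
live_brokers = {
    1: "host1:9092",
    2: "host2:9092",
    3: "host3:9092",
    4: "host4:9092",
    6: "host6:9092",
}

# Stale partition metadata that still names broker 5 as leader of partition 0.
stale_leaders = {0: 5, 1: 2}


def leader_endpoint_strict(partition, leaders, brokers):
    """Direct lookup: raises KeyError if the leader id is not a live broker."""
    return brokers[leaders[partition]]


def leader_endpoint_safe(partition, leaders, brokers):
    """Return None for an unknown leader so the caller can refresh metadata."""
    return brokers.get(leaders.get(partition))
```

With these assumed inputs, the strict lookup for partition 0 fails on key 5, while the safe variant returns None and leaves it to the caller to refresh metadata and retry.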

thanks,
Cal
On Tue, Jul 9, 2013 at 11:20 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:

> I will try to reproduce it. It was sporadic. My setup was a topic with 1
> partition and replication factor = 3.
> If I kill the console producer and then shut down the leader broker, a new
> leader is elected. If I then kill the new leader, I don't see the last broker
> get elected as leader. When I then tried starting the console producer, I
> started seeing errors.
>
>
>
>
> On Tue, Jul 9, 2013 at 6:14 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
>
>> Not really - if you shutdown a leader broker (and assuming your
>> replication factor is > 1) then the other assigned replica will be
>> elected as the new leader. The producer would then look up metadata,
>> find the new leader and send requests to it. What do you see in the
>> logs?
>>
>> Joel
>>
>> On Tue, Jul 9, 2013 at 1:44 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
>> > Thanks, you gave me enough pointers to dig deeper. I tested the fault
>> > tolerance by shutting down brokers randomly.
>> >
>> > What I noticed is that if I shut down brokers while my producer and
>> > consumer are still running, they recover fine. However, if I shut down a
>> > leader broker without a running producer, I can't seem to start the
>> > producer afterwards without restarting the previous leader broker. Is
>> > this expected?
>> > On Jul 9, 2013 10:28 AM, "Joel Koshy" <[EMAIL PROTECTED]> wrote:
>> >
>> >> For 1 I forgot to add - there is an admin tool to reassign replicas,
>> >> but it would take longer than leader failover.
>> >>
>> >> Joel
>> >>
>> >> On Tuesday, July 9, 2013, Joel Koshy wrote:
>> >>
>> >> > 1 - no, unless broker4 is not the preferred leader. (The preferred
>> >> > leader is the first broker in the assigned replica list). If a
>> >> > non-preferred replica is the current leader you can run the
>> >> > PreferredReplicaLeaderElection admin command to move the leader.
>> >> > 2 - The latency of the actual leader movement (on leader failover) is
>> >> > fairly low - probably on the order of tens of ms. However, clients
>> >> > (producers, consumers) may take longer to detect it (a client needs to
>> >> > get back an error response, handle an exception, issue a metadata
>> >> > request, and get the response to find the new leader; all that can add
>> >> > up, but it should not be terribly high - I'm guessing on the order of a
>> >> > few hundred ms to a second or so).
>> >> > 3 - That should work, although the admin command for adding more
>> >> > partitions to a topic is currently being developed.
>> >> >
>> >> >
>> >> > On Mon, Jul 8, 2013 at 11:02 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
>> >> > > Hi,
>> >> > >     I have two questions regarding the kafka broker setup.
>> >> > >
>> >> > > 1. Assuming I have a 4-broker and 2-zookeeper (running in quorum mode)
>> >> > > setup, if topicA-partition0 has the leader set to broker4, can I
>> >> > > change the leader to another broker without killing the current leader?
>> >> > >
>> >> > > 2. What is the latency of switching to a different leader when the
>> >> > > current leader is down? Do we configure it using the consumer property -