Kafka >> mail # user >> Questions regarding broker


Re: Questions regarding broker
Thanks, Joel, for looking into it. I will try to reproduce it. I don't think
the second zookeeper is needed because I ran into it the first time just by
shutting down the topic leaders.

Cal
On Tue, Jul 16, 2013 at 2:38 AM, Joel Koshy <[EMAIL PROTECTED]> wrote:

> Hey Calvin,
>
> I apologize for not being able to get to this sooner. I don't think I
> can reproduce the full scenario exactly as I don't have exclusive
> access to so many machines, but I tried it locally and couldn't
> reproduce it. Any chance you can reproduce it with a smaller
> deployment? Is step 6 required? Would you mind pasting the full stack
> trace that you saw?
>
> Thanks,
>
> Joel
>
>
>
>
> On Wed, Jul 10, 2013 at 11:10 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
> > Ok thanks - I'll go through this tomorrow.
> >
> > Joel
> >
> > On Wed, Jul 10, 2013 at 9:14 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
> >> Joel,
> >>    So I was able to reproduce the issue that I experienced. Please see the
> >> steps below.
> >> 1. Set up a 3-ZooKeeper, 6-broker cluster. Set up one topic with 2
> >> partitions and a replication factor of 3.
> >> 2. Set up and run the console consumer, consuming messages from that topic.
> >> 3. Produce a few messages to confirm the consumer is working.
> >> 4. Stop the consumer.
> >> 5. Shut down (uncontrolled) the lead broker of one of the partitions.
> >> 6. Shut down one of the ZooKeeper nodes.
> >> 7. Run the list topic script to confirm a new leader has been elected.
> >> 8. Bring up the console consumer again.
> >> 9. The console consumer won't start because of an error in rebalancing (when
> >> fetching topic metadata).
> >>      Error: Java.util.NoSuchElementException: Key Not Found (5).
> >>      Trace: Client.Util.Scala:67
> >>
> >> Where broker 5 was the lead broker I shut down. I am using 0.8 beta.
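
An illustrative aside on the error reported in step 9: a "key not found" failure of this kind is what a map lookup produces when metadata still names a broker id that is missing from the map of known brokers. The Python sketch below is not Kafka's code. The names `live_brokers`, `partition_leaders`, `resolve_leader`, and `resolve_leader_safe`, along with the host names, are hypothetical, and whether this is the actual cause in 0.8 beta is an assumption.

```python
# Hypothetical model: broker id -> "host:port" for brokers currently registered.
# Broker 5 is absent because it was shut down (host names are made up).
live_brokers = {
    1: "host1:9092",
    2: "host2:9092",
    3: "host3:9092",
    4: "host4:9092",
    6: "host6:9092",
}

# Hypothetical stale metadata: partition -> leader broker id.
# Partition 0 still points at broker 5.
partition_leaders = {0: 5, 1: 2}


def resolve_leader(partition):
    """Strict lookup: raises KeyError if the leader id is not a known broker."""
    return live_brokers[partition_leaders[partition]]


def resolve_leader_safe(partition):
    """Tolerant lookup: returns None when the leader id is not a known broker."""
    return live_brokers.get(partition_leaders[partition])
```

In this model, `resolve_leader(0)` fails on the missing key 5, much like the reported exception, while `resolve_leader_safe(0)` lets the caller detect the stale entry and refresh its metadata instead.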
> >>
> >> thanks,
> >> Cal
> >>
> >>
> >> On Tue, Jul 9, 2013 at 11:20 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
> >>
> >>> I will try to reproduce it. It was sporadic. My setup was a topic with 1
> >>> partition and replication factor = 3.
> >>> If I kill the console producer and then shut down the leader broker, a new
> >>> leader is elected. If I then kill the new leader, I don't see the last broker
> >>> get elected as leader. Then, when I tried starting the console producer, I
> >>> started seeing errors.
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Jul 9, 2013 at 6:14 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Not really - if you shut down a leader broker (and assuming your
> >>>> replication factor is > 1) then the other assigned replica will be
> >>>> elected as the new leader. The producer would then look up metadata,
> >>>> find the new leader and send requests to it. What do you see in the
> >>>> logs?
> >>>>
> >>>> Joel
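
The failover Joel describes above can be sketched roughly as follows. This is a simplified model, not Kafka's implementation: in Kafka the election is carried out on the broker side and considers which replicas are in sync, whereas this sketch simply picks the first assigned replica that is still alive. The function name `elect_leader` and the broker ids are hypothetical.

```python
def elect_leader(assigned_replicas, live_brokers):
    """Return the first assigned replica that is still alive, or None.

    assigned_replicas: ordered list of broker ids assigned to the partition.
    live_brokers: set of broker ids that are currently up.
    """
    for broker_id in assigned_replicas:
        if broker_id in live_brokers:
            return broker_id
    return None  # no assigned replica is alive, so the partition has no leader


# Hypothetical partition with replication factor 3, assigned to brokers 5, 2, 3.
assigned = [5, 2, 3]

leader_before = elect_leader(assigned, {1, 2, 3, 4, 5, 6})  # all brokers up
leader_after = elect_leader(assigned, {1, 2, 3, 4, 6})      # broker 5 shut down
leader_none = elect_leader(assigned, {1, 4, 6})             # all replicas down
```

A producer that starts after the failover would look up the partition's current leader (here `leader_after`) and send its requests there, rather than to the broker that was shut down.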
> >>>>
> >>>> On Tue, Jul 9, 2013 at 1:44 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
> >>>> > Thanks, you gave me enough pointers to dig deeper. I tested the fault
> >>>> > tolerance by shutting down brokers randomly.
> >>>> >
> >>>> > What I noticed is that if I shut down brokers while my producer and consumer
> >>>> > are still running, they recover fine. However, if I shut down a lead broker
> >>>> > without a running producer, I can't seem to start the producer afterwards
> >>>> > without restarting the previous lead broker. Is this expected?
> >>>> > On Jul 9, 2013 10:28 AM, "Joel Koshy" <[EMAIL PROTECTED]> wrote:
> >>>> >
> >>>> >> For 1 I forgot to add - there is an admin tool to reassign replicas, but it
> >>>> >> would take longer than leader failover.
> >>>> >>
> >>>> >> Joel
> >>>> >>
> >>>> >> On Tuesday, July 9, 2013, Joel Koshy wrote:
> >>>> >>
> >>>> >> > 1 - no, unless broker4 is not the preferred leader. (The preferred
> >>>> >> > leader is the first broker in the assigned replica list). If a
> >>>> >> > non-preferred replica is the current leader you can run the
> >>>> >> > PreferredReplicaLeaderElection admin command to move the leader.
> >>>> >> > 2 - The actual leader movement (on leader failover) is fairly