Kafka >> mail # user >> Questions regarding broker


Re: Questions regarding broker
Thanks, Joel, for looking into it. I will try to reproduce it. I don't think
the second ZooKeeper is needed, because I ran into it the first time just by
shutting down the topic leaders.

Cal
On Tue, Jul 16, 2013 at 2:38 AM, Joel Koshy <[EMAIL PROTECTED]> wrote:

> Hey Calvin,
>
> I apologize for not being able to get to this sooner. I don't think I
> can reproduce the full scenario exactly as I don't have exclusive
> access to so many machines, but I tried it locally and couldn't
> reproduce it. Any chance you can reproduce it with a smaller
> deployment? Is step 6 required? Would you mind pasting the full stack
> trace that you saw?
>
> Thanks,
>
> Joel
>
>
>
>
> On Wed, Jul 10, 2013 at 11:10 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
> > Ok thanks - I'll go through this tomorrow.
> >
> > Joel
> >
> > On Wed, Jul 10, 2013 at 9:14 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
> >> Joel,
> >>    So I was able to reproduce the issue that I experienced. Please see the
> >> steps below.
> >> 1. Set up a 3-zookeeper and 6-broker cluster. Set up one topic with 2
> >> partitions, with replication factor set to 3.
> >> 2. Set up and run the console consumer, consuming messages from that topic.
> >> 3. Produce a few messages to confirm the consumer is working.
> >> 4. Stop the consumer.
> >> 5. Shut down (uncontrolled) the lead broker of one of the partitions.
> >> 6. Shut down one of the ZooKeeper nodes.
> >> 7. Run the list topic script to confirm a new leader has been elected.
> >> 8. Bring up the console consumer again.
> >> 9. The console consumer won't start because of an error in rebalancing (when
> >> fetching topic metadata).
> >>      Error: Java.util.NoSuchElementException: Key Not Found (5).
> >>      Trace: Client.Util.Scala:67
> >>
> >> Where broker 5 was the lead broker I shut down. I am using 0.8 beta.
> >>
> >> thanks,
> >> Cal
> >>
> >>
> >> On Tue, Jul 9, 2013 at 11:20 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
> >>
> >>> I will try to reproduce it. It was sporadic. My setup was a topic with 1
> >>> partition and replication factor = 3.
> >>> If I kill the console producer and then shut down the leader broker, a new
> >>> leader is elected. If I then kill the new leader, I don't see the last broker
> >>> get elected as leader. Then, when I tried starting the console producer, I
> >>> started seeing errors.
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Jul 9, 2013 at 6:14 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Not really - if you shut down a leader broker (and assuming your
> >>>> replication factor is > 1) then the other assigned replica will be
> >>>> elected as the new leader. The producer would then look up metadata,
> >>>> find the new leader and send requests to it. What do you see in the
> >>>> logs?
> >>>>
> >>>> Joel
> >>>>
> >>>> On Tue, Jul 9, 2013 at 1:44 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
> >>>> > Thanks, you gave me enough pointers to dig deeper. And I tested the fault
> >>>> > tolerance by shutting down brokers randomly.
> >>>> >
> >>>> > What I noticed is that if I shut down brokers while my producer and consumer
> >>>> > are still running, they recover fine. However, if I shut down a lead broker
> >>>> > without a running producer, I can't seem to start the producer afterwards
> >>>> > without restarting the previous lead broker. Is this expected?
> >>>> > On Jul 9, 2013 10:28 AM, "Joel Koshy" <[EMAIL PROTECTED]> wrote:
> >>>> >
> >>>> >> For 1 I forgot to add - there is an admin tool to reassign replicas but it
> >>>> >> would take longer than leader failover.
> >>>> >>
> >>>> >> Joel
> >>>> >>
> >>>> >> On Tuesday, July 9, 2013, Joel Koshy wrote:
> >>>> >>
> >>>> >> > 1 - no, unless broker4 is not the preferred leader. (The preferred
> >>>> >> > leader is the first broker in the assigned replica list). If a
> >>>> >> > non-preferred replica is the current leader you can run the
> >>>> >> > PreferredReplicaLeaderElection admin command to move the leader.
> >>>> >> > 2 - The actual leader movement (on leader failover) is fairly
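
To make the preferred-leader rule quoted above concrete, here is a minimal Java sketch. It is not Kafka's controller code. The class and method names (`LeaderSketch`, `preferredLeader`, `needsPreferredElection`, `pickNewLeader`) and the broker ids are hypothetical, and the failover rule is a simplification: it assumes the first assigned replica that is still alive becomes the new leader and ignores any in-sync-replica bookkeeping a real broker would do.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class LeaderSketch {

    /** Preferred leader: the first broker id in the assigned replica list. */
    static int preferredLeader(List<Integer> assignedReplicas) {
        if (assignedReplicas.isEmpty()) {
            throw new IllegalArgumentException("assigned replica list is empty");
        }
        return assignedReplicas.get(0);
    }

    /** True when the current leader is not the preferred one, i.e. a
     *  preferred replica election would move leadership. */
    static boolean needsPreferredElection(List<Integer> assignedReplicas, int currentLeader) {
        return preferredLeader(assignedReplicas) != currentLeader;
    }

    /** Simplified failover (an assumption, not Kafka's actual logic):
     *  the first assigned replica that is still alive, if any. */
    static Optional<Integer> pickNewLeader(List<Integer> assignedReplicas, Set<Integer> liveBrokers) {
        return assignedReplicas.stream().filter(liveBrokers::contains).findFirst();
    }

    public static void main(String[] args) {
        List<Integer> replicas = List.of(4, 5, 6); // hypothetical assignment

        // Broker 4 leads and is first in the list: nothing to move.
        System.out.println(needsPreferredElection(replicas, 4));

        // Broker 5 leads after a failover: an election would move it back to 4.
        System.out.println(needsPreferredElection(replicas, 5));

        // Brokers 4 and 5 are down: the remaining replica is the only candidate.
        System.out.println(pickNewLeader(replicas, Set.of(6)));
    }
}
```

In this model, a partition whose assigned replicas are all down has no candidate at all (`Optional.empty()`), so a client looking up the leader would have nothing valid to connect to.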

 