Re: Data loss in case of request.required.acks set to -1
Is it possible to expose programmatically the number of brokers in the ISR
for each partition?  We could make this a gating check before shutting down a
broker gracefully, to make sure things are in good shape.....I guess
controlled shutdown assures this anyway, in a sense.....
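
In 0.8 the controller keeps per-partition leader/ISR state in ZooKeeper under
/brokers/topics/<topic>/partitions/<n>/state, so presumably a gate could just
read that znode. A rough sketch (class name, ZooKeeper address and topic are
illustrative, borrowed from this thread):

    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class IsrCheck {
      public static void main(String[] args) throws Exception {
        // Wait for the ZooKeeper session to be established before reading.
        final CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("localhost:2181", 10000, new Watcher() {
          public void process(WatchedEvent event) {
            if (event.getState() == Event.KeeperState.SyncConnected)
              connected.countDown();
          }
        });
        connected.await();
        // Per-partition state as kept by the controller, e.g.
        // {"controller_epoch":1,"leader":0,"leader_epoch":0,"isr":[0]}
        for (int p = 0; p < 2; p++) {
          byte[] data = zk.getData(
              "/brokers/topics/test-trunk111/partitions/" + p + "/state",
              false, null);
          System.out.println("partition " + p + ": " + new String(data, "UTF-8"));
        }
        zk.close();
      }
    }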

Jason
On Mon, Dec 23, 2013 at 2:22 PM, Guozhang Wang <[EMAIL PROTECTED]> wrote:

> Hanish,
>
> Originally, when you create the two partitions, their leadership should be
> evenly distributed across the two brokers, i.e. each broker gets one
> partition. But in your case broker 0 is the leader for both partitions 0
> and 1, while from the replica list broker 1 should originally have been the
> leader for partition 0, since the leader of a partition should be the first
> broker in the replica list.
>
> This means broker 1 was bounced or halted (e.g. by a long GC pause) at some
> point, hence the leadership of partition 0 migrated to broker 0, and broker
> 1 is still catching up after the bounce since it is not in the ISR for
> either partition. In this case, when you kill broker 0, broker 1, which is
> not in the ISR, will be selected as the new leader for both partitions and
> hence cause data loss.
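>
> For reference, this is how the acks mode in question is set on the 0.8
> producer (broker list, class name and message are illustrative); with
> request.required.acks=-1 the leader only waits for the replicas currently
> in the ISR, which is why a shrunken ISR can still lose data:
>
>     import java.util.Properties;
>     import kafka.javaapi.producer.Producer;
>     import kafka.producer.KeyedMessage;
>     import kafka.producer.ProducerConfig;
>
>     public class AckedProducer {
>       public static void main(String[] args) {
>         Properties props = new Properties();
>         props.put("metadata.broker.list", "host1:9092,host2:9092");
>         props.put("serializer.class", "kafka.serializer.StringEncoder");
>         // -1 = wait for acks from all *in-sync* replicas, not all replicas
>         props.put("request.required.acks", "-1");
>         Producer<String, String> producer =
>             new Producer<String, String>(new ProducerConfig(props));
>         producer.send(new KeyedMessage<String, String>(
>             "test-trunk111", "key", "message"));
>         producer.close();
>       }
>     }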
>
> If you are doing experiments with rolling bounces on a topic with, say,
> replication factor N, one thing to do is to wait for the ISR to contain at
> least 2 brokers before bouncing the next one; otherwise data loss can still
> happen even if the replication factor is larger than 2.
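>
> A minimal sketch of such a gate, assuming the 0.8 ZooKeeper layout (this is
> not a Kafka API; the crude string scan of the "isr" array stands in for a
> real JSON parser, and the address, topic and class name are illustrative):
>
>     import org.apache.zookeeper.KeeperException;
>     import org.apache.zookeeper.ZooKeeper;
>
>     // Block until the partition's ISR holds at least 2 brokers, so the
>     // next broker in a rolling bounce is safe to stop.
>     public class IsrGate {
>       public static void main(String[] args) throws Exception {
>         ZooKeeper zk = new ZooKeeper("localhost:2181", 10000, null);
>         String path = "/brokers/topics/test-trunk111/partitions/1/state";
>         while (true) {
>           try {
>             // State JSON looks like {..., "isr":[0,1]}; count its entries.
>             String state = new String(zk.getData(path, false, null), "UTF-8");
>             int i = state.indexOf("\"isr\":[");
>             String isr = state.substring(i + 7, state.indexOf(']', i));
>             if (!isr.isEmpty() && isr.split(",").length >= 2) break;
>           } catch (KeeperException e) {
>             // session not established yet, or a transient error; retry
>           }
>           Thread.sleep(1000);
>         }
>         zk.close();
>       }
>     }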
>
> If you want to read more, I would recommend this blog post about Kafka's
> guarantees:
>
> http://blog.empathybox.com/post/62279088548/a-few-notes-on-kafka-and-jepsen
>
> Guozhang
>
>
>
>
> On Sun, Dec 22, 2013 at 10:38 PM, Hanish Bansal <[EMAIL PROTECTED]> wrote:
>
> > Hi Guozhang,
> >
> > When both nodes are alive, the topic ISR status is:
> >
> > topic: test-trunk111    partition: 0    leader: 0    replicas: 1,0    isr: 0
> > topic: test-trunk111    partition: 1    leader: 0    replicas: 0,1    isr: 0
> >
> > Now, since the leader for both partitions is broker 0, I kill the leader
> > node while producing data. After the leader goes down, the topic ISR
> > status is:
> >
> > topic: test-trunk111    partition: 0    leader: 1    replicas: 1,0    isr: 1
> > topic: test-trunk111    partition: 1    leader: 1    replicas: 0,1    isr: 1
> >
> > Now, after all the data has been produced, when I consume it there is
> > some data loss.
> >
> > *Also, in the controller logs there are entries like:*
> >
> > [2013-12-23 10:25:07,648] DEBUG [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [test-trunk111,1]. Pick the leader from the alive assigned replicas: 1 (kafka.controller.OfflinePartitionLeaderSelector)
> > [2013-12-23 10:25:07,648] WARN [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [test-trunk111,1]. Elect leader 1 from live brokers 1. There's potential data loss. (kafka.controller.OfflinePartitionLeaderSelector)
> > [2013-12-23 10:25:07,649] INFO [OfflinePartitionLeaderSelector]: Selected new leader and ISR {"leader":1,"leader_epoch":1,"isr":[1]} for offline partition [test-trunk111,1] (kafka.controller.OfflinePartitionLeaderSelector)
> >
> > Is there any solution for this behaviour?
> >
> >
> > On Fri, Dec 20, 2013 at 7:27 PM, Guozhang Wang <[EMAIL PROTECTED]> wrote:
> >
> > > Hanish,
> > >
> > > One thing you can check is whether, when you kill one of the brokers,
> > > the other broker is in the ISR list of the partitions that the killed
> > > broker was hosting. This can be done using the kafka-topics tool.
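> > >
> > > For example (assuming a trunk build where the tool is named
> > > bin/kafka-topics.sh; the ZooKeeper address is illustrative):
> > >
> > >     bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test-trunk111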
> > >
> > > Also, you can check the controller log for any entry like "No broker
> > > in ISR is alive for %s. Elect leader %d from live brokers %s. There's
> > > potential data loss."
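> > >
> > > For example (the log file name and location depend on your log4j
> > > setup; controller.log is the default appender file):
> > >
> > >     grep "No broker in ISR is alive" logs/controller.log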
> > >
> > > Guozhang
> > >
> > >
> > > On Fri, Dec 20, 2013 at 9:11 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > >
> > > > Could you reproduce this easily? If so, could you file a jira and
> > > > describe the steps?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >

 