+1 for that knob on a per-topic basis; choosing consistency over availability would open Kafka to more use cases, no?
On Aug 22, 2013, at 1:59 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
> Kafka replication aims to guarantee that committed writes are not lost. In
> other words, as long as leadership can be transitioned to a broker that was
> in the ISR, no data will be lost. For increased availability, if there are
> no other brokers in the ISR, we fall back to electing a broker that is not
> caught up with the current leader as the new leader. IMO, this is the real
> problem that the post is complaining about.
> Let me explain his test in more detail:
> 1. The first part of the test partitions the leader (n1) from the other
> brokers (n2-n5). The leader shrinks the ISR to just itself and ends up
> taking n writes. This is not a problem by itself: once the partition is
> resolved, n2-n5 would catch up from the leader and no writes would be lost,
> since n1 would continue to serve as the leader.
> 2. The problem starts in the second part of the test, which partitions the
> leader (n1) from ZooKeeper. This triggers the unclean leader election
> mentioned above, which is what causes Kafka to lose data.
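> (As an aside, you can watch the ISR shrink in step 1 with the topic admin
> tool; an illustrative invocation and output, with tool names varying
> across versions:
>
>     bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
>     Topic: test  Partition: 0  Leader: 1  Replicas: 1,2,3,4,5  Isr: 1
>
> Once only n1 remains in the Isr column, electing any leader other than n1
> would be an unclean election.)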
> We thought about this while designing replication, but never ended up
> including the feature that would allow some applications to pick
> consistency over availability. Basically, we could let applications mark
> some topics for which the controller will never attempt unclean leader
> election. The result is that Kafka would reject writes and mark the
> partition offline, instead of moving leadership to a broker that is not in
> the ISR and losing the writes.
> I think if we included this knob, the tests that aphyr (Jepsen) ran would
> make more sense.
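> To sketch what that knob could look like (hypothetical property names,
> since none of this exists yet), there could be a broker-level default plus
> a per-topic override set at topic creation time:
>
>     # hypothetical broker config: never elect a leader from outside the ISR
>     unclean.leader.election.enable=false
>
>     # hypothetical per-topic override at creation time
>     bin/kafka-topics.sh --create --zookeeper localhost:2181 \
>       --topic payments --partitions 1 --replication-factor 3 \
>       --config unclean.leader.election.enable=false
>
> Topics that prefer availability would simply leave the default of true.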
> On Thu, Aug 22, 2013 at 12:50 PM, Scott Clasen <[EMAIL PROTECTED]> wrote:
>> So it looks like there is a Jepsen post coming on Kafka 0.8 replication,
>> based on this that's circulating on Twitter: https://www.refheap.com/17932/raw
>> Understanding that Kafka isn't particularly designed to be partition
>> tolerant, the result is not completely surprising.
>> But my question is: is there something that can be done about the lost
>> messages?
>> From my understanding, when broker n1 comes back online, what currently
>> happens is that the messages that were only on n1 are truncated/tossed
>> while n1 is catching back up to the ISR. Please correct me if this is not
>> accurate.
>> Would it instead be possible to do something else with them, like sending
>> them to an internal lost-messages topic or a log file where some manual
>> intervention could be performed on them? Or could a configuration property
>> like replay.truncated.messages=true be set, where the broker would send the
>> lost messages back onto the topic after rejoining the ISR?
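>> For the manual-intervention case, I'm picturing something like this rough
>> sketch against the 0.8 producer API (the lost-messages log file and the
>> topic name here are hypothetical):
>>
>>     import java.nio.charset.StandardCharsets;
>>     import java.nio.file.Files;
>>     import java.nio.file.Paths;
>>     import java.util.Properties;
>>     import kafka.javaapi.producer.Producer;
>>     import kafka.producer.KeyedMessage;
>>     import kafka.producer.ProducerConfig;
>>
>>     public class ReplayTruncatedMessages {
>>         public static void main(String[] args) throws Exception {
>>             Properties props = new Properties();
>>             props.put("metadata.broker.list", "broker1:9092,broker2:9092");
>>             props.put("serializer.class", "kafka.serializer.StringEncoder");
>>             // wait for acks from the full ISR so replayed writes are committed
>>             props.put("request.required.acks", "-1");
>>             Producer<String, String> producer =
>>                 new Producer<String, String>(new ProducerConfig(props));
>>             // hypothetical file the broker dumped the truncated messages to
>>             for (String msg : Files.readAllLines(
>>                     Paths.get("/var/kafka/lost-messages.log"),
>>                     StandardCharsets.UTF_8)) {
>>                 producer.send(new KeyedMessage<String, String>("my-topic", msg));
>>             }
>>             producer.close();
>>         }
>>     }
>>
>> Obviously an operator would have to decide whether replaying out of order
>> like this is acceptable for their application.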