Kafka, mail # user - question about isr


Re: question about isr
Ian Friedman 2014-05-16, 13:44
This seems similar to the behavior we’re seeing. At some point one of our brokers (id 1) just gives up and starts throwing those errors, and kafka-topics no longer lists it in the ISR. However, the logs for that broker say something very odd:

[2014-05-09 10:16:00,248] INFO Partition [callbackServiceTopic-High,8] on broker 1: Cached zkVersion [10] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2014-05-09 10:16:00,248] INFO Partition [callbackServiceTopic,3] on broker 1: Shrinking ISR for partition [callbackServiceTopic,3] from 1,2,3 to 1 (kafka.cluster.Partition)
[2014-05-09 10:16:00,251] ERROR Conditional update of path /brokers/topics/callbackServiceTopic/partitions/3/state with data {"controller_epoch":4,"leader":1,"version":1,"leader_epoch":4,"isr":[1]} and expected version 9 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/callbackServiceTopic/partitions/3/state (kafka.utils.ZkUtils$)
[2014-05-09 10:16:00,251] INFO Partition [callbackServiceTopic,3] on broker 1: Cached zkVersion [9] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2014-05-09 10:16:00,251] INFO Partition [callbackServiceTopic-High,31] on broker 1: Shrinking ISR for partition [callbackServiceTopic-High,31] from 1,2,3 to 1 (kafka.cluster.Partition)
[2014-05-09 10:16:00,255] ERROR Conditional update of path /brokers/topics/callbackServiceTopic-High/partitions/31/state with data {"controller_epoch":4,"leader":1,"version":1,"leader_epoch":4,"isr":[1]} and expected version 9 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/callbackServiceTopic-High/partitions/31/state (kafka.utils.ZkUtils$)
[2014-05-09 10:16:00,255] INFO Partition [callbackServiceTopic-High,31] on broker 1: Cached zkVersion [9] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2014-05-09 10:16:00,255] INFO Partition [callbackServiceTopic-Low,3] on broker 1: Shrinking ISR for partition [callbackServiceTopic-Low,3] from 1,2,3 to 1 (kafka.cluster.Partition)
[2014-05-09 10:16:00,258] ERROR Conditional update of path /brokers/topics/callbackServiceTopic-Low/partitions/3/state with data {"controller_epoch":4,"leader":1,"version":1,"leader_epoch":4,"isr":[1]} and expected version 9 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/callbackServiceTopic-Low/partitions/3/state (kafka.utils.ZkUtils$)

etc. And these errors continue every few seconds.
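
For readers unfamiliar with the pattern behind these log lines, here is a minimal Python sketch of a version-checked (conditional) write: the update succeeds only if the caller's expected version matches the stored one. This is not Kafka's or ZooKeeper's code. The VersionedNode class and BadVersionError exception are hypothetical stand-ins, and the payloads are trimmed. It only illustrates why a writer holding a stale cached version (9 here) keeps being rejected once another writer has bumped the version.

```python
class BadVersionError(Exception):
    """Raised when the expected version does not match the stored one."""


class VersionedNode:
    """Hypothetical in-memory stand-in for a znode with a data version."""

    def __init__(self, data, version=0):
        self.data = data
        self.version = version

    def conditional_set(self, new_data, expected_version):
        # Write only if the caller's view of the version is current.
        if expected_version != self.version:
            raise BadVersionError(
                f"expected {expected_version}, actual {self.version}")
        self.data = new_data
        self.version += 1
        return self.version


# Broker 1 cached the version while the node was still at 9.
node = VersionedNode('{"leader":1,"isr":[1,2,3]}', version=9)
cached_by_broker_1 = 9

# Another writer (assumed here to be a controller) updates the node first.
node.conditional_set('{"leader":2,"isr":[2,3]}', expected_version=9)

# Broker 1 now tries to shrink the ISR with its stale cached version.
try:
    node.conditional_set('{"leader":1,"isr":[1]}', cached_by_broker_1)
    outcome = "updated"
except BadVersionError:
    outcome = "skipped"

print(outcome, node.version)
```

In this sketch the stale writer never refreshes its cached version, so every retry would fail the same way, which would be consistent with the errors repeating every few seconds.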

kafka-topics.sh --describe output:
Topic:callbackServiceTopic-High PartitionCount:50 ReplicationFactor:3 Configs:
Topic: callbackServiceTopic-High Partition: 0 Leader: 2 Replicas: 3,1,2 Isr: 2,3
Topic: callbackServiceTopic-High Partition: 1 Leader: 2 Replicas: 1,2,3 Isr: 2,3
Topic: callbackServiceTopic-High Partition: 2 Leader: 2 Replicas: 2,3,1 Isr: 2,3
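
With 50 partitions per topic it is easier to check this with a script than by eye. The following is a rough Python sketch that reads --describe output in the format shown above and reports which replicas are missing from each partition's ISR. The function name missing_from_isr and the regular expression are my own, and they assume the line layout stays exactly as pasted here.

```python
import re

# Sample lines copied from the kafka-topics.sh --describe output above.
describe_output = """\
Topic: callbackServiceTopic-High Partition: 0 Leader: 2 Replicas: 3,1,2 Isr: 2,3
Topic: callbackServiceTopic-High Partition: 1 Leader: 2 Replicas: 1,2,3 Isr: 2,3
Topic: callbackServiceTopic-High Partition: 2 Leader: 2 Replicas: 2,3,1 Isr: 2,3
"""

# Matches one per-partition line; the topic summary line does not match.
LINE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(-?\d+)"
    r"\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)")


def missing_from_isr(text):
    """Return {(topic, partition): sorted replica ids absent from the ISR}."""
    result = {}
    for m in LINE.finditer(text):
        topic, partition = m.group(1), int(m.group(2))
        replicas = {int(x) for x in m.group(4).split(",") if x}
        isr = {int(x) for x in m.group(5).split(",") if x}
        lagging = sorted(replicas - isr)
        if lagging:
            result[(topic, partition)] = lagging
    return result


report = missing_from_isr(describe_output)
for (topic, partition), brokers in sorted(report.items()):
    print(f"{topic}-{partition}: not in ISR: {brokers}")
```

For the three lines above, every partition reports broker 1 as the only replica outside the ISR.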

I went and looked at one of those ZNodes in the zkCLI and found this:

[zk: localhost:2181(CONNECTED) 2] get /brokers/topics/callbackServiceTopic-High/partitions/31/state
{"controller_epoch":5,"leader":2,"version":1,"leader_epoch":5,"isr":[2,3]}
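
To make the mismatch explicit, here is a small Python sketch that diffs the state broker 1 tried to write for this partition (taken from the ERROR log line above) against what the znode actually holds. Both JSON strings are copied verbatim from this message; the variable names are only illustrative.

```python
import json

# State broker 1 tried to write (copied from the ERROR log line above).
attempted = json.loads(
    '{"controller_epoch":4,"leader":1,"version":1,"leader_epoch":4,"isr":[1]}')
# State actually stored in the znode (copied from the zkCli output above).
stored = json.loads(
    '{"controller_epoch":5,"leader":2,"version":1,"leader_epoch":5,"isr":[2,3]}')

# Collect every field whose value differs between the two views.
diff = {k: (attempted[k], stored[k])
        for k in attempted if attempted[k] != stored[k]}

for field, (mine, theirs) in diff.items():
    print(f"{field}: broker 1 has {mine}, ZooKeeper has {theirs}")
```

Every field differs except "version", which is 1 on both sides. So the version mismatch reported in the logs presumably refers to something other than this JSON field, such as the znode's own data version, which the output pasted above does not include.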

What does the version number there represent and how does it get out of sync? Should I restart broker 1? Is the fact that broker 1 is behind in leader_epoch significant?

Still trying to figure out Kafka operations :(
—Ian

 
On Apr 24, 2014, at 9:26 PM, 陈小军 <[EMAIL PROTECTED]> wrote: