Re: ClientUtils.fetchTopicMetadata reports smaller ISR than ZkUtils.getLeaderIsrAndEpochForPartition
Sorry it's taken so long to reply. The issue went away after I reassigned
partitions, but now it's back.
I haven't checked JMX, because the brokers and ZooKeeper have been
reporting the same ISR for several hours.
Some more details:
The cluster/topic has
5 brokers (1, 4, 5, 7, 8)
15 partitions (0...14)
A single broker, 4, is the one missing from the ISR in every case. For
partitions where 4 is the leader (1, 6, 11), it is present in the ISR. For
partitions where 4 is not the leader (4, 8, 12), it is not present in the
ISR. Here's the output of my tool, showing assignment and ISR:
I haven't seen anything interesting in the logs, but I'm not entirely sure
what to look for. The cluster is currently in this state, and if it goes
like last time, this will persist until I reassign partitions.
What can I do in the meantime to track down the issue?
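
For tracking this while the cluster is in the bad state, one option is to diff
the two views partition by partition. Below is a minimal Python sketch of that
comparison, not the actual monitoring tool. It assumes the ISR from
ClientUtils.fetchTopicMetadata and the ISR from
ZkUtils.getLeaderIsrAndEpochForPartition have already been collected into plain
dicts keyed by partition number. The function names and the sample values are
hypothetical and made up for illustration; they are not output from the
cluster.

```python
from typing import Dict, List, Set


def isr_discrepancies(
    metadata_isr: Dict[int, List[int]],
    zk_isr: Dict[int, List[int]],
) -> Dict[int, Set[int]]:
    """Per partition, return brokers that the ZooKeeper view lists in the
    ISR but the topic-metadata view does not."""
    missing: Dict[int, Set[int]] = {}
    for partition, zk_brokers in zk_isr.items():
        absent = set(zk_brokers) - set(metadata_isr.get(partition, []))
        if absent:
            missing[partition] = absent
    return missing


def brokers_missing_everywhere(discrepancies: Dict[int, Set[int]]) -> Set[int]:
    """Brokers that are absent in every partition that shows a discrepancy."""
    sets = list(discrepancies.values())
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    # Hypothetical sample values, NOT real cluster output.
    metadata_view = {1: [4, 7, 8], 4: [5, 1]}
    zk_view = {1: [4, 7, 8], 4: [5, 4, 1]}

    diff = isr_discrepancies(metadata_view, zk_view)
    for partition in sorted(diff):
        print("partition", partition, "missing from metadata ISR:", sorted(diff[partition]))
    print("missing in every discrepant partition:", sorted(brokers_missing_everywhere(diff)))
```

If the second function keeps returning the same single broker, that points at
one broker's view of the cluster rather than at the topic as a whole.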
On Thu, Dec 5, 2013 at 12:55 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> Do you see any ISR churns on the brokers? You can check the ISR
> expand/shrink rate jmx.
> On Wed, Dec 4, 2013 at 3:53 PM, Ryan Berdeen <[EMAIL PROTECTED]> wrote:
> > I'm working on some monitoring tools for Kafka, and I've seen a couple of
> > clusters get into a state where ClientUtils.fetchTopicMetadata will show
> > that not all replicas are in the ISR.
> > At the same time, ZkUtils.getLeaderIsrAndEpochForPartition will show that
> > all replicas are in the ISR, and
> > the "kafka.server":name="UnderReplicatedPartitions",type="ReplicaManager"
> > MBean will report 0.
> > What's going on? Is there something wrong with my controller, or should I
> > not be paying attention to ClientUtils.fetchTopicMetadata?
> > Thanks,
> > Ryan