Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka >> mail # user >> Fetch request with correlation id 1171437 from client ReplicaFetcherThread-0-1 on partition [meetme,0] failed due to Leader not local for partition


Copy link to this message
-
Re: Fetch request with correlation id 1171437 from client ReplicaFetcherThread-0-1 on partition [meetme,0] failed due to Leader not local for partition
Just wanted to clarify: the topic.metadata.refresh.interval.ms would apply
to producers - and mainly with ack = 0. (If ack = 1, then a metadata
request would be issued on this exception although even with ack > 0 it is
useful to have the metadata refresh for refreshing information about how
many partitions are available.)

For replica fetchers (Vadim's case) the exceptions would persist for as
long as the new leader for the replica in question is elected. It should
not take too long. When the leader is elected, the controller will send out
an RPC to the new leaders and followers and the above exceptions will go
away.

Also, to answer your question: the "right" way to shutdown an 0.8 cluster
is to use controlled shutdown. That will not eliminate the exceptions, but
they are more for informative purposes and are non-fatal (i.e., the logging
can probably be improved a bit).

On Fri, Jun 28, 2013 at 11:47 AM, David DeMaagd <[EMAIL PROTECTED]>wrote:

> Unless I'm misreading something, that is controlled by the
> topic.metadata.refresh.interval.ms variable (defaults to 10 minutes),
> and I've not seen it run longer than that (unless there was other
> problems besides that going on).
>
> I would check the JMX values for things under
> "kafka.server":type="ReplicaManager",
> particularly UnderReplicatedPartitions and possibly the ISR
> Expand/Shrinks values - those could indicate a problem on the brokers
> that is preventing things from settling down completely.  Might also
> look and see if you are doing any heavy GCs (which can cause zookeeper
> connection issues, which would then complicate the ISR election stuff).
>
> --
> Dave DeMaagd
> [EMAIL PROTECTED] | 818 262 7958
>
> ([EMAIL PROTECTED] - Fri, Jun 28, 2013 at 11:32:42AM -0700)
> > David. What is the expected time frame for the exception to continue? Its
> > an hour has passed since short downtime and I still see the exception in
> > kafka service logs.
> >
> > Thanks,
> > Vadim
> >
> >
> > On Fri, Jun 28, 2013 at 11:25 AM, David DeMaagd <[EMAIL PROTECTED]
> >wrote:
> >
> > > Getting kafka.common.NotLeaderForPartitionException for a time after a
> > > node is brought back on line (especially if it is a short downtime) is
> > > normal - that is because the consumers have not yet completely picked
> up
> > > the new leader information.  If should settle shortly.
> > >
> > > --
> > > Dave DeMaagd
> > > [EMAIL PROTECTED] | 818 262 7958
> > >
> > > ([EMAIL PROTECTED] - Fri, Jun 28, 2013 at 11:08:46AM -0700)
> > > > I want to clarify that I restarted only one kafka node, all others
> were
> > > > running and did not require restart
> > > >
> > > >
> > > > On Fri, Jun 28, 2013 at 10:57 AM, Vadim Keylis <
> [EMAIL PROTECTED]
> > > >wrote:
> > > >
> > > > > Good morning. I have a cluster of 3 kafka nodes. They were both
> > > running at
> > > > > the time. I need it to make configuration change in the property
> file
> > > and
> > > > > restart kafka. I have not broker shutdown tool, but simple used
> pkill
> > > -TERM
> > > > > -u ${KAFKA_USER} -f kafka.Kafka. That suddenly cause the
>  exception.
> > > How to
> > > > > avoid this issue in the future? What's the right way to shutdown
> kafka
> > > to
> > > > > prevent Not Leder Exception
> > > > >
> > > > > Thanks so much in advance,
> > > > > Vadim
> > > > >
> > > > >
> > > > >
> > > > > [2013-06-28 10:46:53,281] WARN [KafkaApi-1] Fetch request with
> > > correlation
> > > > > id 1171435 from client ReplicaFetcherThread-0-1 on partition
> [meetme,0]
> > > > > failed due to Leader not local for partition [meetme,0] on broker 1
> > > > > (kafka.server.KafkaApis)
> > > > > [2013-06-28 10:46:53,282] WARN [KafkaApi-1] Fetch request with
> > > correlation
> > > > > id 1171436 from client ReplicaFetcherThread-0-1 on partition
> [meetme,0]
> > > > > failed due to Leader not local for partition [meetme,0] on broker 1
> > > > > (kafka.server.KafkaApis)
> > > > > [2013-06-28 10:46:53,448] WARN [ReplicaFetcherThread-0-2], error

 
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB