Kafka >> mail # user >> Fetch request with correlation id 1171437 from client ReplicaFetcherThread-0-1 on partition [meetme,0] failed due to Leader not local for partition


Re: Fetch request with correlation id 1171437 from client ReplicaFetcherThread-0-1 on partition [meetme,0] failed due to Leader not local for partition
Leader election occurs when brokers are bounced or lose their
zookeeper registration. Do you have a state-change.log on your
brokers? Also can you see what's in the following zk paths:
get /brokers/topics/meetme
get /brokers/topics/meetme/partitions/0/state
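
For reference, on an 0.8 cluster those znodes hold small JSON blobs. The
values below are illustrative assumptions (broker ids, epochs, and the zk
address), not output from Vadim's cluster:

# connect with the ZooKeeper CLI, pointing at your ensemble
bin/zkCli.sh -server localhost:2181

get /brokers/topics/meetme
# roughly: {"version":1,"partitions":{"0":[1,2,3]}}
# i.e. partition 0 is assigned to replicas on brokers 1, 2 and 3

get /brokers/topics/meetme/partitions/0/state
# roughly: {"controller_epoch":4,"leader":1,"leader_epoch":7,"isr":[1,2,3],"version":1}
# a leader of -1 here would mean no broker currently leads the partition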

On Fri, Jun 28, 2013 at 1:40 PM, Vadim Keylis <[EMAIL PROTECTED]> wrote:
> Joel. My problem, after your explanation, is that the leader for some
> reason did not get elected, and the exception has been thrown for hours
> now. What is the best way to force leader election for that partition?
>
> Vadim
>
>
> On Fri, Jun 28, 2013 at 12:26 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
>
>> Just wanted to clarify: the topic.metadata.refresh.interval.ms would apply
>> to producers - and mainly with ack = 0. (If ack = 1, then a metadata
>> request would be issued on this exception, although even with ack > 0 the
>> periodic metadata refresh is useful for picking up changes in how many
>> partitions are available.)
>>
>> For replica fetchers (Vadim's case) the exceptions would persist until
>> the new leader for the partition in question is elected. It should
>> not take too long. Once the leader is elected, the controller will send
>> out an RPC to the new leaders and followers and the above exceptions will
>> go away.
>>
>> Also, to answer your question: the "right" way to shut down an 0.8 cluster
>> is to use controlled shutdown. That will not eliminate the exceptions, but
>> they are more for informative purposes and are non-fatal (i.e., the logging
>> can probably be improved a bit).
>>
>>
>>
>> On Fri, Jun 28, 2013 at 11:47 AM, David DeMaagd <[EMAIL PROTECTED]> wrote:
>>
>> > Unless I'm misreading something, that is controlled by the
>> > topic.metadata.refresh.interval.ms variable (defaults to 10 minutes),
>> > and I've not seen it run longer than that (unless there were other
>> > problems going on besides that).
>> >
>> > I would check the JMX values for things under
>> > "kafka.server":type="ReplicaManager",
>> > particularly UnderReplicatedPartitions and possibly the ISR
>> > Expand/Shrinks values - those could indicate a problem on the brokers
>> > that is preventing things from settling down completely.  Might also
>> > look and see if you are doing any heavy GCs (which can cause zookeeper
>> > connection issues, which would then complicate the ISR election stuff).
>> >
>> > --
>> > Dave DeMaagd
>> > [EMAIL PROTECTED] | 818 262 7958
>> >
>> > ([EMAIL PROTECTED] - Fri, Jun 28, 2013 at 11:32:42AM -0700)
>> > > David. What is the expected time frame for the exception to continue?
>> > > It's been an hour since the short downtime and I still see the
>> > > exception in the kafka service logs.
>> > >
>> > > Thanks,
>> > > Vadim
>> > >
>> > >
>> > > On Fri, Jun 28, 2013 at 11:25 AM, David DeMaagd <[EMAIL PROTECTED]> wrote:
>> > >
>> > > > Getting kafka.common.NotLeaderForPartitionException for a time
>> > > > after a node is brought back online (especially after a short
>> > > > downtime) is normal - that is because the consumers have not yet
>> > > > completely picked up the new leader information.  It should settle
>> > > > shortly.
>> > > >
>> > > > --
>> > > > Dave DeMaagd
>> > > > [EMAIL PROTECTED] | 818 262 7958
>> > > >
>> > > > ([EMAIL PROTECTED] - Fri, Jun 28, 2013 at 11:08:46AM -0700)
>> > > > > I want to clarify that I restarted only one kafka node; all the
>> > > > > others were running and did not require a restart.
>> > > > >
>> > > > >
>> > > > > On Fri, Jun 28, 2013 at 10:57 AM, Vadim Keylis <[EMAIL PROTECTED]> wrote:
>> > > > >
>> > > > > > Good morning. I have a cluster of 3 kafka nodes. They were all
>> > > > > > running at the time. I needed to make a configuration change in
>> > > > > > the property file and restart kafka. I didn't have the broker
>> > > > > > shutdown tool, but simply used pkill -TERM -u ${KAFKA_USER} -f
>> > > > > > kafka.Kafka. That suddenly caused the exception.
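
A follow-up for anyone landing on this thread with the same stuck
partition. Vadim's question about forcing a leader is usually answered on
0.8 with the preferred-replica election tool, and the controlled shutdown
Joel mentioned is available as an admin command. A rough sketch - the zk
address, file path, and broker id are placeholder assumptions, and exact
flags can vary between 0.8 releases:

# elect the preferred replica as leader for just [meetme,0]
echo '{"partitions": [{"topic": "meetme", "partition": 0}]}' > /tmp/election.json
bin/kafka-preferred-replica-election.sh \
  --zookeeper localhost:2181 \
  --path-to-json-file /tmp/election.json
# omitting --path-to-json-file runs the election for every partition

# controlled shutdown of broker 2 before a restart (the broker must have
# JMX enabled for this to work)
bin/kafka-run-class.sh kafka.admin.ShutdownBroker \
  --zookeeper localhost:2181 --broker 2

Note the election can only choose among replicas that are alive and
registered in zookeeper, so it will not help while every replica of the
partition is down.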

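Similarly, the ReplicaManager beans David pointed at can be polled without
attaching jconsole, via the JmxTool class that ships with Kafka. Another
sketch - the JMX URL and port 9999 are assumptions (they depend on the
JMX_PORT the broker was started with), and the mbean name follows the
quoted 0.8 style from David's mail:

# poll UnderReplicatedPartitions on a broker that was just restarted
bin/kafka-run-class.sh kafka.tools.JmxTool \
  --jmx-url service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi \
  --object-name '"kafka.server":type="ReplicaManager",name="UnderReplicatedPartitions"'

A value that stays above zero well after the restart suggests the cluster
has not settled and leader/ISR problems are still in play.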
 