Kafka >> mail # user >> Instances became unresponsive


Instances became unresponsive
Somehow my Kafka instances keep crashing. I started the Kafka instances
one by one and they all started successfully. Later, two of the 3
instances somehow became completely unresponsive. The process is still
running, but connecting over JMX or taking a heap dump is not possible.
The last one is somewhat responsive.
I am not sure how the servers got into this state. Is there anything I can
monitor to predict that an instance is about to crash? What are the ways to
recover without data loss? What am I doing wrong to get into this state?
Please advise.
I poked around the error logs on the hosts that are not responsive, and here
are the errors I found. One that I have not listed is
LeaderNotFoundException.

The most puzzling errors are the ZooKeeper ones, since ZooKeeper was not
redeployed or updated.
[2013-08-26 12:14:35,357] ERROR [KafkaApi-5] Error while fetching metadata for partition [self_reactivation,0] (kafka.server.KafkaApis)
kafka.common.ReplicaNotAvailableException
        at kafka.server.KafkaApis$$anonfun$17$$anonfun$20.apply(KafkaApis.scala:471)
        at kafka.server.KafkaApis$$anonfun$17$$anonfun$20.apply(KafkaApis.scala:456)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
        at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
        at scala.collection.immutable.List.foreach(List.scala:76)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:233)
in server.log
[2013-08-26 21:00:51,942] ERROR Conditional update of path /brokers/topics/meetme/partitions/12/state with data { "controller_epoch":6, "isr":[ 5 ], "leader":5, "leader_epoch":1, "version":1 } and expected version 2 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/meetme/partitions/12/state (kafka.utils.ZkUtils$)
[2013-08-26 21:00:51,943] INFO Partition [meetme,12] on broker 5: Cached zkVersion [2] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2013-08-26 21:00:51,990] INFO Partition [meetme,4] on broker 5: Shrinking ISR for partition [meetme,4] from 5,4 to 5 (kafka.cluster.Partition)
[2013-08-26 21:00:51,993] ERROR Conditional update of path /brokers/topics/meetme/partitions/4/state with data { "controller_epoch":6, "isr":[ 5 ], "leader":5, "leader_epoch":1, "version":1 } and expected version 2 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/meetme/partitions/4/state (kafka.utils.ZkUtils$)
[2013-08-26 21:00:51,993] INFO Partition [meetme,4] on broker 5: Cached zkVersion [2] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2013-08-26 21:00:52,103] INFO Partition [meetme,6] on broker 5: Shrinking ISR for partition [meetme,6] from 5,4 to 5 (kafka.cluster.Partition)
[2013-08-26 21:00:52,107] ERROR Conditional update of path /brokers/topics/meetme/partitions/6/state with data { "controller_epoch":6, "isr":[ 5 ], "leader":5, "leader_epoch":2, "version":1 } and expected version 3 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/meetme/partitions/6/state (kafka.utils.ZkUtils$)
[2013-08-26 21:00:52,107] INFO Partition [meetme,6] on broker 5: Cached zkVersion [3] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
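
For anyone reading the BadVersion lines above: they look like ZooKeeper's versioned (conditional) write being rejected. A write that carries an expected version fails when the znode's current version is different. Below is a rough Python sketch of that behavior, to show why a stale cached version produces "skip updating ISR". It is only a toy in-memory model, not Kafka or ZooKeeper code: `ToyZNodeStore`, `conditional_update` and `try_shrink_isr` are hypothetical names, and the idea that some other writer bumped the version is an assumption, since the log does not say who changed the znode.

```python
class BadVersionError(Exception):
    """Raised when the expected version does not match the stored version."""


class ToyZNodeStore:
    """In-memory stand-in for versioned znodes (hypothetical, not a real client)."""

    def __init__(self):
        self._nodes = {}  # path -> (data, version)

    def create(self, path, data):
        # New nodes start at version 0, as znodes do in ZooKeeper.
        self._nodes[path] = (data, 0)

    def get(self, path):
        return self._nodes[path]

    def conditional_update(self, path, data, expected_version):
        # Write only if nobody changed the node since the caller last read it.
        _, current_version = self._nodes[path]
        if current_version != expected_version:
            raise BadVersionError(
                f"BadVersion for {path}: expected {expected_version}, "
                f"found {current_version}"
            )
        new_version = current_version + 1
        self._nodes[path] = (data, new_version)
        return new_version


def try_shrink_isr(store, path, new_state, cached_version):
    """Attempt the ISR write; on a version mismatch, skip it and keep the cache."""
    try:
        return True, store.conditional_update(path, new_state, cached_version)
    except BadVersionError:
        return False, cached_version


store = ToyZNodeStore()
path = "/brokers/topics/meetme/partitions/12/state"
store.create(path, {"isr": [5, 4], "leader": 5})
store.conditional_update(path, {"isr": [5, 4], "leader": 5}, 0)
store.conditional_update(path, {"isr": [5, 4], "leader": 5}, 1)
cached_version = 2  # what the broker remembers, as in "Cached zkVersion [2]"

# Assumption: some other writer updates the node, moving it to version 3.
store.conditional_update(path, {"isr": [5, 4], "leader": 5}, 2)

# The broker's write with its stale cached version is rejected and skipped.
updated, version = try_shrink_isr(
    store, path, {"isr": [5], "leader": 5}, cached_version
)
```

In this sketch the broker's cached version is never refreshed after the failure, so every later attempt with the same cached value would be rejected as well. Whether that matches what the real broker does here is something I cannot tell from the log alone.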

 