
Kafka >> mail # user >> Controlled shutdown failure, retry settings


Re: Controlled shutdown failure, retry settings
I've filed: https://issues.apache.org/jira/browse/KAFKA-1108
On Tue, Oct 29, 2013 at 4:29 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> Here's another exception I see during controlled shutdown (this time there
> was not an unclean shutdown problem).  Should I be concerned about this
> exception? Is any data loss possible with this?  This one happened after
> the first "Retrying controlled shutdown after the previous attempt
> failed..." message.  The controlled shutdown subsequently succeeded without
> another retry (but with a few more of these exceptions).
>
> Again, there was no "Remaining partitions to move..." message before the
> retrying message, so I assume the retry happens after an IOException (which
> is not logged in KafkaServer.controlledShutdown).
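As context for the retry behavior discussed above, the broker-side controlled shutdown loop is governed by a few server.properties settings. This is a sketch using the 0.8-era config names; the defaults shown are assumptions and may differ by version:

```properties
# Enable the controlled shutdown handshake when the broker is stopped
controlled.shutdown.enable=true
# How many times the broker retries the handshake before giving up
# and shutting down uncleanly (assumed default: 3)
controlled.shutdown.max.retries=3
# Backoff between retry attempts, in milliseconds (assumed default: 5000)
controlled.shutdown.retry.backoff.ms=5000
```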
>
> 2013-10-29 20:03:31,883  INFO [kafka-request-handler-4] controller.ReplicaStateMachine - [Replica state machine on controller 10]: Invoking state change to OfflineReplica for replicas PartitionAndReplica(mytopic,0,10)
> 2013-10-29 20:03:31,883 ERROR [kafka-request-handler-4] change.logger - Controller 10 epoch 190 initiated state change of replica 10 for partition [mytopic,0] to OfflineReplica failed
> java.lang.AssertionError: assertion failed: Replica 10 for partition [mytopic,0] should be in the NewReplica,OnlineReplica states before moving to OfflineReplica state. Instead it is in OfflineReplica state
>         at scala.Predef$.assert(Predef.scala:91)
>         at kafka.controller.ReplicaStateMachine.assertValidPreviousStates(ReplicaStateMachine.scala:209)
>         at kafka.controller.ReplicaStateMachine.handleStateChange(ReplicaStateMachine.scala:167)
>         at kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:89)
>         at kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:89)
>         at scala.collection.immutable.Set$Set1.foreach(Set.scala:81)
>         at kafka.controller.ReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:89)
>         at kafka.controller.KafkaController$$anonfun$shutdownBroker$4$$anonfun$apply$2.apply(KafkaController.scala:199)
>         at kafka.controller.KafkaController$$anonfun$shutdownBroker$4$$anonfun$apply$2.apply(KafkaController.scala:184)
>         at scala.Option.foreach(Option.scala:121)
>         at kafka.controller.KafkaController$$anonfun$shutdownBroker$4.apply(KafkaController.scala:184)
>         at kafka.controller.KafkaController$$anonfun$shutdownBroker$4.apply(KafkaController.scala:180)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
>         at kafka.controller.KafkaController.shutdownBroker(KafkaController.scala:180)
>         at kafka.server.KafkaApis.handleControlledShutdownRequest(KafkaApis.scala:133)
>         at kafka.server.KafkaApis.handle(KafkaApis.scala:72)
>         at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:42)
>         at java.lang.Thread.run(Thread.java:662)
>
> Jason
>
>
> On Fri, Oct 25, 2013 at 11:51 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
>
>>
>>
>> On Fri, Oct 25, 2013 at 9:16 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
>>
>>>
>>> Unclean shutdown could result in data loss, since you are moving
>>> leadership to a replica that has fallen out of ISR; i.e., its log end
>>> offset is behind the last committed message for this partition.
>>>
>>>
>> But if data is written with 'request.required.acks=-1', no data should be
>> lost, no?  Or will partitions be truncated wholesale after an unclean
>> shutdown?
>>
>>
>>
>>>
>>> Take a look at the packaged log4j.properties file. The controller's
>>> partition/replica state machines and its channel manager which
>>> sends/receives leaderandisr requests/responses to brokers uses a
>>> stateChangeLogger. The replica managers on all brokers also use this