Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka, mail # dev - Review Request 14865: Patch for KAFKA-1097


Copy link to this message
-
Re: Review Request 14865: Patch for KAFKA-1097
Neha Narkhede 2013-10-24, 18:14


> On Oct. 24, 2013, 12:30 a.m., Guozhang Wang wrote:
> > kafka-patch-review.py, line 100
> > <https://reviews.apache.org/r/14865/diff/1/?file=369530#file369530line100>
> >
> >     Should this be included in this RB?

Actually I wanted to improve the tool to print the branch against which the rb is created. But I need to revise the patch, this particular diff to kafka-patch-review.py can be ignored
> On Oct. 24, 2013, 12:30 a.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/cluster/Partition.scala, line 211
> > <https://reviews.apache.org/r/14865/diff/1/?file=369526#file369526line211>
> >
> >     Previously we do not update assignedReplicas map upon receiving the LeaderAndISR request, was that OK?

Yes, since it only changed once when the partition was created and never changed after that. The exception is partition reassignment, increasing replication factor and deleting topic. So we need to fix it
> On Oct. 24, 2013, 12:30 a.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/controller/ReplicaStateMachine.scala, line 137
> > <https://reviews.apache.org/r/14865/diff/1/?file=369528#file369528line137>
> >
> >     The changes in logic of expanding ISR should have already solved the problem. Do we need to do this here?

Yes, we need to let the leader know of the shrunk assigned replica list. So we need the controller to write the ISR and then send the shrunk assigned replica list with the shrunk ISR to the leader.
> On Oct. 24, 2013, 12:30 a.m., Guozhang Wang wrote:
> > core/src/main/scala/kafka/controller/ReplicaStateMachine.scala, line 181
> > <https://reviews.apache.org/r/14865/diff/1/?file=369528#file369528line181>
> >
> >     Ditto as above.

This is required for correctness. Since the controller is actively trying to let the replica to stop, it has to send the stop replica request. It also reduces the probability of the replica unnecessarily trying to fetch from the leader and re-enter ISR
- Neha
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14865/#review27433
-----------------------------------------------------------
On Oct. 23, 2013, 4:34 a.m., Neha Narkhede wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14865/
> -----------------------------------------------------------
>
> (Updated Oct. 23, 2013, 4:34 a.m.)
>
>
> Review request for kafka.
>
>
> Bugs: KAFKA-1097
>     https://issues.apache.org/jira/browse/KAFKA-1097
>
>
> Repository: kafka
>
>
> Description
> -------
>
> KAFKA-1097 Race condition while reassigning low throughput partition leads to incorrect ISR information in zookeeper; The changes include 1) Adding the ISR shrink logic as part of the OfflineReplica -> NonExistentReplica state change 2) Adding a safety check on the broker where it only expands the ISR if the replica is in the assigned replica list 3) Updating the assigned replica list on the broker on every makeLeader request and also on makeFollower request for safety, though that's not strictly required. These changes will ensure that the ISR is shrunk by the controller and the leader has an updated assigned replica list. So even if a replica sends a fetch request after the ISR is shrunk by the controller, the broker will not be able to update the ISR until it receives the next LeaderAndIsrRequest (which notifies it of the latest zkVersion of the partition state path) that also contains the shrunk ISR and assigned replica list. Using that the broker will avoid expanding the ISR i
 f the replica is not present in the new assigned replica list
>
>
> Diffs
> -----
>
>   core/src/main/scala/kafka/cluster/Partition.scala 60f3ed4e88b08d0dcf2ea84259ae6dbe9d7c3a2d
>   core/src/main/scala/kafka/controller/KafkaController.scala 88d130f55997b72a8590e4cfe92857a7320e70d5
>   core/src/main/scala/kafka/controller/ReplicaStateMachine.scala 212c05d65dcdc147e55f90875bacc940e30342bf