KAFKA-14144: Compare AlterPartition LeaderAndIsr before fencing partition epoch #12489

splett2 · 2022-08-06T15:08:45Z

What

This PR fixes an AlterPartition regression introduced in #12032.

When an AlterPartition request succeeds, the partition epoch gets bumped. In ZK controller mode, the sender also relies on the AlterPartition response to learn the new partition epoch.
If the sender times out the request before a response is sent, the sender will have a stale partition epoch compared to the ZK controller state and will be fenced on subsequent AlterPartition request attempts. The sender will not receive an updated partition epoch until it receives a LeaderAndIsr request for controller-initiated ISR changes.
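
The check order named in the PR title (compare the requested LeaderAndIsr with the controller's state first, fence on partition epoch second) can be sketched as below. This is a minimal illustration under stated assumptions, not the controller's actual code: the `LeaderAndIsr` case class, the `Outcome` values, and `validate` are hypothetical names, and comparing the ISR as an ordered list is a simplification.

```scala
object AlterPartitionSketch {
  // Hypothetical stand-in for the controller's per-partition state.
  final case class LeaderAndIsr(leader: Int, leaderEpoch: Int, isr: List[Int], partitionEpoch: Int)

  sealed trait Outcome
  case object AlreadyApplied extends Outcome // idempotent retry: succeed without bumping the epoch
  case object Fenced extends Outcome         // partition epoch mismatch and a different desired state
  case object Applied extends Outcome        // normal path: apply the change and bump the epoch

  def validate(current: LeaderAndIsr, requested: LeaderAndIsr): Outcome = {
    // Compare everything except the partition epoch.
    val sameDesiredState =
      requested.leader == current.leader &&
        requested.leaderEpoch == current.leaderEpoch &&
        requested.isr == current.isr

    if (sameDesiredState) AlreadyApplied
    else if (requested.partitionEpoch != current.partitionEpoch) Fenced
    else Applied
  }
}
```

With the controller at partition epoch 11 after a change whose response timed out, a retry that still carries epoch 10 but asks for the same leader, leader epoch, and ISR yields `AlreadyApplied` instead of `Fenced`, so the sender can pick up the current partition epoch from the response.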

Testing

The existing AlterPartition idempotency test did not catch this regression because the test uses the current partitionEpoch for the AlterPartition request. We update the test to try the request with various partition epochs.

Committer Checklist (excluded from commit message)

Verify design and implementation
Verify test coverage and CI build status
Verify documentation (including upgrade notes)

hachikuji · 2022-08-08T22:38:32Z

core/src/test/scala/unit/kafka/controller/ControllerIntegrationTest.scala

+    // epoch), expect it to succeed while the partition epoch remains the same
+    sendAndVerifyAlterPartitionResponse(oldLeaderAndIsr.partitionEpoch)
+    sendAndVerifyAlterPartitionResponse(newPartitionEpoch)
+    sendAndVerifyAlterPartitionResponse(newPartitionEpoch + 1)


I do find it a little odd that the partition epoch is ignored completely when the ISR matches the desired state. We do have the leader epoch check, so at least we can ensure that an old leader won't be misled into thinking that its change was successfully applied. How about a case when the request is sent to an old controller? Suppose a scenario like this:

1. Controller A has leader=1, isr=[1,2], partition epoch=10
2. Controller B is elected
3. Leader sends AlterPartition(epoch=10) to B to remove 2 from ISR => partition epoch = 11
4. Leader sends AlterPartition(epoch=11) to A to add 2 back to the ISR => A accepts, but there is no bump

I think this case is ruled out because the leader has to find the new controller and then revert back. The controller epoch probably would catch that case. What if we add a restart between steps 3 and 4? Would it be possible to find the old controller after restarting? Probably not, but I think I'd sleep better if we could at least reject requests where the partition epoch is greater than what the controller has in its cache. Does that make sense?

splett2

Seems fairly reasonable to me.
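
The extra guard suggested in the review (reject a request whose partition epoch is greater than the one in the controller's cache) might be sketched as follows. This is a self-contained illustration, not the merged change: `PartitionState`, the `Outcome` values, and `validate` are hypothetical names, and the error code Kafka actually returns in this case is not stated in this conversation.

```scala
object EpochAheadSketch {
  // Hypothetical stand-in for what a controller caches per partition.
  final case class PartitionState(leaderEpoch: Int, isr: Set[Int], partitionEpoch: Int)

  sealed trait Outcome
  case object RejectedEpochAhead extends Outcome // sender knows a newer epoch than this controller
  case object AlreadyApplied extends Outcome     // same desired state: succeed without a bump
  case object FencedStaleEpoch extends Outcome   // stale epoch and a different desired state
  case object Applied extends Outcome            // epochs match: apply and bump

  def validate(cached: PartitionState, requested: PartitionState): Outcome =
    // Guard first: a request ahead of the cache is never treated as a no-op.
    if (requested.partitionEpoch > cached.partitionEpoch) RejectedEpochAhead
    else if (requested.leaderEpoch == cached.leaderEpoch && requested.isr == cached.isr) AlreadyApplied
    else if (requested.partitionEpoch < cached.partitionEpoch) FencedStaleEpoch
    else Applied
}
```

Applied to step 4 of the scenario, old controller A still caches isr=[1,2] at partition epoch 10, so a request carrying epoch 11 for isr=[1,2] returns `RejectedEpochAhead` rather than being accepted without a bump.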

hachikuji

Thanks for the patch. LGTM.

…ition epoch (#12489)

This PR fixes an AlterPartition regression introduced in #12032

When an AlterPartition request succeeds, the partition epoch gets bumped. In Zk controller mode the sender also relies on the AlterPartition response to be informed of the new partition epoch. If the sender times out the request before a response is sent, the sender will have a stale partition epoch compared to the ZK controller state and will be fenced on subsequent AlterPartition request attempts. The sender will not receive an updated partition epoch until it receives a LeaderAndIsr request for controller-initiated ISR changes.

Reviewers: Jason Gustafson <jason@confluent.io>

Compare AlterPartition LeaderAndIsr before fencing partition epoch

a6da5f5

splett2 force-pushed the KAFKA-14144 branch from bd772f2 to a6da5f5 on August 7, 2022 02:38

hachikuji reviewed Aug 8, 2022

Fence AlterPartition requests with greater partition epochs

77005e3

hachikuji approved these changes Aug 9, 2022

hachikuji merged commit 62c2880 into apache:trunk Aug 9, 2022
