-
Notifications
You must be signed in to change notification settings - Fork 14k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-13911: Fix the rate window size calculation for edge cases #12184
Conversation
fd62c15
to
ceed4a6
Compare
@ijuma perhaps you might want to take a look at this? The test failures are unrelated to the changes and are known flaky failures. |
@showuon perhaps you want to take a look at this? |
@dajac @guozhangwang please review when you get a chance. |
@tombentley @mimaison Would you like to review this one please? |
@divijvaidya I am sorry but I did not have the time to look into this one yet and I will be away for the next three weeks. I may be able to take a look when I come back. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks.
## Problem Implementation of connection creation rate quotas in Kafka is dependent on two configurations: [quota.window.num](https://kafka.apache.org/documentation.html#brokerconfigs_quota.window.num) AND [quota.window.size.seconds](https://kafka.apache.org/documentation.html#brokerconfigs_quota.window.size.seconds) The minimum possible values of these configuration is 1 as per the documentation. However, when we set 1 as the configuration value, we can hit a situation where rate is calculated as NaN (and hence, leads to exceptions). This specific scenario occurs when an event is recorded at the start of a sample window. ## Solution This patch fixes this edge case by ensuring that the windowSize over which Rate is calculated is at least 1ms (even if it is calculated at the start of the sample window). ## Test Added a unit test which fails before the patch and passes after the patch Reviewers: Ismael Juma <ismael@juma.me.uk>, David Mao <dmao@confluent.io>
* apache-github/trunk: (447 commits) KAFKA-13959: Controller should unfence Broker with busy metadata log (apache#12274) KAFKA-10199: Expose read only task from state updater (apache#12497) KAFKA-14154; Return NOT_CONTROLLER from AlterPartition if leader is ahead of controller (apache#12506) KAFKA-13986; Brokers should include node.id in fetches to metadata quorum (apache#12498) KAFKA-14163; Retry compilation after zinc compile cache error (apache#12507) Remove duplicate common.message.* from clients:test jar file (apache#12407) KAFKA-13060: Replace EasyMock and PowerMock with Mockito in WorkerGroupMemberTest.java (apache#12484) Fix the rate window size calculation for edge cases (apache#12184) MINOR: Upgrade gradle to 7.5.1 and bump other build/test dependencies (apache#12495) KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode (apache#12487) KAFKA-14114: Add Metadata Error Related Metrics MINOR: BrokerMetadataSnapshotter must avoid exceeding batch size (apache#12486) MINOR: Upgrade mockito test dependencies (apache#12460) KAFKA-14144:; Compare AlterPartition LeaderAndIsr before fencing partition epoch (apache#12489) KAFKA-14134: Replace EasyMock with Mockito for WorkerConnectorTest (apache#12472) MINOR: Update scala version in bin scripts to 2.13.8 (apache#12477) KAFKA-14104; Add CRC validation when iterating over Metadata Log Records (apache#12457) MINOR: add :server-common test dependency to :storage (apache#12488) KAFKA-14107: Upgrade Jetty version for CVE fixes (apache#12440) KAFKA-14124: improve quorum controller fault handling (apache#12447) ...
Problem
Implementation of connection creation rate quotas in Kafka is dependent on two configurations:
quota.window.num AND quota.window.size.seconds
The minimum possible values of these configuration is 1 as per the documentation. However, when we set 1 as the configuration value, we can hit a situation where rate is calculated as NaN (and hence, leads to exceptions). This specific scenario occurs when an event is recorded at the start of a sample window.
Solution
This patch fixes this edge case by ensuring that the windowSize over which Rate is calculated is at least 1ms (even if it is calculated at the start of the sample window).
Test
Added a unit test which fails before the patch and passes after the patch
Committer Checklist (excluded from commit message)