Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-13911: Fix the rate window size calculation for edge cases #12184

Merged
merged 1 commit into from
Aug 10, 2022

Conversation

divijvaidya
Copy link
Contributor

Problem

Implementation of connection creation rate quotas in Kafka is dependent on two configurations:
quota.window.num AND quota.window.size.seconds

The minimum possible values of these configuration is 1 as per the documentation. However, when we set 1 as the configuration value, we can hit a situation where rate is calculated as NaN (and hence, leads to exceptions). This specific scenario occurs when an event is recorded at the start of a sample window.

Solution

This patch fixes this edge case by ensuring that the windowSize over which Rate is calculated is at least 1ms (even if it is calculated at the start of the sample window).

Test

Added a unit test which fails before the patch and passes after the patch

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@divijvaidya divijvaidya force-pushed the KAFKA-13911 branch 3 times, most recently from fd62c15 to ceed4a6 Compare May 23, 2022 09:26
@divijvaidya
Copy link
Contributor Author

@ijuma perhaps you might want to take a look at this? The test failures are unrelated to the changes and are known flaky failures.

@divijvaidya
Copy link
Contributor Author

@showuon perhaps you want to take a look at this?

@divijvaidya
Copy link
Contributor Author

@dajac @guozhangwang please review when you get a chance.

@divijvaidya
Copy link
Contributor Author

@tombentley @mimaison Would you like to review this one please?

@dajac
Copy link
Member

dajac commented Jul 13, 2022

@divijvaidya I am sorry but I did not have the time to look into this one yet and I will be away for the next three weeks. I may be able to take a look when I come back.

@jsancio jsancio added the 3.3 label Aug 5, 2022
Copy link
Contributor

@splett2 splett2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@ijuma ijuma merged commit f17928e into apache:trunk Aug 10, 2022
ijuma pushed a commit that referenced this pull request Aug 10, 2022
## Problem
Implementation of connection creation rate quotas in Kafka is dependent on two configurations:
[quota.window.num](https://kafka.apache.org/documentation.html#brokerconfigs_quota.window.num) AND [quota.window.size.seconds](https://kafka.apache.org/documentation.html#brokerconfigs_quota.window.size.seconds)

The minimum possible values of these configuration is 1 as per the documentation. However, when we set 1 as the configuration value, we can hit a situation where rate is calculated as NaN (and hence, leads to exceptions). This specific scenario occurs when an event is recorded at the start of a sample window.

## Solution
This patch fixes this edge case by ensuring that the windowSize over which Rate is calculated is at least 1ms (even if it is calculated at the start of the sample window).

## Test
Added a unit test which fails before the patch and passes after the patch

Reviewers: Ismael Juma <ismael@juma.me.uk>, David Mao <dmao@confluent.io>
ijuma added a commit to franz1981/kafka that referenced this pull request Aug 12, 2022
* apache-github/trunk: (447 commits)
  KAFKA-13959: Controller should unfence Broker with busy metadata log (apache#12274)
  KAFKA-10199: Expose read only task from state updater (apache#12497)
  KAFKA-14154; Return NOT_CONTROLLER from AlterPartition if leader is ahead of controller (apache#12506)
  KAFKA-13986; Brokers should include node.id in fetches to metadata quorum (apache#12498)
  KAFKA-14163; Retry compilation after zinc compile cache error (apache#12507)
  Remove duplicate common.message.* from clients:test jar file (apache#12407)
  KAFKA-13060: Replace EasyMock and PowerMock with Mockito in WorkerGroupMemberTest.java (apache#12484)
  Fix the rate window size calculation for edge cases (apache#12184)
  MINOR: Upgrade gradle to 7.5.1 and bump other build/test dependencies (apache#12495)
  KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode (apache#12487)
  KAFKA-14114: Add Metadata Error Related Metrics
  MINOR: BrokerMetadataSnapshotter must avoid exceeding batch size (apache#12486)
  MINOR: Upgrade mockito test dependencies (apache#12460)
  KAFKA-14144:; Compare AlterPartition LeaderAndIsr before fencing partition epoch (apache#12489)
  KAFKA-14134: Replace EasyMock with Mockito for WorkerConnectorTest (apache#12472)
  MINOR: Update scala version in bin scripts to 2.13.8 (apache#12477)
  KAFKA-14104; Add CRC validation when iterating over Metadata Log Records (apache#12457)
  MINOR: add :server-common test dependency to :storage (apache#12488)
  KAFKA-14107: Upgrade Jetty version for CVE fixes (apache#12440)
  KAFKA-14124: improve quorum controller fault handling (apache#12447)
  ...
@divijvaidya divijvaidya deleted the KAFKA-13911 branch August 22, 2022 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants