Skip to content

KAFKA-13474: Allow reconfiguration of SSL certs for broker to controller connection#12381

Merged
showuon merged 3 commits intoapache:trunkfrom
divijvaidya:KAFKA-13474
Jul 9, 2022
Merged

KAFKA-13474: Allow reconfiguration of SSL certs for broker to controller connection#12381
showuon merged 3 commits intoapache:trunkfrom
divijvaidya:KAFKA-13474

Conversation

@divijvaidya
Copy link
Member

Scenario
The scenario is explained in details in https://issues.apache.org/jira/browse/KAFKA-13474 but as a summary:
When a certificate is rotated on a broker via dynamic configuration and the previous certificate expires, the broker to controller connection starts failing with SSL Handshake failed.

Why
A similar fix was earlier performed in #6721 but when BrokerToControllerChannelManager was introduced in v2.7, we didn't enable dynamic reconfiguration for it's channel.

Summary of testing strategy (including rationale)
Add a test which fails prior to the fix done in the PR and succeeds afterwards. The bug wasn't caught earlier because there was no test coverage to validate the scenario.

Note: I would suggest that we backport this fix to all versions until 2.7

Copy link
Member

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@divijvaidya , this is an important issue IMO. Thanks for the fix. Overall LGTM. Left some comments about test.

@showuon
Copy link
Member

showuon commented Jul 7, 2022

Sorry, looks like the match case still needs default case:

[Error] /home/jenkins/workspace/Kafka_kafka-pr_PR-12381/core/src/main/scala/kafka/server/BrokerToControllerChannelManager.scalaː190ː7: match may not be exhaustive.

@divijvaidya
Copy link
Member Author

Sorry, looks like the match case still needs default case:

[Error] /home/jenkins/workspace/Kafka_kafka-pr_PR-12381/core/src/main/scala/kafka/server/BrokerToControllerChannelManager.scalaː190ː7: match may not be exhaustive.

My bad Luke! I did a hasty change yesterday without verifying the build. I have fixed it now.

I appreciate your quick responses on this PR. If you get time please take a look at my other open PRs as well: https://github.com/apache/kafka/pulls/divijvaidya

@divijvaidya
Copy link
Member Author

@showuon please take a look again when you get a chance. The test failures are unrelated.

Copy link
Member

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the fix!

@showuon showuon merged commit fc6e91e into apache:trunk Jul 9, 2022
showuon pushed a commit that referenced this pull request Jul 9, 2022
…ler connection (#12381)

What:
When a certificate is rotated on a broker via dynamic configuration and the previous certificate expires, the broker to controller connection starts failing with SSL Handshake failed.

Why:
A similar fix was earlier performed in #6721 but when BrokerToControllerChannelManager was introduced in v2.7, we didn't enable dynamic reconfiguration for it's channel.

Summary of testing strategy (including rationale)
Add a test which fails prior to the fix done in the PR and succeeds afterwards. The bug wasn't caught earlier because there was no test coverage to validate the scenario.

Reviewers: Luke Chen <showuon@gmail.com>
@showuon
Copy link
Member

showuon commented Jul 9, 2022

back port back to 3.2 branch.

@divijvaidya
Copy link
Member Author

Hey @showuon
How do we make the decision on what version do we want to backport a bug to? This bug exists in versions >= 2.7.

showuon pushed a commit that referenced this pull request Jul 12, 2022
…ler connection (#12381)

What:
When a certificate is rotated on a broker via dynamic configuration and the previous certificate expires, the broker to controller connection starts failing with SSL Handshake failed.

Why:
A similar fix was earlier performed in #6721 but when BrokerToControllerChannelManager was introduced in v2.7, we didn't enable dynamic reconfiguration for it's channel.

Summary of testing strategy (including rationale)
Add a test which fails prior to the fix done in the PR and succeeds afterwards. The bug wasn't caught earlier because there was no test coverage to validate the scenario.

Reviewers: Luke Chen <showuon@gmail.com>
@showuon
Copy link
Member

showuon commented Jul 12, 2022

@divijvaidya , usually we backported to the previous version only since patch release usually has one only. I just backported back to 3.1 branch, too. It took me some time to fix the conflict and make sure test works. If you think it should backport to versions >= 2.7, welcome to submit PRs against each version, I'll help review and merge them. Thanks.

ijuma added a commit to confluentinc/kafka that referenced this pull request Aug 3, 2022
> $ git merge-base apache-github/3.3 apache-github/trunk
> 23c92ce

> $ git show 23c92ce
> commit 23c92ce
> Author: SC <pch838811@gmail.com>
> Date:   Mon Jul 11 11:36:56 2022 +0900
>
>    MINOR: Use String#format for niceMemoryUnits result (apache#12389)
>
>    Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>

* commit '23c92ce79366e86ca719e5e51c550c27324acd83':
  MINOR: Use String#format for niceMemoryUnits result (apache#12389)
  KAFKA-14055; Txn markers should not be removed by matching records in the offset map (apache#12390)
  KAFKA-13474: Allow reconfiguration of SSL certs for broker to controller connection (apache#12381)
  KAFKA-13996: log.cleaner.io.max.bytes.per.second can be changed dynamically (apache#12296)
  KAFKA-13983: Fail the creation with "/" in resource name in zk ACL (apache#12359)
  KAFKA-12943: update aggregating documentation (apache#12091)
  KAFKA-13846: Follow up PR to address review comments (apache#12297)
@divijvaidya divijvaidya deleted the KAFKA-13474 branch September 30, 2022 10:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants