-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add broker reauthentication [KIP-368] #4301
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
1. On broker fail, reset the timer and the reauth flag. 2. rktrans state should be set to NULL after freeing for all the providers. 3. Add a reauth flag to prevent socket_connection_setup_timeout_ms related timeouts which should not occur in reauths 4. Test fixes (timeout)
milindl
force-pushed
the
dev_kip-368-sasl-reauth
branch
from
June 14, 2023 05:39
68868fd
to
318f071
Compare
emasab
reviewed
Jun 14, 2023
Co-authored-by: Emanuele Sabellico <esabellico@confluent.io>
emasab
requested changes
Jun 14, 2023
emasab
approved these changes
Jun 14, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
7 tasks
7 tasks
lpsinger
added a commit
to lpsinger/gcn-kafka-python
that referenced
this pull request
Aug 23, 2023
This version improves reliability of refreshing the authentication of long-lived sessions. See confluentinc/librdkafka#4301.
lpsinger
added a commit
to nasa-gcn/gcn-kafka-python
that referenced
this pull request
Aug 23, 2023
This version improves reliability of refreshing the authentication of long-lived sessions. See confluentinc/librdkafka#4301.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Thanks @vctoriawu for starting this in #3754 .
This builds up on that and changes it as per the comments, and the internal discussion with Magnus and the rest of the team.
Here's how we're doing the reauthentication:
session_lifetime_ms
in the SaslAuthenticate response for a brokerrkb
, start a timer for that broker at 90% of that.rkb
.rd_kafka_broker_op_serve
, and we set max_inflight request to 1, and change the broker state into one of reauth.rd_kafka_broker_thread_main
, we do some cleanup for the preexisting SASL state, and just do Auth exactly the same as the normal way (when we do it the first time around). This takes care of resetting max_inflight to the correct value, too.As the KIP and the discussion in #3754 points out, we can't send anything between the auth requests.
Setting max_inflight to 1 means that only one request may be in flight, and since the Sasl* requests have a high priority in the queue (RD_KAFKA_PRIO_FLASH), they will actually hold the other requests back till authentication is complete. Setting it to 1 also means that any requests already in flight will await responses before the auth sequence starts.
For OAUTHBEARER, the token itself has an expiry time. There are two cases here:
In this case, the session_lifetime_ms in the SaslAuthenticate response is set to the time left for the token's expiry. Since our OAUTHBEARER callback runs at 80% of (time left to token's expiry) and our reauth runs at 90% of (time left to token's expiry), we'll refresh the token before the reauth.
It's somewhat trickier than that, because
next_token_refresh_time := client_time + 0.8*(token_expiry - client_time)
andreauth_time := client_time + 0.9*session_lifetime_ms
. Since session_lifetime_ms is calculated on the server, it might have some drift between the clocks, as well as well as discount the time it takes for the communication between the client/server. But it's expected that even iftoken_expiry - client_time != session_lifetime_ms
, the 0.8/0.9 factors will make up for it, as typical token refresh/re-authentication times are on the order of hours (and not seconds).The broker returns connections.max.reauth.ms as the session_lifetime_ms. Since the token has a later expiry, we just use the same token to reauthenticate.
Adds tests (0142. 0141 is already there in the KIP-881)
Also fixes a bug where while using OAUTHBEARER with the unsecure token would not have the callback queue initialized.
Also fixes several bugs with the interactive broker script which caused it to completely ignore any user-provided config, and I added to the documentation as well.
A new parameter has been added to the interactive broker which accepts a reauth time for the broker. The default value of this has been set to a non-zero number. This will serve to provide some extra coverage in long running tests - it is intentionally so.