Fix hosting.config reload #9046
Merged
ccf7dc0 to
eadf861
Compare
Contributor
|
[approve ci docs] |
Contributor
|
[approve ci autest] |
SolidWallOfCode
previously approved these changes
Sep 26, 2022
Member
SolidWallOfCode
left a comment
I wonder about line 284, though - there doesn't seem to be a provision for a failure on a reload.
randall
reviewed
Oct 6, 2022
eadf861 to
571bc7d
Compare
randall
approved these changes
Oct 17, 2022
Member
Author
|
[approve ci autest] |
zwoop
pushed a commit
that referenced
this pull request
Nov 1, 2022
(cherry picked from commit 026d906)
Contributor
|
Cherry-picked to v9.2.x |
SolidWallOfCode
pushed a commit
to SolidWallOfCode/trafficserver
that referenced
this pull request
Nov 15, 2022
masaori335
pushed a commit
to masaori335/trafficserver
that referenced
this pull request
Feb 21, 2023
* asf/9.2.x:
  Updated ChangeLog
  Add docs for strategies.yaml hash_string (apache#9026)
  Fix hosting.config reloading (apache#9046)
  Remove unnecessary, dangerous casts from SET_HANDLER and SET_CONTINUATION invocations. (apache#9129)
  Remove deprecated ld option (--add-needed) (apache#9141)
masaori335
added a commit
to masaori335/trafficserver
that referenced
this pull request
Mar 7, 2023
This reverts commit 158e25e.
This is an alternative to #8664. That PR is fine in itself and fixes a bug in which the atomic swap does not actually reload. But fixing that bug so that reloading hosting.config works exposes another, deeper bug: the reloaded object has no synchronization around it.
Not only is the pointer itself never read with synchronization; even if it were, the object it points to is never synchronized. So even with an atomic pointer, a caller that needs multiple variables could read them from different objects, when it really needs them all to come from a single object. Worse, if a request takes longer than the arbitrary delete timer, it could access freed memory.
This change synchronizes the object and puts all access to it behind a mutex. The reload swap and bad drive removal acquire a write lock, and all other accesses (none of which modify the object) acquire a read lock.
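As a rough illustration of that locking scheme (not the code in this PR), here is a minimal sketch built on `std::shared_mutex`. `HostTable`, `HostTableHolder`, `snapshot`, and `reload` are hypothetical names invented for the example. The reader holds the shared lock for the whole read, so both fields come from the same object, and the old object is destroyed only while the writer holds the exclusive lock.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the state parsed from hosting.config.
struct HostTable {
  int generation = 0;
  std::string volume;
};

// Hypothetical holder: every access to the table goes through the mutex.
class HostTableHolder {
public:
  // Read path: the shared lock is held for the whole read, so both
  // fields are taken from the same table instance.
  std::pair<int, std::string> snapshot() const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    return {table_->generation, table_->volume};
  }

  // Reload path: the exclusive lock excludes all readers, so the old
  // table can be destroyed immediately, with no delete timer.
  void reload(std::unique_ptr<HostTable> next) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    table_ = std::move(next);
  }

private:
  mutable std::shared_mutex mutex_;
  std::unique_ptr<HostTable> table_ = std::make_unique<HostTable>();
};

// Usage: a reload followed by a consistent read.
inline void reload_example() {
  HostTableHolder holder;
  auto next = std::make_unique<HostTable>();
  next->generation = 1;
  next->volume = "vol1";
  holder.reload(std::move(next));
  auto snap = holder.snapshot();
  assert(snap.first == 1 && snap.second == "vol1");
}
```

Any number of readers can hold the shared lock at once. They block only while a reload or similar write is in progress.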
I really don't like adding a mutex to the request path. But the only other options I see are an incredibly complex and bug-prone lock-free solution, or not making it possible to reload hosting.config. As much as I dislike it, the mutex should never have contention except on reload or drive failure. So it's probably actually faster, with fewer atomic instructions, than even a lock-free solution.
This also adds some helper classes that use RAII idioms to make it impossible to accidentally forget to acquire or release the mutex, or to modify the object while holding only a read lock.
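One way such helpers could look, as a sketch under assumptions rather than the classes this PR adds: `Guarded`, `ReadHandle`, and `WriteHandle` are hypothetical names. Each handle owns its lock for its whole lifetime, and the read handle only hands out `const` access, so mutating through it fails to compile.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical wrapper: the object is reachable only through a handle
// that already owns the matching lock.
template <typename T> class Guarded {
public:
  // Holds a shared lock; exposes the object as const only.
  class ReadHandle {
  public:
    const T *operator->() const { return &obj_; }
    const T &operator*() const { return obj_; }

  private:
    friend class Guarded;
    ReadHandle(const T &obj, std::shared_mutex &m) : lock_(m), obj_(obj) {}
    std::shared_lock<std::shared_mutex> lock_;
    const T &obj_;
  };

  // Holds an exclusive lock; exposes the object as mutable.
  class WriteHandle {
  public:
    T *operator->() { return &obj_; }
    T &operator*() { return obj_; }

  private:
    friend class Guarded;
    WriteHandle(T &obj, std::shared_mutex &m) : lock_(m), obj_(obj) {}
    std::unique_lock<std::shared_mutex> lock_;
    T &obj_;
  };

  ReadHandle read() const { return ReadHandle(obj_, mutex_); }
  WriteHandle write() { return WriteHandle(obj_, mutex_); }

private:
  mutable std::shared_mutex mutex_;
  T obj_;
};

// Usage: each lock is released when its handle goes out of scope.
inline void guarded_example() {
  Guarded<std::vector<int>> disks;
  {
    auto w = disks.write(); // exclusive lock
    w->push_back(7);
  }
  {
    auto r = disks.read(); // shared lock
    assert(r->size() == 1);
    // r->push_back(8);    // would not compile: const access only
  }
}
```

A thread must drop its read handle before asking for a write handle on the same object, since `std::shared_mutex` is not recursive and cannot upgrade a shared lock.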
Recommend backporting to 9.2.
Fixes #7220