-
Notifications
You must be signed in to change notification settings - Fork 226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: properly close fd when return-fd op cancelled failed #318
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ihciah
force-pushed
the
fix-lifecycle
branch
5 times, most recently
from
October 30, 2024 14:50
5ed4fd3
to
7f00dd4
Compare
ihciah
force-pushed
the
fix-lifecycle
branch
3 times, most recently
from
October 31, 2024 08:37
b2e8dbe
to
221708e
Compare
Lzzzzzt
requested changes
Nov 5, 2024
Lzzzzzt
approved these changes
Nov 5, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
har23k
approved these changes
Nov 6, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What Problem This PR Fixed
In a readiness-based synchronous non-blocking I/O model, canceling a pending I/O call is side-effect-free because the syscall hasn’t actually been executed. In other words, the syscall’s execution state is definitive: it has either been executed or not, with no intermediate state (such as “in execution”).
However, with asynchronous I/O like io-uring, canceling I/O and executing I/O are no longer mutually exclusive. When canceling an I/O operation, it may either be successfully canceled or already completed.
Previously, we used traits like AsyncReadRent/AsyncWriteRent, which transfer buffer ownership, to address buffer memory safety. When a Future is dropped (indicating I/O cancellation), the buffer is transferred to a global state and only cleaned up when the corresponding CQE returns.
I also designed a CancellableIO trait, which works by first pushing a CancelOp when canceling cancellable I/O and then awaiting the original Op’s CQE, expecting it to return with either a cancellation success or the syscall result within a short timeframe.
For standard I/O interfaces, when cancellation occurs, the runtime will push a CancelOp. However, this alone is insufficient: CancelOp does not guarantee that the Op is canceled. If the Op returns a file descriptor (fd), we cannot ignore the possibility of a failed cancellation. Similar to the previous solutions, we need to await the Op result and close its corresponding fd.
Solution
Once we get fd from the kernel, convert it to a wrapped type instantly, and this wrapped type can guarntee the fd is closed when dropped.
How to know if the result in the CQE is fd? We can only know it with
user_data
. Since a slab lookup is essential, saving a is_fd mark inside is the cheapest way I can think about. One may think about putting logic inside the op related bottom half, but in this way fd may leak if the future not be polled. We can only wrap it with RAII when callingtick
, which means a runtime cost is unavoidable.Affected Cases
Accept and Open op are affected. If users use these ops with timeout(note the timeout may also allpied to the outer layer future), or with any style which not poll the op until ready, and use io_uring as the backend, then they may be affected, there may be fd leaks when the future cancelled.
epoll/kqueue/iocp backends are not affected.
Further reading
@ethe wrote an article explains this issue in more detail: Async Rust is not safe with io_uring
Additional Notes
This PR also fixed another minor issue.
Some background:
async-cancel
feature is enabled, the runtime will try to cancel the op by pushing a AsyncCancel op.Problem: The close op will be cancelled because of dropped, which is not what we want.
Solution: So I added a const bool for each op to hint if the cancel is desired(intended leaving without waiting or not want the op executed any more). For Close op, it is false and for others like Read op, it is true.