-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
io: make EPOLLERR
awake AsyncFd::readable
#4444
Conversation
Currently, if a user is awaiting on a `AsyncFd` becoming readable and we get a polling error (which may happen in some cases, see the referenced issue), the awaiter will never be woken. Ideally, they would be notified of this error in some way. The solution here involves communicating to the `Readiness` object that there was an error, and when that happens, the `Readiness` future will return a synthetic error that boils down to "there was a polling error". If the user wants to find more information about the underlying error, they would have to find some other way (not clear how yet? as pointed out in the issue, for sockets one could use `SO_ERROR`).
tokio/src/io/driver/scheduled_io.rs
Outdated
// should we wakeup all waiters in case of error? I guess that | ||
// depends on the semantics of EPOLLERR but I don't know yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to wake up all of them on error.
// the information is read via the `curr` in the scheduled io | ||
if ready.is_error() { | ||
return Poll::Ready(Err( | ||
std::io::Error::new(std::io::ErrorKind::Other, "Polling error") | ||
)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could get a better error for the user here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in trying to find the underlying cause of the polling error? The issue contained some discussion on that. See #4349 (comment) and #4349 (comment)
The way I see it this interface would just return something along the lines of "Polling error" and then the user of this interface (who knows what kind of FD they are working with) would do whatever is needed to recover the underlying error (using SO_ERROR
for example, if dealing with a socket).
I wish there was an ErrorKind::PollingError
though, but maybe it's just that I don't know enough rust error handling to know how to expose this error to the user of the interface?
// TODO: I'm not sure why this safety claim is true. What about | ||
// it being `Done` prevents it from being shared? As far as I can tell | ||
// the `wake0` method in the scheduledio doesn't care about this being | ||
// Done? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it's a bit unclear, but I think it's correct. How can it transition to Done
while being shared?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me get back to you on that, I have to think through it
I've added some throwaway code to implement this functionality, which I already implemented for `readable()`. I haven't been able to test it yet, but that's the next thing I plan to do. After coming up with a satisfactory testing story, I will think through each of the TODOs one by one
I've implemented the functionality (epollerr awaking waiters) for the My current plan is to find a satisfactory way to test these changes. It doesn't seem immediately obvious how to do it, because the only way I know to trigger a Please let me know if you have any thoughts |
I think I've found a way to test this automatically! By searching
So this looks very promising! I will try this over the following days |
That was wrong. It's true that if the counter's value is 0xfffffffff then polling on the eventfd will return EPOLLERR, but it's very hard to write that value to the eventfd in the first place. According to the man page
I will search for other ways to trigger EPOLLER |
I also include the code for a couple of manual tests I've introduced to verify my changes work as expected. Ideally we would have good automated unit tests but I haven't found reliable ways to reproduce `EPOLLERR` that do not require `root`
Added an early return that simplifies things a bit IMO and simplified an if/else if/else branch below
Poll::Ready(Ok(ReadyEvent { | ||
tick: TICK.unpack(curr) as u8, | ||
ready, | ||
ready: direction.mask(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah this change is not correct. I thought I was simplifying the if/else if/else
but it's not a correct transformation. Will fix
I've addressed a few of the I have not found any way to reliably reproduce an Before investing more in these clunky manual tests, I would like to hear if you have any tips on how to test this. Some ideas that I have:
|
Maybe I can add unit tests in the |
If adding unit tests will allow you to test it, then that's fine. |
(this is still very much WIP) I think I might be able to unit test part of the changes. I've written one such test in https://github.com/tokio-rs/tokio/pull/4444/files#diff-585bbdb06d4f023035fc2520bc464acb733f7f212921936f13c9873798cbfc92R602 , and will keep trying to add more tests over the following days |
Ok. Let me know when you need additional review. |
I'm running into the same issue. And I think an |
I'm seeing the same issue. I've a Since there is no easy workaround, has there been any progress on this PR? |
doing some cross-linking here. In #5566, I've now opened an additional PR that reports error readiness, see #5781. That means that with What it doesn't do yet is interact with the polling functions (e.g. |
I'm going to close this as solved by #5781. But let me know on the issue if I've missed something. |
Ref: #4349
Motivation
Currently, if a user is awaiting on a
AsyncFd
becoming readable and weget a polling error (which may happen in some cases, see the referenced
issue), the awaiter will never be woken. Ideally, they would be notified
of this error in some way.
Solution
The solution here involves communicating to the
Readiness
object thatthere was an error, and when that happens, the
Readiness
future willreturn a synthetic error that boils down to "there was a polling error".
If the user wants to find more information about the underlying error,
they would have to find some other way (not clear how yet? as pointed
out in the issue, for sockets one could use
SO_ERROR
).Open questions
EPOLLERR
does not wake upAsyncFd::readable()
#4349 but that requiressudo
which might not be great for automated testing. It's also kind of an involved setup. I will have to look into this.EPOLLERR
also triggersis_write_closed
in the mio event. I don't know if we should notify writers of an error or just of "write closed".