Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

X11: Retry Interrupted event polls once #1952

Closed
wants to merge 1 commit into from
Closed

X11: Retry Interrupted event polls once #1952

wants to merge 1 commit into from

Conversation

jfrimmel
Copy link

@jfrimmel jfrimmel commented May 27, 2021

As the description of std::io::ErrorKind::Interrupted states:

Interrupted operations can typically be retried.

If there is a poll, which fails for being interrupted, it may succeed on a retry. Therefore this commit retries such an interrupted operation once. The background is, that entering sleep mode will cause interrupted events, which leads to panics on wake-up. This commit handles that case gracefully.

Fixes #1947.

  • Tested on all platforms changed (only X11 backend affected)
  • Compilation warnings were addressed
  • cargo fmt has been run on this branch
  • cargo doc builds successfully
  • Added an entry to CHANGELOG.md if knowledge of this change could be valuable to users
  • Updated documentation to reflect any user-facing changes, including notes of platform-specific behavior
  • Created or updated an example program if it would help users understand this functionality (not relevant for this bug-fix)
  • Updated feature matrix, if new features were added or implemented (none)

@jfrimmel
Copy link
Author

I don't know, if such a bugfix should be noted in the changelog or documentation. If the answer is yes, I'd be happy to amend this PR.

@ArturKovacs
Copy link
Contributor

Thanks for the PR! Yeah add a changelog entry, but this seems to be good otherwise.

@maroider maroider added DS - x11 C - waiting on maintainer A maintainer must review this code labels May 27, 2021
Copy link
Contributor

@ArturKovacs ArturKovacs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again!

@ArturKovacs
Copy link
Contributor

@francesca64, two web related tests that don't exist anymore are still being required to pass; this prevents PRs from getting merged. The stable test that replaced these is

stable, wasm32-unknown-unknown, windows-latest, web

If I'm not mistaken you can change this under Settings / Branches / Rules / "Require status checks to pass before merging"

CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Member

@maroider maroider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ArturKovacs
Copy link
Contributor

I cancelled the most recent workflow, because it will stall on the non-existing checks anyways.

@maroider maroider added C - waiting on author Waiting for a response or another PR and removed C - waiting on maintainer A maintainer must review this code labels May 28, 2021
@jfrimmel
Copy link
Author

Hi, is there anything missing (except for the CI run)? Or should only the CI be re-started?

@ArturKovacs
Copy link
Contributor

The PR is all good. We are just waiting for Francesca to update the repo settings, and then we can re-start the ci checks.

@maroider

This comment has been minimized.

@ArturKovacs ArturKovacs mentioned this pull request Jun 3, 2021
8 tasks
@francesca64
Copy link
Member

I've updated the branch rules! You should be good to go.

@maroider
Copy link
Member

maroider commented Jun 21, 2021

You may want to rebase, @jfrimmel, as CHANGELOG.md has conflicts now.

As the description of `std::io::ErrorKind::Interrupted` states:
> Interrupted operations can typically be retried.

If there is a poll, which fails for being interrupted, it may succeed on
a retry. Therefore this commit retries such an interrupted operation
once.

The background is, that entering sleep mode will cause interrupted
events, which leads to panics on wake-up. This commit handles that case
gracefully.
@jfrimmel
Copy link
Author

I've rebased onto the current master (i.e. there are no actual code changes)

@hcsch
Copy link

hcsch commented Jun 30, 2021

Looping until the polling succeeds might be a better choice. I recently had an issue with the polling code as it is in v0.25 (and the current master) where debugging a multi-threaded application causes the code to panic due to an interrupt. This happens because the debugger stops all threads when a breakpoint is hit, even outside of the thread the event loop is running in. This seems to reliably interrupts syscalls.

I tried my MCVE from #1972 with your changed code that re-polls once in case of an interrupt and was still able to get a panic due to an unwrapped interrupt error. Admittedly it was a lot harder and probably would happen rarely if ever in the wild, but I'd say it is better to rule it out entirely.

Something like the following should work (I checked that with my example and couldn't get it to panic due to an interrupt, which makes sense):

if !self.event_processor.poll() {
    while let Err(e) = self.poll.poll(&mut events, timeout) {
        match e.kind() {
            std::io::ErrorKind::Interrupted => (), // retry polling
            _ => Err(e).expect("polling failed irrecoverably"),
        }
    }
    events.clear();
}

Using a looping approach like above would fix #1972 for good (at least for this instance of unwrapping io::Result<T>s).

@ArturKovacs
Copy link
Contributor

@hcsch convinced me that it would indeed be better to poll until it returns with something else than "Interrupted". I would also add a debug log message so that if somehow an application gets stuck in an endless polling loop, the developer would know why that happens. Like so:

if !self.event_processor.poll() {
    while let Err(e) = self.poll.poll(&mut events, timeout) {
        match e.kind() {
            // retry polling
            std::io::ErrorKind::Interrupted => debug!("Poll returned 'Interrupted'. Retrying."),
            _ => Err(e).expect("polling failed irrecoverably"),
        }
    }
    events.clear();
}

@hcsch
Copy link

hcsch commented Nov 20, 2021

It seems that with e9d5b20 the code won't panic on interrupts anymore and will retry in the next iteration of the main loop (unlike with what was discussed here, where it would retry immediately). This PR should therefore maybe also be closed (as the issue #1972 was with this commit).

@kchibisov kchibisov closed this Nov 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C - waiting on author Waiting for a response or another PR DS - x11
Development

Successfully merging this pull request may close these issues.

Panic with ControlFlow::Wait when computer goes to sleep on Linux/X11
6 participants