recover when OS fails to spawn a new thread #4485
Avoid panicking when the OS reaches its limit on the number of threads / processes and the error is temporary. Spawning a new thread is not mandatory to make progress as long as there is at least one thread in the pool already processing the task queue. Fixes: tokio-rs#2309
I looked at how I could test this, but it left me puzzled...
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
looks good to me!
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
There are some CI failures that you will have to investigate.
Looks like OutOfMemory was added in 1.54 and therefore it fails on 1.49. I confirmed that the fix is working on macOS, and this is a non-issue on Windows since there doesn't seem to be a limit on the number of threads (available memory is the limit). I tried on a VM and I couldn't trigger the bug. About the test, I wonder if something like this would work on the CI:
I wonder how accurate and stable a test like this would be. I doubt it is worth it. Ideas?
OutOfMemory is not available in 1.49 and I can't find a reproducible scenario anyway. The fix works on Windows and macOS, so removing the comment.
It could be worth using
I'm a bit skeptical about running a test like this on CI; it seems potentially flaky. This code seems simple enough that I'm not sure if it's really worth adding such a complex test or not. If we are going to do this, though, I think it's better to lower the thread limit using
I agree it's fine.
I'm very skeptical too, using
If it's fine for you like this, it's fine for me. Thanks for the review.
@hawkw is this OK to merge?
I was hoping to get a second approval from another maintainer, just to make sure I haven't missed anything, but I think it seems good.
No concerns from me.
# 1.18.0 (April 27, 2022)

This release adds a number of new APIs in `tokio::net`, `tokio::signal`, and `tokio::sync`. In addition, it adds new unstable APIs to `tokio::task` (`Id`s for uniquely identifying a task, and `AbortHandle` for remotely cancelling a task), as well as a number of bugfixes.

### Fixed

- blocking: add missing `#[track_caller]` for `spawn_blocking` ([#4616])
- macros: fix `select` macro to process 64 branches ([#4519])
- net: fix `try_io` methods not calling Mio's `try_io` internally ([#4582])
- runtime: recover when OS fails to spawn a new thread ([#4485])

### Added

- macros: support setting a custom crate name for `#[tokio::main]` and `#[tokio::test]` ([#4613])
- net: add `UdpSocket::peer_addr` ([#4611])
- net: add `try_read_buf` method for named pipes ([#4626])
- signal: add `SignalKind` `Hash`/`Eq` impls and `c_int` conversion ([#4540])
- signal: add support for signals up to `SIGRTMAX` ([#4555])
- sync: add `watch::Sender::send_modify` method ([#4310])
- sync: add `broadcast::Receiver::len` method ([#4542])
- sync: add `watch::Receiver::same_channel` method ([#4581])
- sync: implement `Clone` for `RecvError` types ([#4560])

### Changed

- update `nix` to 0.24, limit features ([#4631])
- update `mio` to 0.8.1 ([#4582])
- macros: rename `tokio::select!`'s internal `util` module ([#4543])
- runtime: use `Vec::with_capacity` when building runtime ([#4553])

### Documented

- improve docs for `tokio_unstable` ([#4524])
- runtime: include more documentation for thread_pool/worker ([#4511])
- runtime: update `Handle::current`'s docs to mention `EnterGuard` ([#4567])
- time: clarify platform specific timer resolution ([#4474])
- signal: document that `Signal::recv` is cancel-safe ([#4634])
- sync: `UnboundedReceiver` close docs ([#4548])

### Unstable

The following changes only apply when building with `--cfg tokio_unstable`:

- task: add `task::Id` type ([#4630])
- task: add `AbortHandle` type for cancelling tasks in a `JoinSet` ([#4530], [#4640])
- task: fix missing `doc(cfg(...))` attributes for `JoinSet` ([#4531])
- task: fix broken link in `AbortHandle` RustDoc ([#4545])
- metrics: add initial IO driver metrics ([#4507])

[#4616]: #4616
[#4519]: #4519
[#4582]: #4582
[#4485]: #4485
[#4613]: #4613
[#4611]: #4611
[#4626]: #4626
[#4540]: #4540
[#4555]: #4555
[#4310]: #4310
[#4542]: #4542
[#4581]: #4581
[#4560]: #4560
[#4631]: #4631
[#4582]: #4582
[#4543]: #4543
[#4553]: #4553
[#4524]: #4524
[#4511]: #4511
[#4567]: #4567
[#4474]: #4474
[#4634]: #4634
[#4548]: #4548
[#4630]: #4630
[#4530]: #4530
[#4640]: #4640
[#4531]: #4531
[#4545]: #4545
[#4507]: #4507
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Motivation
As described in #2309, the runtime panics when it tries to spawn a new thread and the OS has reached its limit on the number of threads / processes.
This can be observed on a Linux machine by decreasing the limit with `ulimit -u` and running the program in #2309.
As noted by @Darksonn, there is a pool of threads, and spawning a new one is not mandatory for the program to make progress.
Solution
Ignore temporary errors when the OS fails to spawn a new thread.
The task will then be picked up naturally by the threads already in the pool.
Fixes: #2309
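
The recovery described under Solution can be sketched roughly as follows. This is a minimal standalone illustration and not the code from this PR: `SpawnOutcome`, `is_temporary_spawn_error`, `handle_spawn_result`, `try_spawn_worker`, and the `live_workers` parameter are hypothetical names, and it assumes that a temporary failure from the OS surfaces as `std::io::ErrorKind::WouldBlock` (EAGAIN on Unix).

```rust
use std::io;
use std::thread;

/// Outcome of trying to add one more worker thread (hypothetical type).
#[derive(Debug, PartialEq)]
enum SpawnOutcome {
    /// A new worker thread was started.
    Spawned,
    /// The OS refused for now; existing workers will pick up the task.
    Deferred,
}

/// Assumption: a temporary spawn failure is reported as `WouldBlock`
/// (EAGAIN on Unix). Other error kinds are treated as fatal.
fn is_temporary_spawn_error(err: &io::Error) -> bool {
    err.kind() == io::ErrorKind::WouldBlock
}

/// Decide what to do with the result of a spawn attempt.
///
/// A temporary error is ignored only if at least one worker is already
/// alive (`live_workers > 0`), because otherwise nothing would drain the
/// task queue. Every other error is returned to the caller.
fn handle_spawn_result(
    result: io::Result<thread::JoinHandle<()>>,
    live_workers: usize,
) -> Result<SpawnOutcome, io::Error> {
    match result {
        // Dropping the handle detaches the thread; it keeps running.
        Ok(_handle) => Ok(SpawnOutcome::Spawned),
        Err(ref e) if is_temporary_spawn_error(e) && live_workers > 0 => {
            Ok(SpawnOutcome::Deferred)
        }
        Err(e) => Err(e),
    }
}

/// Try to start one more worker. `thread::Builder::spawn` returns an
/// `io::Result`, unlike `thread::spawn`, which panics on failure.
fn try_spawn_worker(live_workers: usize) -> Result<SpawnOutcome, io::Error> {
    let result = thread::Builder::new()
        .name("pool-worker".to_string())
        .spawn(|| {
            // A real worker would loop over the shared task queue here.
        });
    handle_spawn_result(result, live_workers)
}
```

A caller that wants to keep panicking on unrecoverable failures could still call `expect` on the returned `Result`; only the temporary case with a non-empty pool is allowed through.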