-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
process: use blocking threadpool for child stdio I/O #4824
Conversation
4a11f75
to
74a78e6
Compare
#[derive(Debug)] | ||
pub(crate) struct ChildStdio { | ||
// Used for accessing the raw handle, even if the io version is busy | ||
raw: Arc<StdFile>, | ||
// For doing I/O operations asynchronously | ||
io: Blocking<ArcFile>, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like you could drop the Arc
stuff if you just stored the raw handle next to the io
field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::process::Stdio
wants ownership of the handle (and will close it on drop) which might cause issues if the handle is still being written to in the background. Using the Arc allows us to potentially avoid a syscall to duplicate the handle if it doesn't have any pending I/O on it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could implement a try_unwrap
on the Blocking
type to get the same behavior without the Arc
. But I don't mind either way, so I will approve it and let you decide if you want to implement that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did consider adding try_unwrap
to Blocking
, the tricky bit is that it moves the handle to and from the blocking threadpool, so if there is a pending I/O operation we cannot access anything. The conversion from ChildStd*
to Stdio
are synchronous, so we cannot even internally .await
things. Hence the Arc
approach allows us to at least clone the handle if needed
I'm going to leave it as is for now since we can always revisit it later as needed. Thanks for the review!
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [tokio](https://tokio.rs) ([source](https://github.com/tokio-rs/tokio)) | dependencies | minor | `1.20.1` -> `1.21.0` | | [tokio](https://tokio.rs) ([source](https://github.com/tokio-rs/tokio)) | dev-dependencies | minor | `1.20.1` -> `1.21.0` | --- ### Release Notes <details> <summary>tokio-rs/tokio</summary> ### [`v1.21.0`](https://github.com/tokio-rs/tokio/releases/tag/tokio-1.21.0) [Compare Source](tokio-rs/tokio@tokio-1.20.1...tokio-1.21.0) ##### 1.21.0 (September 2, 2022) This release is the first release of Tokio to intentionally support WASM. The `sync,macros,io-util,rt,time` features are stabilized on WASM. Additionally the wasm32-wasi target is given unstable support for the `net` feature. ##### Added - net: add `device` and `bind_device` methods to TCP/UDP sockets ([#​4882]) - net: add `tos` and `set_tos` methods to TCP and UDP sockets ([#​4877]) - net: add security flags to named pipe `ServerOptions` ([#​4845]) - signal: add more windows signal handlers ([#​4924]) - sync: add `mpsc::Sender::max_capacity` method ([#​4904]) - sync: implement Weak version of `mpsc::Sender` ([#​4595]) - task: add `LocalSet::enter` ([#​4765]) - task: stabilize `JoinSet` and `AbortHandle` ([#​4920]) - tokio: add `track_caller` to public APIs ([#​4805], [#​4848], [#​4852]) - wasm: initial support for `wasm32-wasi` target ([#​4716]) ##### Fixed - miri: improve miri compatibility by avoiding temporary references in `linked_list::Link` impls ([#​4841]) - signal: don't register write interest on signal pipe ([#​4898]) - sync: add `#[must_use]` to lock guards ([#​4886]) - sync: fix hang when calling `recv` on closed and reopened broadcast channel ([#​4867]) - task: propagate attributes on task-locals ([#​4837]) ##### Changed - fs: change panic to error in `File::start_seek` ([#​4897]) - io: reduce syscalls in `poll_read` ([#​4840]) - process: use blocking threadpool for child stdio I/O ([#​4824]) - signal: make `SignalKind` methods const ([#​4956]) ##### Internal changes - rt: extract `basic_scheduler::Config` ([#​4935]) - rt: move I/O driver into `runtime` module ([#​4942]) - rt: rename internal scheduler types ([#​4945]) ##### Documented - chore: fix typos and grammar ([#​4858], [#​4894], [#​4928]) - io: fix typo in `AsyncSeekExt::rewind` docs ([#​4893]) - net: add documentation to `try_read()` for zero-length buffers ([#​4937]) - runtime: remove incorrect panic section for `Builder::worker_threads` ([#​4849]) - sync: doc of `watch::Sender::send` improved ([#​4959]) - task: add cancel safety docs to `JoinHandle` ([#​4901]) - task: expand on cancellation of `spawn_blocking` ([#​4811]) - time: clarify that the first tick of `Interval::tick` happens immediately ([#​4951]) ##### Unstable - rt: add unstable option to disable the LIFO slot ([#​4936]) - task: fix incorrect signature in `Builder::spawn_on` ([#​4953]) - task: make `task::Builder::spawn*` methods fallible ([#​4823]) [#​4595]: tokio-rs/tokio#4595 [#​4716]: tokio-rs/tokio#4716 [#​4765]: tokio-rs/tokio#4765 [#​4805]: tokio-rs/tokio#4805 [#​4811]: tokio-rs/tokio#4811 [#​4823]: tokio-rs/tokio#4823 [#​4824]: tokio-rs/tokio#4824 [#​4837]: tokio-rs/tokio#4837 [#​4840]: tokio-rs/tokio#4840 [#​4841]: tokio-rs/tokio#4841 [#​4845]: tokio-rs/tokio#4845 [#​4848]: tokio-rs/tokio#4848 [#​4849]: tokio-rs/tokio#4849 [#​4852]: tokio-rs/tokio#4852 [#​4858]: tokio-rs/tokio#4858 [#​4867]: tokio-rs/tokio#4867 [#​4877]: tokio-rs/tokio#4877 [#​4882]: tokio-rs/tokio#4882 [#​4886]: tokio-rs/tokio#4886 [#​4893]: tokio-rs/tokio#4893 [#​4894]: tokio-rs/tokio#4894 [#​4897]: tokio-rs/tokio#4897 [#​4898]: tokio-rs/tokio#4898 [#​4901]: tokio-rs/tokio#4901 [#​4904]: tokio-rs/tokio#4904 [#​4920]: tokio-rs/tokio#4920 [#​4924]: tokio-rs/tokio#4924 [#​4928]: tokio-rs/tokio#4928 [#​4935]: tokio-rs/tokio#4935 [#​4936]: tokio-rs/tokio#4936 [#​4937]: tokio-rs/tokio#4937 [#​4942]: tokio-rs/tokio#4942 [#​4945]: tokio-rs/tokio#4945 [#​4951]: tokio-rs/tokio#4951 [#​4953]: tokio-rs/tokio#4953 [#​4956]: tokio-rs/tokio#4956 [#​4959]: tokio-rs/tokio#4959 </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about these updates again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, click this checkbox. --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzMi4xODcuMCIsInVwZGF0ZWRJblZlciI6IjMyLjE4Ny4wIn0=--> Co-authored-by: cabr2-bot <cabr2.help@gmail.com> Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1532 Reviewed-by: crapStone <crapstone@noreply.codeberg.org> Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org> Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Fixes #26179. The original error reported in that issue is fixed on canary, but in local testing on my windows machine, `next build` would just hang forever. After some digging, what happens is that at some point in next build, readFile promises (from `fs/promises` ) just never resolve, and so next hangs. It turns out the issue is saturating tokio's blocking task thread pool. We previously limited the number of blocking threads to 32, and at some point those threads are all in use and there's no thread available for the file reads. What's taking up all of those threads? The answer turns out to be `tokio::process`. On windows, child process stdio uses the blocking threadpool: tokio-rs/tokio#4824. When you poll the child's stdio on windows, it spawns a blocking task per poll, and calls `std::io::Read::read` in the blocking context. That call can block until data is available. Putting it all together, what happens is that Next.js spawns `2 * the number of CPU cores` deno child subprocesses to do work. We implement `child_process` with `tokio::process`. When the child processes' stdio get polled, blocking tasks get spawned, and those blocking tasks might block until data is available. So if you have 16 cores (as I do), there are going to be potentially >32 blocking task threadpool threads taken just by the child processes. That leaves no room for other tasks to make progress --- To fix this, for now, increase the size of the blocking threadpool on windows. 4 * the number of CPU cores should be enough to leave room for other tasks to make progress. Longer term, this can be fixed more properly when we handroll our own subprocess code (needed for detached processes and additional pipes on windows).
Fixes #26179. The original error reported in that issue is fixed on canary, but in local testing on my windows machine, `next build` would just hang forever. After some digging, what happens is that at some point in next build, readFile promises (from `fs/promises` ) just never resolve, and so next hangs. It turns out the issue is saturating tokio's blocking task thread pool. We previously limited the number of blocking threads to 32, and at some point those threads are all in use and there's no thread available for the file reads. What's taking up all of those threads? The answer turns out to be `tokio::process`. On windows, child process stdio uses the blocking threadpool: tokio-rs/tokio#4824. When you poll the child's stdio on windows, it spawns a blocking task per poll, and calls `std::io::Read::read` in the blocking context. That call can block until data is available. Putting it all together, what happens is that Next.js spawns `2 * the number of CPU cores` deno child subprocesses to do work. We implement `child_process` with `tokio::process`. When the child processes' stdio get polled, blocking tasks get spawned, and those blocking tasks might block until data is available. So if you have 16 cores (as I do), there are going to be potentially >32 blocking task threadpool threads taken just by the child processes. That leaves no room for other tasks to make progress --- To fix this, for now, increase the size of the blocking threadpool on windows. 4 * the number of CPU cores should be enough to leave room for other tasks to make progress. Longer term, this can be fixed more properly when we handroll our own subprocess code (needed for detached processes and additional pipes on windows).
We can't mix async pipes and child process stdio on Windows. This switches our Windows process implementation to use the blocking threadpool instead of using an async named pipe as we did in the past.
Refs #4802