Skip to content

Conversation

alexcrichton
Copy link
Member

This commit is a change to the WASI-specific implementation of the std::fs::remove_dir_all function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a Vec before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 fd_readdir API works to have everything work out in the face of mutations while reading a directory.

The basic problem is that fd_readdir, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its cookie argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to fd_readdir. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the wasm32-wasip2 target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well.

The WASIp1 API definitions are effectively "dead" now at the standards level meaning that fd_readdir won't be changing nor will a replacement be coming. For the wasm32-wasip2 target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.

This commit is a change to the WASI-specific implementation of the
`std::fs::remove_dir_all` function. Specifically it changes how
directory entries are read of a directory-being-deleted to specifically
buffer them all into a `Vec` before actually proceeding to delete
anything. This is necessary to fix an interaction with how the WASIp1
`fd_readdir` API works to have everything work out in the face of
mutations while reading a directory.

The basic problem is that `fd_readdir`, the WASIp1 API for reading
directories, is not a stateful read of a directory but instead a
"seekable" read of a directory. Its `cookie` argument enables seeking
anywhere within the directory at any time to read further entries.
Native host implementations do not have this ability, however, which
means that this seeking property must be achieved by re-reading the
directory. The problem with this is that WASIp1 has under-specified
semantics around what should happen if a directory is mutated between
two calls to `fd_readdir`. In essence there's not really any possible
implementation in hosts except to read the entire directory and support
seeking through the already-read list. This implementation is not
possible in the WASIp1-to-WASIp2 adapter that is primarily used to
create components for the `wasm32-wasip2` target where it has
constrained memory requirements and can't buffer up arbitrarily sized
directories.

The WASIp1 API definitions are effectively "dead" now at the standards
level meaning that `fd_readdir` won't be changing nor will a replacement
be coming. For the `wasm32-wasip2` target this will get fixed once
filesystem APIs are updated to use WASIp2 directly instead of WASIp1,
making this buffering unnecessary. In essence while this is a hack it's
sort of the least invasive thing that works everywhere for now. I don't
think this is viable to fix in hosts so guests compiled to wasm are
going to have to work around it by not relying on any guarantees about
what happens to a directory if it's mutated between reads.
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Sep 17, 2025
@rustbot
Copy link
Collaborator

rustbot commented Sep 17, 2025

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Copy link
Contributor

@juntyr juntyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @alexcrichton!

I assume that once wasip2 has a separate implementation than wasip1 the fix would only be retained for wasip1?

View changes since this review

@alexcrichton
Copy link
Member Author

Correct yeah, this'll only affect WASIp1 in the long run.

@bors: r=juntyr

@bors
Copy link
Collaborator

bors commented Sep 18, 2025

📌 Commit f900758 has been approved by juntyr

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 18, 2025
jhpratt added a commit to jhpratt/rust that referenced this pull request Sep 19, 2025
…-buffer, r=juntyr

std: Fix WASI implementation of `remove_dir_all`

This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory.

The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well.

The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
bors added a commit that referenced this pull request Sep 19, 2025
Rollup of 8 pull requests

Successful merges:

 - #146229 (Automatically switch to lto-fat when flag RUSTFLAGS="- Zautodiff=Enable" is set)
 - #146615 (rustc_codegen_llvm: Feature Conversion Tidying)
 - #146638 (`rustc_next_trait_solver`: canonical out of `EvalCtxt`)
 - #146663 (Allow windows resource compiler to be overridden)
 - #146691 (std: Fix WASI implementation of `remove_dir_all`)
 - #146709 (stdarch subtree update)
 - #146731 (test: Use SVG for terminal url test)
 - #146738 (Fix tidy spellchecking on Windows)

r? `@ghost`
`@rustbot` modify labels: rollup
Zalathar added a commit to Zalathar/rust that referenced this pull request Sep 19, 2025
…-buffer, r=juntyr

std: Fix WASI implementation of `remove_dir_all`

This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory.

The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well.

The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
bors added a commit that referenced this pull request Sep 19, 2025
Rollup of 10 pull requests

Successful merges:

 - #146229 (Automatically switch to lto-fat when flag RUSTFLAGS="- Zautodiff=Enable" is set)
 - #146484 (rustdoc-search: JavaScript optimization based on Firefox Profiler output)
 - #146541 (std: simplify host lookup)
 - #146615 (rustc_codegen_llvm: Feature Conversion Tidying)
 - #146638 (`rustc_next_trait_solver`: canonical out of `EvalCtxt`)
 - #146663 (Allow windows resource compiler to be overridden)
 - #146691 (std: Fix WASI implementation of `remove_dir_all`)
 - #146709 (stdarch subtree update)
 - #146738 (Fix tidy spellchecking on Windows)
 - #146740 (miri subtree update)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 20b6f75 into rust-lang:master Sep 19, 2025
10 checks passed
@rustbot rustbot added this to the 1.92.0 milestone Sep 19, 2025
rust-timer added a commit that referenced this pull request Sep 19, 2025
Rollup merge of #146691 - alexcrichton:wasip1-remove-dir-all-buffer, r=juntyr

std: Fix WASI implementation of `remove_dir_all`

This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory.

The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well.

The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Sep 20, 2025
Rollup of 10 pull requests

Successful merges:

 - rust-lang/rust#146229 (Automatically switch to lto-fat when flag RUSTFLAGS="- Zautodiff=Enable" is set)
 - rust-lang/rust#146484 (rustdoc-search: JavaScript optimization based on Firefox Profiler output)
 - rust-lang/rust#146541 (std: simplify host lookup)
 - rust-lang/rust#146615 (rustc_codegen_llvm: Feature Conversion Tidying)
 - rust-lang/rust#146638 (`rustc_next_trait_solver`: canonical out of `EvalCtxt`)
 - rust-lang/rust#146663 (Allow windows resource compiler to be overridden)
 - rust-lang/rust#146691 (std: Fix WASI implementation of `remove_dir_all`)
 - rust-lang/rust#146709 (stdarch subtree update)
 - rust-lang/rust#146738 (Fix tidy spellchecking on Windows)
 - rust-lang/rust#146740 (miri subtree update)

r? `@ghost`
`@rustbot` modify labels: rollup
Muscraft pushed a commit to Muscraft/rust that referenced this pull request Sep 24, 2025
…-buffer, r=juntyr

std: Fix WASI implementation of `remove_dir_all`

This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory.

The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well.

The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
Muscraft pushed a commit to Muscraft/rust that referenced this pull request Sep 24, 2025
Rollup of 10 pull requests

Successful merges:

 - rust-lang#146229 (Automatically switch to lto-fat when flag RUSTFLAGS="- Zautodiff=Enable" is set)
 - rust-lang#146484 (rustdoc-search: JavaScript optimization based on Firefox Profiler output)
 - rust-lang#146541 (std: simplify host lookup)
 - rust-lang#146615 (rustc_codegen_llvm: Feature Conversion Tidying)
 - rust-lang#146638 (`rustc_next_trait_solver`: canonical out of `EvalCtxt`)
 - rust-lang#146663 (Allow windows resource compiler to be overridden)
 - rust-lang#146691 (std: Fix WASI implementation of `remove_dir_all`)
 - rust-lang#146709 (stdarch subtree update)
 - rust-lang#146738 (Fix tidy spellchecking on Windows)
 - rust-lang#146740 (miri subtree update)

r? `@ghost`
`@rustbot` modify labels: rollup
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this pull request Oct 9, 2025
…-buffer, r=juntyr

std: Fix WASI implementation of `remove_dir_all`

This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory.

The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well.

The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants