-
Notifications
You must be signed in to change notification settings - Fork 13.9k
std: Fix WASI implementation of remove_dir_all
#146691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std: Fix WASI implementation of remove_dir_all
#146691
Conversation
This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory. The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
rustbot has assigned @Mark-Simulacrum. Use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @alexcrichton!
I assume that once wasip2 has a separate implementation than wasip1 the fix would only be retained for wasip1?
Correct yeah, this'll only affect WASIp1 in the long run. @bors: r=juntyr |
…-buffer, r=juntyr std: Fix WASI implementation of `remove_dir_all` This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory. The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well. The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
Rollup of 8 pull requests Successful merges: - #146229 (Automatically switch to lto-fat when flag RUSTFLAGS="- Zautodiff=Enable" is set) - #146615 (rustc_codegen_llvm: Feature Conversion Tidying) - #146638 (`rustc_next_trait_solver`: canonical out of `EvalCtxt`) - #146663 (Allow windows resource compiler to be overridden) - #146691 (std: Fix WASI implementation of `remove_dir_all`) - #146709 (stdarch subtree update) - #146731 (test: Use SVG for terminal url test) - #146738 (Fix tidy spellchecking on Windows) r? `@ghost` `@rustbot` modify labels: rollup
…-buffer, r=juntyr std: Fix WASI implementation of `remove_dir_all` This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory. The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well. The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
Rollup of 10 pull requests Successful merges: - #146229 (Automatically switch to lto-fat when flag RUSTFLAGS="- Zautodiff=Enable" is set) - #146484 (rustdoc-search: JavaScript optimization based on Firefox Profiler output) - #146541 (std: simplify host lookup) - #146615 (rustc_codegen_llvm: Feature Conversion Tidying) - #146638 (`rustc_next_trait_solver`: canonical out of `EvalCtxt`) - #146663 (Allow windows resource compiler to be overridden) - #146691 (std: Fix WASI implementation of `remove_dir_all`) - #146709 (stdarch subtree update) - #146738 (Fix tidy spellchecking on Windows) - #146740 (miri subtree update) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of #146691 - alexcrichton:wasip1-remove-dir-all-buffer, r=juntyr std: Fix WASI implementation of `remove_dir_all` This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory. The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well. The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
Rollup of 10 pull requests Successful merges: - rust-lang/rust#146229 (Automatically switch to lto-fat when flag RUSTFLAGS="- Zautodiff=Enable" is set) - rust-lang/rust#146484 (rustdoc-search: JavaScript optimization based on Firefox Profiler output) - rust-lang/rust#146541 (std: simplify host lookup) - rust-lang/rust#146615 (rustc_codegen_llvm: Feature Conversion Tidying) - rust-lang/rust#146638 (`rustc_next_trait_solver`: canonical out of `EvalCtxt`) - rust-lang/rust#146663 (Allow windows resource compiler to be overridden) - rust-lang/rust#146691 (std: Fix WASI implementation of `remove_dir_all`) - rust-lang/rust#146709 (stdarch subtree update) - rust-lang/rust#146738 (Fix tidy spellchecking on Windows) - rust-lang/rust#146740 (miri subtree update) r? `@ghost` `@rustbot` modify labels: rollup
…-buffer, r=juntyr std: Fix WASI implementation of `remove_dir_all` This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory. The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well. The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
Rollup of 10 pull requests Successful merges: - rust-lang#146229 (Automatically switch to lto-fat when flag RUSTFLAGS="- Zautodiff=Enable" is set) - rust-lang#146484 (rustdoc-search: JavaScript optimization based on Firefox Profiler output) - rust-lang#146541 (std: simplify host lookup) - rust-lang#146615 (rustc_codegen_llvm: Feature Conversion Tidying) - rust-lang#146638 (`rustc_next_trait_solver`: canonical out of `EvalCtxt`) - rust-lang#146663 (Allow windows resource compiler to be overridden) - rust-lang#146691 (std: Fix WASI implementation of `remove_dir_all`) - rust-lang#146709 (stdarch subtree update) - rust-lang#146738 (Fix tidy spellchecking on Windows) - rust-lang#146740 (miri subtree update) r? `@ghost` `@rustbot` modify labels: rollup
…-buffer, r=juntyr std: Fix WASI implementation of `remove_dir_all` This commit is a change to the WASI-specific implementation of the `std::fs::remove_dir_all` function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into a `Vec` before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1 `fd_readdir` API works to have everything work out in the face of mutations while reading a directory. The basic problem is that `fd_readdir`, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Its `cookie` argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls to `fd_readdir`. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for the `wasm32-wasip2` target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well. The WASIp1 API definitions are effectively "dead" now at the standards level meaning that `fd_readdir` won't be changing nor will a replacement be coming. For the `wasm32-wasip2` target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.
This commit is a change to the WASI-specific implementation of the
std::fs::remove_dir_all
function. Specifically it changes how directory entries are read of a directory-being-deleted to specifically buffer them all into aVec
before actually proceeding to delete anything. This is necessary to fix an interaction with how the WASIp1fd_readdir
API works to have everything work out in the face of mutations while reading a directory.The basic problem is that
fd_readdir
, the WASIp1 API for reading directories, is not a stateful read of a directory but instead a "seekable" read of a directory. Itscookie
argument enables seeking anywhere within the directory at any time to read further entries. Native host implementations do not have this ability, however, which means that this seeking property must be achieved by re-reading the directory. The problem with this is that WASIp1 has under-specified semantics around what should happen if a directory is mutated between two calls tofd_readdir
. In essence there's not really any possible implementation in hosts except to read the entire directory and support seeking through the already-read list. This implementation is not possible in the WASIp1-to-WASIp2 adapter that is primarily used to create components for thewasm32-wasip2
target where it has constrained memory requirements and can't buffer up arbitrarily sized directories. There's some more detailed discussion at bytecodealliance/wasmtime#11701 (comment) as well.The WASIp1 API definitions are effectively "dead" now at the standards level meaning that
fd_readdir
won't be changing nor will a replacement be coming. For thewasm32-wasip2
target this will get fixed once filesystem APIs are updated to use WASIp2 directly instead of WASIp1, making this buffering unnecessary. In essence while this is a hack it's sort of the least invasive thing that works everywhere for now. I don't think this is viable to fix in hosts so guests compiled to wasm are going to have to work around it by not relying on any guarantees about what happens to a directory if it's mutated between reads.