Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Since Wasmtime 20, preopens of non-existent directories fail eagerly rather than on first access #8552

Closed
whitequark opened this issue May 4, 2024 · 11 comments · Fixed by #8572
Assignees
Labels
bug Incorrect behavior in the current implementation that needs fixing

Comments

@whitequark
Copy link
Contributor

whitequark commented May 4, 2024

Test Case

The simplest way to reproduce this is to create a Python virtual environment on a Windows machine and install yowasp-yosys in it. This application uses WASI preopens in the following way to make all of the files on the host available:

            for letter in "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ":
                wasi_cfg.preopen_dir(letter + ":\\", letter + ":")

Steps to Reproduce

  • Create a preopen with a non-existent drive letter as a target.

Expected Results

  • As in Wasmtime 19, using the preopen fails when it's actually accessed.

Actual Results

  • In Wasmtime 20, configuring WASI fails.

Versions and Environment

Wasmtime version or commit: 20.0.0

Operating system: Windows

Architecture: x64

Additional information

I can work around this easily enough (though the workaround is kind of cursed) but I'm filing this to make sure the change was intentional. I looked through release notes and PRs that seemed relevant but I'm not sure what has actually caused this change to happen. Is the behavior intentional?

@whitequark whitequark added the bug Incorrect behavior in the current implementation that needs fixing label May 4, 2024
whitequark added a commit to YoWASP/runtime-py that referenced this issue May 4, 2024
Since Wasmtime 20 (see bytecodealliance/wasmtime#8552), preopens of nonexistent drive letters eagerly fail.

This caused all YoWASP tools to become unusable on Windows.
@primoly
Copy link
Contributor

primoly commented May 6, 2024

The preopened_dir method of WasiCtxBuilder was changed in version 20 to return a Result and error when a host path can not be opened.

Version 19.0.2: https://docs.rs/wasmtime-wasi/19.0.2/wasmtime_wasi/struct.WasiCtxBuilder.html#method.preopened_dir

Version 20.0.0: https://docs.rs/wasmtime-wasi/20.0.0/wasmtime_wasi/struct.WasiCtxBuilder.html#method.preopened_dir

Related pull request: #8228

@whitequark
Copy link
Contributor Author

So just to make sure I understood this: the change is indeed intentional and the previous behavior is undesirable?

@primoly
Copy link
Contributor

primoly commented May 6, 2024

Sorry I just noticed that my previous response is not the real reason for the changed behaviour. That PR only changed the method from taking an open directory to taking a path and doing the opening itself.

The actual reason for your issue seems to be #8066 which changed from wasi-common, which stores pre-opens as a collection of paths, to wasmtime-wasi where they are handled as a list of open directories.

I wasn’t involved in any of these changes, so I don’t know if the the developers knew this would break certain things.

Personally I think the wasmtime-wasi behaviour makes more sense, not only because semantically a function which contains the word “open” presumably actually opens a file/dir. I see however how this is annoying in the case of Windows, where there isn’t one root folder that would always be available. Maybe either wasmtime-wasi or cap-std could provide a Windows-specific convenience function which returns a collection of all available drives. 🤔 But that still wouldn’t be able to handle new mounts on the fly. One possibility would be a function that constructs a pseudo root directory which on Windows contains all drive letters as subdirectories and on Unix would actually be /.

I’m sorry those are obviously not direct solution to your problem, I’m just writing down some ideas which could become the base for some future extension to the C API and based on that the Python API as well.

@pchickey pchickey self-assigned this May 6, 2024
@alexcrichton
Copy link
Member

Thanks for the report (as always!). And thanks @primoly for helping to track this down, I agree hat #8066 is what introduced this.

Digging in a bit further it appears there's some subtle interactions which should have been bugs in wasmtime-py's bindings of 19-and-earlier. Currently wasmtime-py's preopen_dir method calls the wasi_config_preopen_dir function in the C API. Notably though the wasmtime-py bindings do not, and have not ever, checked the return value. The function itself returns a bool indicating a success or not.

In Wasmtime 19 the function actually opened a directory and, on failure, returned false. The wasmtime-py bindings accidentally ignored this false value. In Wasmtime 20 the function no longer opens the directory, deferring the error to configuration creation time, where wasmtime-py does indeed check the error.

This I believe gives the impression that Wasmtime 19 was allowing preopening nonexistent directories and failing on access. It was actually failing the whole time but wasmtime-py was accidentally ignoring it. In Wasmtime 20 the logic happened to move to a method where wasmtime-py was checking the return value.


So then we're left with a question of what's the best behavior here. First the wasmtime-py bindings should update to handle the return value of this function no matter what. In my opinion there should be an error returned somewhere during configuration here and we shouldn't commit to the prior behavior of basically ignoring preopen failures. The next question is then when to return the error, either when the preopen is configured or when the context is made.

With wasmtime::Config we've opted to return errors from Engine::new where the Config is validated. This works out well I think because there's often interactions between configuration methods that aren't always obvious on each method in isolation. With that precedent I would lean towards deferring the error to when the wasi config is created as opposed to when the preopen is first registered.

@whitequark
Copy link
Contributor Author

With that precedent I would lean towards deferring the error to when the wasi config is created as opposed to when the preopen is first registered.

I understand the reasoning but this does make the pattern like the following a bit painful:

            for letter in "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ":
                wasi_cfg.preopen_dir(letter + ":\\", letter + ":")

Essentially, this check would be of little use to applications which actually expect an error for a preopen, and they would have to redo all the validation code (as I do now). Is this desired?

@alexcrichton
Copy link
Member

I wouldn't necessarily describe it as a desire to have embedders do more work per se, but it's also a good point that in the case of preopens there's not much interaction with other configuration options so it might make more sense to return an error immediately.

I think I might be misunderstanding something though. My assumption so far has been that this is a question of when the drive is opened and where the failure is then propagated. The stack overflow question you linked to though seems to indicate that just the act of opening a nonexistent drive causes issues and the purpose of the question is to avoid the open entirely. In Wasmtime 19, though, I believe that the drive was opened in Rust and if you didn't see any adverse effects I'm not sure if that stack overflow question is applicable? Do you think though that Python is opening directories a bit differently than was done in cap-std to open a directory?

@whitequark
Copy link
Contributor Author

The stack overflow question you linked to though seems to indicate that just the act of opening a nonexistent drive causes issues and the purpose of the question is to avoid the open entirely.

Oh sorry, I omitted a key bit of context here. I only linked the SO question because the way it solves getting the list of logical drives (the thing I actually want) is by going through ctypes.cdll.kernel32, which is about as weird in Python as adding a direct extern "C" to kernel32 in your Rust code, basically. The equivalent of winapi-rs or windows-rs is unfortunately too heavyweight to depend on in YoWASP/runtime-py (I think it might require MSVC at install time).

The errors people were getting there aren't really relevant for this use case, and the workaround itself is something I decided I'm just going to go ahead and ship after all. It's just kind of weird to do this? Not what you'd expect in idiomatic Python code, so I hesitated.

In the general case of Wasmtime, I'd be worried about e.g. os.access(path) returning True but the preopen failing, in some edge cases, therefore creating not just more work for embedders but work that is very nontrivial to get done, or which gets neglected for the less-used Windows platform.

@pchickey pchickey assigned alexcrichton and unassigned pchickey May 6, 2024
@alexcrichton
Copy link
Member

Ah ok makes sense. To confirm though you had no issues with the Wasmtime 19 version and prior? That I believe does attempt to actually open the directory which was failing and then silently being ignored. If that works then I can work on updating the builders and C API and such for this use case

@whitequark
Copy link
Contributor Author

To confirm though you had no issues with the Wasmtime 19 version and prior?

That's correct, no one had reported any issues and the code above was unconditionally run on Windows.

Thank you for the quick response!

alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue May 7, 2024
This commit updates the C API to use the `WasiCtxBuilder` directly
within `wasi_config_t` instead of buffering up the options separately.
This keeps the behavior of the Rust-based API more similar to the C API
and should also help resolve bytecodealliance#8552 due to errors being returned more
eagerly in the builder-based API.

This additionally makes some minor modifications to the C APIs here as
appropriate.

Close bytecodealliance#8552
@alexcrichton
Copy link
Member

With #8572 the C API should go back to returning errors immediately rather than being "buffered up" and I'll backport that to the 21.0.0 branch. Do you need this fix on the 20.0.0 branch as well? It'll be a bit wonky to backport so if it can be avoided I think that would be best, but if it is causing a headache to not be able to use 20.0.0 we can manage it.

@whitequark
Copy link
Contributor Author

Do you need this fix on the 20.0.0 branch as well? It'll be a bit wonky to backport so if it can be avoided I think that would be best, but if it is causing a headache to not be able to use 20.0.0 we can manage it.

Nope, no need for a backport. The workaround I have committed seems fine for my purposes; whenever I next bump wasmtime version I'll probably change it to use the new API, until then it lets me leave that version constraint relaxed.

github-merge-queue bot pushed a commit that referenced this issue May 7, 2024
* Use WASI builder directly in C API

This commit updates the C API to use the `WasiCtxBuilder` directly
within `wasi_config_t` instead of buffering up the options separately.
This keeps the behavior of the Rust-based API more similar to the C API
and should also help resolve #8552 due to errors being returned more
eagerly in the builder-based API.

This additionally makes some minor modifications to the C APIs here as
appropriate.

Close #8552

* Review comments
alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue May 7, 2024
* Use WASI builder directly in C API

This commit updates the C API to use the `WasiCtxBuilder` directly
within `wasi_config_t` instead of buffering up the options separately.
This keeps the behavior of the Rust-based API more similar to the C API
and should also help resolve bytecodealliance#8552 due to errors being returned more
eagerly in the builder-based API.

This additionally makes some minor modifications to the C APIs here as
appropriate.

Close bytecodealliance#8552

* Review comments
alexcrichton added a commit that referenced this issue May 7, 2024
* Use WASI builder directly in C API

This commit updates the C API to use the `WasiCtxBuilder` directly
within `wasi_config_t` instead of buffering up the options separately.
This keeps the behavior of the Rust-based API more similar to the C API
and should also help resolve #8552 due to errors being returned more
eagerly in the builder-based API.

This additionally makes some minor modifications to the C APIs here as
appropriate.

Close #8552

* Review comments
whitequark added a commit to YoWASP/runtime-py that referenced this issue Jun 25, 2024
Since Wasmtime 20, it is now an error to preopen a directory that is
not accessible by Wasmtime. In Wasmtime 20, the error would happen in
the store initialization code, which made it hard to work around, but
since Wasmtime 21, this happens at the preopen_dir() call. This was
changed in response to bytecodealliance/wasmtime#8552.

Preopens for relative paths are unaffected because if one of parents of
your cwd can't be opened by you you should probably do something about
it. This may be reconsidered in the future (perhaps with a warning) if
there are enough broken systems configured in this way to outweigh
the risk of ending up with certain paths that you can open being not
possible to use with YoWASP tools. (E.g. a -r+x directory is going to
be rejected by Wasmtime, but you could ordinarily use a path that
traverses it.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect behavior in the current implementation that needs fixing
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants