Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wiggle: adapt Wiggle guest slices for unsafe shared use #5229

Merged

Conversation

abrown
Copy link
Contributor

@abrown abrown commented Nov 8, 2022

When multiple threads can concurrently modify a WebAssembly shared memory, the underlying data for a Wiggle GuestSlice and GuestSliceMut could change due to access from other threads. This breaks Rust guarantees when &[T] and &mut [T] slices are handed out. This change modifies GuestPtr to make as_slice and as_slice_mut return an Option which is None when the underlying WebAssembly memory is shared.

But WASI implementations still need access to the underlying WebAssembly memory, both to read to it and write from it. This change adds new APIs:

  • GuestPtr::to_vec copies the bytes from WebAssembly memory (from which we can safely take a &[T])
  • GuestPtr::as_unsafe_slice_mut returns a wrapper struct from which we can unsafe-ly return a mutable slice (users must accept the unsafety of concurrently modifying a &mut [T])

This approach allows us to maintain Wiggle's borrow-checking infrastructure, which enforces the guarantee that Wiggle will not modify overlapping regions, e.g. This is important because the underlying system calls may expect this. Other threads may modify the same underlying region and this is impossible to prevent; at least Wiggle will not be able to do so.

Finally, the changes to Wiggle's API are propagated to all WASI implementations in Wasmtime. For now, code locations that attempt to get a guest slice will panic if the underlying memory is shared. Note that Wiggle is not enabled for shared memory (that will come later in something like #5054), but when it is, these panics will be clear indicators of locations that must be re-implemented in a thread-safe way.

@bjorn3
Copy link
Contributor

bjorn3 commented Nov 8, 2022

Would it make sense to have a method that returns &UnsafeCell<[T]> or &Cell<[T]>?

@github-actions github-actions bot added the wasi Issues pertaining to WASI label Nov 8, 2022
@github-actions
Copy link

github-actions bot commented Nov 8, 2022

Subscribe to Label Action

cc @kubkon

This issue or pull request has been labeled: "wasi"

Thus the following users have been cc'd because of the following labels:

  • kubkon: wasi

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@abrown abrown force-pushed the shmem-in-wiggle-copy-and-unsafe branch from e55f9c3 to 4d6ecb5 Compare November 8, 2022 23:03
@abrown
Copy link
Contributor Author

abrown commented Nov 8, 2022

Would it make sense to have a method that returns &UnsafeCell<[T]> or &Cell<[T]>?

I'm open to it; not sure if all of the functions exposed are necessary, though. Let's see what @alexcrichton and @pchickey think?

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of the Unsafe* wrapper you added here and I left a comment about how ideally I think we could use that to implement the safe bits and bobs.

Otherwise I think it should be able to implement:

impl<T> Deref for UnsafeGuestSlice<'_, T> {
    type Target = [UnsafeCell<T>];
    // ...
}

which could actually be relatively convenient to work with.

crates/wasi-common/src/snapshots/preview_0.rs Outdated Show resolved Hide resolved
crates/wiggle/src/lib.rs Outdated Show resolved Hide resolved
crates/wiggle/src/lib.rs Outdated Show resolved Hide resolved
crates/wiggle/src/lib.rs Outdated Show resolved Hide resolved
crates/wiggle/src/lib.rs Outdated Show resolved Hide resolved
crates/wiggle/src/lib.rs Outdated Show resolved Hide resolved
crates/wiggle/src/lib.rs Outdated Show resolved Hide resolved
@abrown
Copy link
Contributor Author

abrown commented Nov 9, 2022

Otherwise I think it should be able to implement:

impl<T> Deref for UnsafeGuestSlice<'_, T> {
    type Target = [UnsafeCell<T>];
    // ...
}

which could actually be relatively convenient to work with.

Thinking about this more, I don't think accessing T through an UnsafeCell is right: UnsafeCell::get_mut would give us a &mut T and the documentation of that method claims that "this call borrows the UnsafeCell mutably (at compile-time) which guarantees that we possess the only reference," which is incorrect. It's incorrect because UnsafeGuestSlice is pointing to shared memory which could be modified at any time by any WASI thread, outside of the Rust borrowing world. @alexcrichton, what do you think?

@alexcrichton
Copy link
Member

The get_mut method requires &mut self so we won't have to worry about that here since we'll never hand out &mut UnsafeCell<T>. By more-or-less handing out &[UnsafeCell<T>] we're basically saying "here's a 100% valid region of memory in terms of in-bounds, you have access to it, and it's all aligned correctly and whatnot -- but the contents of memory are ruled by other mechanisms"

The only way to interact with the contents of &UnsafeCell<T> is to have unsafe, which is the goal here, so I think it'll work out well. For unshared cases the wrapper methods to go to safe slices will work, and for shared cases it's clear that atomics/raw intrinsics/etc need to get used (and is something we'll document).

@abrown
Copy link
Contributor Author

abrown commented Nov 9, 2022

The get_mut method requires &mut self

Ah, yeah, I see what you mean; this makes more sense. I'm not sure how/when this would be used, though; I guess this would be yet another way to modify a slice besides the unsafe UnsafeGuestSlice::get_mut? (I guess for completeness we could also expose an unsafe UnsafeGuestSlice::get which would return a &[T]). If this isn't going to be used just yet, should we implement this in this PR or wait until it is needed?

@alexcrichton
Copy link
Member

Oh I don't think the Deref impl is really all that important to have, it's just a "might as well" sort of thing. I think though having an interim UnsafeGuestSlice get constructed on the way to a GuestSlice{,Mut} will exercise it and make it useful. Only for niche cases will UnsafeGuestSlice be useful. Note though we could have UnsafeGuestSlice::copy_from_slice as a safe method which only fails due to active borrows.

crates/wiggle/src/lib.rs Outdated Show resolved Hide resolved
crates/wiggle/src/lib.rs Show resolved Hide resolved
crates/wiggle/src/lib.rs Outdated Show resolved Hide resolved
@abrown abrown marked this pull request as ready for review November 9, 2022 23:19
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Nov 10, 2022
This commit is an attempt at improving the safety of using the return
value of the `SharedMemory::data` method. Previously this returned
`*mut [u8]` which, while correct, is unwieldy and unsafe to work with.
The new return value of `&[UnsafeCell<u8>]` has a few advantages:

* The lifetime of the returned data is now connected to the
  `SharedMemory` itself, removing the possibility for a class of errors
  of accidentally using the prior `*mut [u8]` beyond its original lifetime.

* It's not possibly to safely access `.len()` as opposed to requiring an
  `unsafe` dereference before.

* The data internally within the slice is now what retains the `unsafe`
  bits, namely indicating that accessing any memory inside of the
  contents returned is `unsafe` but addressing it is safe.

I was inspired by the `wiggle`-based discussion on bytecodealliance#5229 and felt it
appropriate to apply a similar change here.
crates/wiggle/src/lib.rs Outdated Show resolved Hide resolved
fitzgen pushed a commit that referenced this pull request Nov 10, 2022
This commit is an attempt at improving the safety of using the return
value of the `SharedMemory::data` method. Previously this returned
`*mut [u8]` which, while correct, is unwieldy and unsafe to work with.
The new return value of `&[UnsafeCell<u8>]` has a few advantages:

* The lifetime of the returned data is now connected to the
  `SharedMemory` itself, removing the possibility for a class of errors
  of accidentally using the prior `*mut [u8]` beyond its original lifetime.

* It's not possibly to safely access `.len()` as opposed to requiring an
  `unsafe` dereference before.

* The data internally within the slice is now what retains the `unsafe`
  bits, namely indicating that accessing any memory inside of the
  contents returned is `unsafe` but addressing it is safe.

I was inspired by the `wiggle`-based discussion on #5229 and felt it
appropriate to apply a similar change here.
@alexcrichton alexcrichton enabled auto-merge (squash) November 10, 2022 19:36
When multiple threads can concurrently modify a WebAssembly shared
memory, the underlying data for a Wiggle `GuestSlice` and
`GuestSliceMut` could change due to access from other threads. This
breaks Rust guarantees when `&[T]` and `&mut [T]` slices are handed out.
This change modifies `GuestPtr` to make `as_slice` and `as_slice_mut`
return an `Option` which is `None` when the underlying WebAssembly
memory is shared.

But WASI implementations still need access to the underlying WebAssembly
memory, both to read to it and write from it. This change adds new APIs:
- `GuestPtr::to_vec` copies the  bytes from WebAssembly memory (from
  which we can safely take a `&[T]`)
- `GuestPtr::as_unsafe_slice_mut` returns a wrapper `struct` from which
  we can  `unsafe`-ly return a mutable slice (users must accept the
  unsafety of concurrently modifying a `&mut [T]`)

This approach allows us to maintain Wiggle's borrow-checking
infrastructure, which enforces the guarantee that Wiggle will not modify
overlapping regions, e.g. This is important because the underlying
system calls may expect this. Though other threads may modify the same
underlying region, this is impossible to prevent; at least Wiggle will
not be able to do so.

Finally, the changes to Wiggle's API are propagated to all WASI
implementations in Wasmtime. For now, code locations that attempt to get
a guest slice will panic if the underlying memory is shared. Note that
Wiggle is not enabled for shared memory (that will come later in
something like bytecodealliance#5054), but when it is, these panics will be clear
indicators of locations that must be re-implemented in a thread-safe
way.
@abrown abrown force-pushed the shmem-in-wiggle-copy-and-unsafe branch from a0a836e to a6d8b36 Compare November 10, 2022 20:13
@alexcrichton alexcrichton merged commit 7717d8f into bytecodealliance:main Nov 10, 2022
@abrown abrown deleted the shmem-in-wiggle-copy-and-unsafe branch November 10, 2022 22:25
abrown added a commit to abrown/wasmtime that referenced this pull request Nov 14, 2022
This is an extension of bytecodealliance#5229 for the `&str` and `&mut str` types. As
documented there, we are attempting to maintain Rust guarantees for
slices that Wiggle hands out in the presence of WebAssembly shared
memory, in which case multiple threads could be modifying the underlying
data of the slice.

This change changes the API of `GuestPtr` to return an `Option` which is
`None` when attempting to view the WebAssembly data as a string and the
underlying WebAssembly memory is shared. This reuses the
`UnsafeGuestSlice` structure from bytecodealliance#5229 to do so and appropriately marks
the region as borrowed in Wiggle's manual borrow checker. Each original
call site in this project's WASI implementations is fixed up to `expect`
that a non-shared memory is used.  (Note that I can find no uses of
`GuestStrMut` in the WASI implementations).
alexcrichton pushed a commit that referenced this pull request Nov 14, 2022
* wiggle: adapt Wiggle strings for shared use

This is an extension of #5229 for the `&str` and `&mut str` types. As
documented there, we are attempting to maintain Rust guarantees for
slices that Wiggle hands out in the presence of WebAssembly shared
memory, in which case multiple threads could be modifying the underlying
data of the slice.

This change changes the API of `GuestPtr` to return an `Option` which is
`None` when attempting to view the WebAssembly data as a string and the
underlying WebAssembly memory is shared. This reuses the
`UnsafeGuestSlice` structure from #5229 to do so and appropriately marks
the region as borrowed in Wiggle's manual borrow checker. Each original
call site in this project's WASI implementations is fixed up to `expect`
that a non-shared memory is used.  (Note that I can find no uses of
`GuestStrMut` in the WASI implementations).

* wiggle: make `GuestStr*` containers wrappers of `GuestSlice*`

This change makes it possible to reuse the underlying logic in
`UnsafeGuestSlice` and the `GuestSlice*` implementations to continue to
expose the `GuestStr` and `GuestStrMut` types. These types now are
simple wrappers of their `GuestSlice*` variant. The UTF-8 validation
that distinguished `GuestStr*` now lives in the `TryFrom`
implementations for each type.
abrown added a commit to abrown/wasmtime that referenced this pull request Nov 14, 2022
`wiggle` looks for an exported `Memory` named `"memory"` to use for its
guest slices. This change allows it to use a `SharedMemory` if this is
the kind of memory used for the export.

It is `unsafe` to use shared memory in Wiggle because of broken Rust
guarantees: previously, Wiggle could hand out slices to WebAssembly
linear memory that could be concurrently modified by some other thread.
With the introduction of Wiggle's new `UnsafeGuestSlice` (bytecodealliance#5225, bytecodealliance#5229,
 bytecodealliance#5264), Wiggle should now correctly communicate its guarantees through
its API.
abrown added a commit to abrown/wasmtime that referenced this pull request Nov 15, 2022
`wiggle` looks for an exported `Memory` named `"memory"` to use for its
guest slices. This change allows it to use a `SharedMemory` if this is
the kind of memory used for the export.

It is `unsafe` to use shared memory in Wiggle because of broken Rust
guarantees: previously, Wiggle could hand out slices to WebAssembly
linear memory that could be concurrently modified by some other thread.
With the introduction of Wiggle's new `UnsafeGuestSlice` (bytecodealliance#5225, bytecodealliance#5229,
 bytecodealliance#5264), Wiggle should now correctly communicate its guarantees through
its API.
alexcrichton pushed a commit that referenced this pull request Nov 15, 2022
`wiggle` looks for an exported `Memory` named `"memory"` to use for its
guest slices. This change allows it to use a `SharedMemory` if this is
the kind of memory used for the export.

It is `unsafe` to use shared memory in Wiggle because of broken Rust
guarantees: previously, Wiggle could hand out slices to WebAssembly
linear memory that could be concurrently modified by some other thread.
With the introduction of Wiggle's new `UnsafeGuestSlice` (#5225, #5229,
 #5264), Wiggle should now correctly communicate its guarantees through
its API.
abrown added a commit to abrown/wasmtime that referenced this pull request Dec 9, 2022
Previous Wiggle changes (bytecodealliance#5229, bytecodealliance#5264) made it possible to access slices
of WebAssembly linear memory from Wiggle but limited the interface so
that slices of shared memory could only be accessed either `unsafe`-ly
or via copying (`GuestPtr::to_vec`). This change modifies `fd_write` to
unconditionally copy the bytes to be written before passing them to
Rust's standard library in an `IoSlice`. This is likely not the optimal
solution but enables further `wasi-threads` development. In the future
this commit should probably change to conditionally copy the bytes when
shared memory is detected and expand to fix up all the `expect` errors
of the same kind in `preview_1.rs`.
@abrown abrown added the wasm-proposal:threads Issues related to the WebAssembly threads proposal label Dec 12, 2022
abrown added a commit to abrown/wasmtime that referenced this pull request Dec 13, 2022
Previous Wiggle changes (bytecodealliance#5229, bytecodealliance#5264) made it possible to access slices
of WebAssembly linear memory from Wiggle but limited the interface so
that slices of shared memory could only be accessed either `unsafe`-ly
or via copying (`GuestPtr::to_vec`). This change modifies `fd_write` to
unconditionally copy the bytes to be written before passing them to
Rust's standard library in an `IoSlice`. This is likely not the optimal
solution but enables further `wasi-threads` development. In the future
this commit should probably change to conditionally copy the bytes when
shared memory is detected and expand to fix up all the `expect` errors
of the same kind in `preview_1.rs`.
abrown added a commit to abrown/wasmtime that referenced this pull request Dec 15, 2022
Previous Wiggle changes (bytecodealliance#5229, bytecodealliance#5264) made it possible to access slices
of WebAssembly linear memory from Wiggle but limited the interface so
that slices of shared memory could only be accessed either `unsafe`-ly
or via copying (`GuestPtr::to_vec`). This change modifies `fd_write` to
unconditionally copy the bytes to be written before passing them to
Rust's standard library in an `IoSlice`. This is likely not the optimal
solution but enables further `wasi-threads` development. In the future
this commit should probably change to conditionally copy the bytes when
shared memory is detected and expand to fix up all the `expect` errors
of the same kind in `preview_1.rs`.
abrown added a commit to abrown/wasmtime that referenced this pull request Dec 19, 2022
Previous Wiggle changes (bytecodealliance#5229, bytecodealliance#5264) made it possible to access slices
of WebAssembly linear memory from Wiggle but limited the interface so
that slices of shared memory could only be accessed either `unsafe`-ly
or via copying (`GuestPtr::to_vec`). This change modifies `fd_write` to
unconditionally copy the bytes to be written before passing them to
Rust's standard library in an `IoSlice`. This is likely not the optimal
solution but enables further `wasi-threads` development. In the future
this commit should probably change to conditionally copy the bytes when
shared memory is detected and expand to fix up all the `expect` errors
of the same kind in `preview_1.rs`.
abrown added a commit to abrown/wasmtime that referenced this pull request Dec 20, 2022
Previous Wiggle changes (bytecodealliance#5229, bytecodealliance#5264) made it possible to access slices
of WebAssembly linear memory from Wiggle but limited the interface so
that slices of shared memory could only be accessed either `unsafe`-ly
or via copying (`GuestPtr::to_vec`). This change modifies `fd_write` to
unconditionally copy the bytes to be written before passing them to
Rust's standard library in an `IoSlice`. This is likely not the optimal
solution but enables further `wasi-threads` development. In the future
this commit should probably change to conditionally copy the bytes when
shared memory is detected and expand to fix up all the `expect` errors
of the same kind in `preview_1.rs`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasi Issues pertaining to WASI wasm-proposal:threads Issues related to the WebAssembly threads proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants