Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deps: Update wasmtime to v4.0.0 #1412

Closed
wants to merge 7 commits into from
Closed

deps: Update wasmtime to v4.0.0 #1412

wants to merge 7 commits into from

Conversation

magik6k
Copy link
Contributor

@magik6k magik6k commented Jan 10, 2023

Doesn't build yet because error handling hates me

   Compiling fvm v3.0.0-alpha.17 (/home/magik6k/github.com/filecoin-project/ref-fvm/fvm)
error[E0277]: `?` couldn't convert the error to `Abort`
   --> fvm/src/call_manager/default.rs:598:23
    |
598 |                 Ok(res?)
    |                       ^ the trait `From<anyhow::Error>` is not implemented for `Abort`
    |
    = note: the question mark operation (`?`) implicitly performs a conversion on the error value using the `From` trait
    = help: the trait `From<wasmtime::Trap>` is implemented for `Abort`
    = note: required for `std::result::Result<u32, Abort>` to implement `FromResidual<std::result::Result<Infallible, anyhow::Error>>`

error[E0277]: the trait bound `wasmtime::Trap: From<Box<dyn StdError + Sync + std::marker::Send>>` is not satisfied
  --> fvm/src/syscalls/error.rs:56:20
   |
56 |         Trap::from(Box::new(Envelope::wrap(a)) as Box<dyn std::error::Error + Send + Sync + 'static>)
   |         ---------- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `From<Box<dyn StdError + Sync + std::marker::Send>>` is not implemented for `wasmtime::Trap`
   |         |
   |         required by a bound introduced by this call
   |
   = help: the trait `From<Abort>` is implemented for `wasmtime::Trap`

error[E0599]: no method named `trap_code` found for enum `wasmtime::Trap` in the current scope
  --> fvm/src/syscalls/error.rs:66:31
   |
66 |         if let Some(code) = t.trap_code() {
   |                               ^^^^^^^^^ method not found in `wasmtime::Trap`

error[E0277]: the trait bound `Abort: StdError` is not satisfied
   --> fvm/src/syscalls/bind.rs:126:53
    |
126 |                         charge_for_exec(&mut caller)?;
    |                                                     ^ the trait `StdError` is not implemented for `Abort`
...
196 | impl_bind_syscalls!();
    | --------------------- in this macro invocation
    |
    = help: the following other types implement trait `FromResidual<R>`:
              <std::result::Result<T, F> as FromResidual<Yeet<E>>>
              <std::result::Result<T, F> as FromResidual<std::result::Result<Infallible, E>>>
    = note: required for `anyhow::Error` to implement `From<Abort>`
    = note: required for `std::result::Result<u32, anyhow::Error>` to implement `FromResidual<std::result::Result<Infallible, Abort>>`
    = note: this error originates in the macro `impl_bind_syscalls` (in Nightly builds, run with -Z macro-backtrace for more info)


@magik6k
Copy link
Contributor Author

magik6k commented Jan 11, 2023

Update - The last commit reduces the number of errors which appear in the build output, but I can't tell if this is because things got more or less broken

Most of this mess is caused by changes in this wasmtime commit - bytecodealliance/wasmtime@2afaac5

Essentially wasmtime Traps are now a simple enum, but it's now possible to return any error types we want from syscalls and, IIUC, we just get them back from instance.get_typed_func([..])[..].call().

So in theory:

  • We want to make Abort be a normal std (or anyhow?) error
  • We don't need the hacks for wrapping Abort into/from Traps, we just need to translate 'normal' Traps into Aborts
  • Then we need to do a small bit of translating the (anyhow?) Error we get from wasm call into Abort we expect in the part of CallManager which makes that call

fvm/src/syscalls/error.rs Outdated Show resolved Hide resolved
@Stebalien
Copy link
Member

Ok, I think I fixed it. We'll see if the tests pass.

@codecov-commenter
Copy link

Codecov Report

Merging #1412 (20e9d57) into master (fcec8c6) will increase coverage by 17.44%.
The diff coverage is n/a.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #1412       +/-   ##
===========================================
+ Coverage   53.33%   70.77%   +17.44%     
===========================================
  Files         141       97       -44     
  Lines       13122     7561     -5561     
===========================================
- Hits         6998     5351     -1647     
+ Misses       6124     2210     -3914     
Impacted Files Coverage Δ
shared/src/state/mod.rs 0.00% <0.00%> (-100.00%) ⬇️
shared/src/chainid/mod.rs 0.00% <0.00%> (-57.15%) ⬇️
ipld/blockstore/src/memory.rs 57.14% <0.00%> (-42.86%) ⬇️
ipld/hamt/src/hamt.rs 62.40% <0.00%> (-18.80%) ⬇️
ipld/blockstore/src/lib.rs 33.33% <0.00%> (-8.89%) ⬇️
shared/src/econ/mod.rs 74.89% <0.00%> (-3.77%) ⬇️
shared/src/version/mod.rs 0.00% <0.00%> (-3.58%) ⬇️
shared/src/sector/registered_proof.rs 0.00% <0.00%> (-0.81%) ⬇️
shared/src/crypto/signature.rs 0.00% <0.00%> (-0.60%) ⬇️
sdk/src/vm.rs 0.00% <0.00%> (ø)
... and 46 more

@Stebalien
Copy link
Member

Ok, so, one error down but... I have no idea how to detect "out of memory" on initialization. We used to be able to do this, but it looks like the error types just disappeared. Options are:

  1. Have some form of "ran out of memory" flag that we record inside the limiter and/or kernel?
  2. Somehow determine if we're likely to run out of memory.
  3. Treat all instantiation failures the same (e.g., as ILLEGAL_INSTRUCTION).

Honestly, the last one is probably the simplest. It's not great, but...

We should probably write a test to see what error wasmtime gives us because there may be some error type that I'm missing.

},
Err(e) => Abort::Fatal(e),
.map_err(|e| {
// todo try to distinguish resource limit / trap / fatal
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Todo

Comment on lines +180 to +182
#[allow(deprecated)] // TODO https://github.com/bytecodealliance/wasmtime/issues/5037
// has to be enabled, otherwise returning errors from syscalls will not work
c.wasm_backtrace(true);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With c.wasm_backtrace(false) tests fail with

---- events_test stdout ----
thread 'events_test' panicked at 'assertion failed: needs_backtrace == backtrace.is_some()', /home/magik6k/.cargo/registry/src/github.com-1ecc6299db9ec823/wasmtime-4.0.0/src/trap.rs:101:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'events_test' panicked at 'assertion failed: needs_backtrace == backtrace.is_some()', /home/magik6k/.cargo/registry/src/github.com-1ecc6299db9ec823/wasmtime-4.0.0/src/trap.rs:101:13
thread 'events_test' panicked at 'assertion failed: needs_backtrace == backtrace.is_some()', /home/magik6k/.cargo/registry/src/github.com-1ecc6299db9ec823/wasmtime-4.0.0/src/trap.rs:101:13
thread 'events_test' panicked at 'assertion failed: needs_backtrace == backtrace.is_some()', /home/magik6k/.cargo/registry/src/github.com-1ecc6299db9ec823/wasmtime-4.0.0/src/trap.rs:101:13
thread 'events_test' panicked at 'assertion failed: needs_backtrace == backtrace.is_some()', /home/magik6k/.cargo/registry/src/github.com-1ecc6299db9ec823/wasmtime-4.0.0/src/trap.rs:101:13
thread 'events_test' panicked at 'assertion failed: needs_backtrace == backtrace.is_some()', /home/magik6k/.cargo/registry/src/github.com-1ecc6299db9ec823/wasmtime-4.0.0/src/trap.rs:101:13
thread 'events_test' panicked at 'assertion failed: `(left == right)`
  left: `ExitCode { value: 0 }`,
 right: `ExitCode { value: 10 }`', testing/integration/tests/events_test.rs:105:5

Which fails an assert in this code:

    store: &StoreOpaque,
    runtime_trap: Box<wasmtime_runtime::Trap>,
) -> Error {
    let wasmtime_runtime::Trap { reason, backtrace } = *runtime_trap;
    let (error, pc) = match reason {
        // For user-defined errors they're already an `anyhow::Error` so no
        // conversion is really necessary here, but a `backtrace` may have
        // been captured so it's attempted to get inserted here.
        //
        // If the error is actually a `Trap` then the backtrace is inserted
        // directly into the `Trap` since there's storage there for it.
        // Otherwise though this represents a host-defined error which isn't
        // using a `Trap` but instead some other condition that was fatal to
        // wasm itself. In that situation the backtrace is inserted as
        // contextual information on error using `error.context(...)` to
        // provide useful information to debug with for the embedder/caller,
        // otherwise the information about what the wasm was doing when the
        // error was generated would be lost.
        wasmtime_runtime::TrapReason::User {
            error,
            needs_backtrace,
        } => {
            debug_assert!(needs_backtrace == backtrace.is_some());  ******************This assert fails****************
            (error, None)
        }
        wasmtime_runtime::TrapReason::Jit(pc) => {
            let code = store
                .modules()
                .lookup_trap_code(pc)
                .unwrap_or(Trap::StackOverflow);
            (code.into(), Some(pc))
        }
        wasmtime_runtime::TrapReason::Wasm(trap_code) => (trap_code.into(), None),
    };
    match backtrace {
        Some(bt) => {
            let bt = WasmBacktrace::from_captured(store, bt, pc);
            if bt.wasm_trace.is_empty() {
                error
            } else {
                error.context(bt)
            }
        }
        None => error,
    }
}

Most likely due to a trap being generated that needs a backtrace, but none being captured due to backtraces being disabled; And because wasmtime wants to deprecate the option to disable backtraces it's probably unlikely that they'll fix this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately enabling backtraces makes native_stack_overflow take 13 minutes on my machine.

The native_stack_overflow test is invoking an actor which:

  • Makes 396 recursive calls inside wasm to fill the entire wasm stack limit
  • Then calls itself recursively until we hit recursive call limit

This test was written to ensure that in the most extreme abuse wasm actors can't exhaust native stacks, as that fault can't be caught / handled safely in any way. So why is this so slow? Didn't wasmtime fix backtrace re-capture in 3.0.0 (bytecodealliance/wasmtime#5037)? They did.. but when we return from a failed actor send syscall, we're returning a new user error, for which wasmtime will re-capture backtrace again..Ideally we'd have a way to tell wasmtime that we don't want to capture a backtrace when returning from syscalls, but I don't see a way today

@magik6k
Copy link
Contributor Author

magik6k commented Jan 16, 2023

Wasmtime issue about backtrace error bug: bytecodealliance/wasmtime#5577

@maciejwitowski maciejwitowski mentioned this pull request Jan 23, 2023
@Stebalien Stebalien linked an issue Feb 15, 2023 that may be closed by this pull request
@Stebalien
Copy link
Member

Replaced by #1672

@Stebalien Stebalien closed this Feb 16, 2023
@Stebalien Stebalien deleted the deps/wasmtime-4.0.0 branch February 16, 2023 02:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upgrade Wasmtime
3 participants