Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI has been failing since April #315

Closed
jsturtevant opened this issue Jun 21, 2024 · 24 comments
Closed

CI has been failing since April #315

jsturtevant opened this issue Jun 21, 2024 · 24 comments

Comments

@jsturtevant
Copy link
Contributor

Opening this issue to start tracking down why it is failing.

The last good run was https://github.com/bytecodealliance/wasmtime-dotnet/actions/runs/8808954841 on commit b2b0480

Started failing the next day with https://github.com/bytecodealliance/wasmtime-dotnet/actions/runs/8825026578 on commit b2b0480

Some dependency that isn't pinned in CI might be a culprit.

I am able to repo locally:

Microsoft (R) Test Execution Command Line Tool Version 17.10.0 (x64)
Copyright (c) Microsoft Corporation.  All rights reserved.

Starting test execution, please wait...
A total of 1 test files matched the specified pattern.
The active test run was aborted. Reason: Test host process crashed : Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Repeat 2 times:
--------------------------------
   at Wasmtime.Value+Native.wasmtime_externref_new(IntPtr, IntPtr, Finalizer)
--------------------------------
   at Wasmtime.Value.FromValueBox(Wasmtime.Store, Wasmtime.ValueBox)
   at Wasmtime.ValueBox.ToValue(Wasmtime.Store, Wasmtime.ValueKind)
   at Wasmtime.Function.Invoke(System.ReadOnlySpan`1<Wasmtime.ValueBox>)
   at Wasmtime.Function.Invoke(Wasmtime.ValueBox[])
   at Wasmtime.Tests.ExternRefTests.<ItCollectsExternRefs>g__RunTest|18_0(Int32*)
   at Wasmtime.Tests.ExternRefTests.ItCollectsExternRefs()
   at System.RuntimeMethodHandle.InvokeMethod(System.Object, Void**, System.Signature, Boolean)
   at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(System.Object, System.Reflection.BindingFlags)
   at System.Reflection.MethodBase.Invoke(System.Object, System.Object[])
   at Xunit.Sdk.TestInvoker`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].CallTestMethod(System.Object)
   at Xunit.Sdk.TestInvoker`1+<>c__DisplayClass48_1+<<InvokeTestMethodAsync>b__1>d[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.TestInvoker`1+<>c__DisplayClass48_1+<<InvokeTestMethodAsync>b__1>d[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.execution.dotnet, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<<InvokeTestMethodAsync>b__1>d<System.__Canon> ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Xunit.Sdk.TestInvoker`1+<>c__DisplayClass48_1+<<InvokeTestMethodAsync>b__1>d[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.execution.dotnet, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<<InvokeTestMethodAsync>b__1>d<System.__Canon> ByRef)
   at Xunit.Sdk.TestInvoker`1+<>c__DisplayClass48_1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].<InvokeTestMethodAsync>b__1()
   at Xunit.Sdk.ExecutionTimer+<AggregateAsync>d__4.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.ExecutionTimer+<AggregateAsync>d__4, xunit.execution.dotnet, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<AggregateAsync>d__4 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Xunit.Sdk.ExecutionTimer+<AggregateAsync>d__4, xunit.execution.dotnet, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<AggregateAsync>d__4 ByRef)
   at Xunit.Sdk.ExecutionTimer.AggregateAsync(System.Func`1<System.Threading.Tasks.Task>)
   at Xunit.Sdk.TestInvoker`1+<>c__DisplayClass48_1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].<InvokeTestMethodAsync>b__0()
   at Xunit.Sdk.ExceptionAggregator+<RunAsync>d__9.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.ExceptionAggregator+<RunAsync>d__9, xunit.core, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<RunAsync>d__9 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[Xunit.Sdk.ExceptionAggregator+<RunAsync>d__9, xunit.core, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<RunAsync>d__9 ByRef)
   at Xunit.Sdk.ExceptionAggregator.RunAsync(System.Func`1<System.Threading.Tasks.Task>)
   at Xunit.Sdk.TestInvoker`1+<InvokeTestMethodAsync>d__48[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.TestInvoker`1+<InvokeTestMethodAsync>d__48[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.execution.dotnet, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<InvokeTestMethodAsync>d__48<System.__Canon> ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.Decimal, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Xunit.Sdk.TestInvoker`1+<InvokeTestMethodAsync>d__48[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.execution.dotnet, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<InvokeTestMethodAsync>d__48<System.__Canon> ByRef)
   at Xunit.Sdk.TestInvoker`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].InvokeTestMethodAsync(System.Object)
   at Xunit.Sdk.XunitTestInvoker.InvokeTestMethodAsync(System.Object)
   at Xunit.Sdk.TestInvoker`1+<<RunAsync>b__47_0>d[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.TestInvoker`1+<<RunAsync>b__47_0>d[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.execution.dotnet, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<<RunAsync>b__47_0>d<System.__Canon> ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.Decimal, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Xunit.Sdk.TestInvoker`1+<<RunAsync>b__47_0>d[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.execution.dotnet, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<<RunAsync>b__47_0>d<System.__Canon> ByRef)
   at Xunit.Sdk.TestInvoker`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].<RunAsync>b__47_0()
   at Xunit.Sdk.ExceptionAggregator+<RunAsync>d__10`1[[System.Decimal, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.ExceptionAggregator+<RunAsync>d__10`1[[System.Decimal, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.core, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<RunAsync>d__10`1<System.Decimal> ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.Decimal, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Xunit.Sdk.ExceptionAggregator+<RunAsync>d__10`1[[System.Decimal, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.core, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<RunAsync>d__10`1<System.Decimal> ByRef)
   at Xunit.Sdk.ExceptionAggregator.RunAsync[[System.Decimal, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Func`1<System.Threading.Tasks.Task`1<System.Decimal>>)
   at Xunit.Sdk.TestInvoker`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].RunAsync()
   at Xunit.Sdk.XunitTestRunner.InvokeTestMethodAsync(Xunit.Sdk.ExceptionAggregator)
   at Xunit.Sdk.XunitTestRunner+<InvokeTestAsync>d__4.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.XunitTestRunner+<InvokeTestAsync>d__4, xunit.execution.dotnet, Version=2.4.1.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<InvokeTestAsync>d__4 ByRef)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].Start[[Xunit.Sdk.XunitTestRunner+<InvokeTestAsy

Test Run Aborted.
@jsturtevant
Copy link
Contributor Author

jsturtevant commented Jun 21, 2024

I can reproduce the failing test consistently:

   at Wasmtime.Tests.ExternRefTests.<ItCollectsExternRefs>g__RunTest|18_0(Int32*)
   at Wasmtime.Tests.ExternRefTests.ItCollectsExternRefs()

Also the one from CI is failing consistently locally too:

[xUnit.net 00:00:00.75]     Wasmtime.Tests.TableImportBindingTests.ItGrowsATable [FAIL]
  Error Message:
   System.EntryPointNotFoundException : Unable to find an entry point named 'wasmtime_val_delete' in DLL 'wasmtime'.
  Stack Trace:
     at Wasmtime.Value.Native.wasmtime_val_delete(Value& val)
   at Wasmtime.Value.Dispose() in D:\a\wasmtime-dotnet\wasmtime-dotnet\src\Value.cs:line 174
   at Wasmtime.Table..ctor(Store store, TableKind kind, Object initialValue, UInt[32](https://github.com/bytecodealliance/wasmtime-dotnet/actions/runs/8825026578/job/24228588501#step:8:33) initial, UInt32 maximum) in D:\a\wasmtime-dotnet\wasmtime-dotnet\src\Table.cs:line 84
   at Wasmtime.Tests.TableImportBindingTests.ItFailsToInstantiateWithTableLimitsMismatch() in D:\a\wasmtime-dotnet\wasmtime-dotnet\tests\TableImportBindingTests.cs:line 58

@jsturtevant
Copy link
Contributor Author

I seems the results look random in CI because some tests hang, and when others fail it crashes rest of the test suite. Here is a list of failing tests:

list of failing tests in vsCode

@jsturtevant
Copy link
Contributor Author

Both the passing and failing builds from above are running the same .net version:

dotnet-install: Extracting zip from https://dotnetcli.azureedge.net/dotnet/Sdk/8.0.204/dotnet-sdk-8.0.204-linux-x64.tar.gz

@jsturtevant
Copy link
Contributor Author

The project downloads the wasmtime c api:

<DownloadFile
Condition="('$(Packing)'=='true' Or '%(Wasmtime.Copy)'=='true') And !Exists('$(BaseIntermediateOutputPath)/$(ReleaseFileNameBase)-%(Wasmtime.Arch)-%(Wasmtime.OS)-c-api.%(Wasmtime.FileExtension)')"
SourceUrl="$(ReleaseURLBase)/$(ReleaseFileNameBase)-%(Wasmtime.Arch)-%(Wasmtime.OS)-c-api.%(Wasmtime.FileExtension)"
DestinationFolder="$(BaseIntermediateOutputPath)"
SkipUnchangedFiles="true"
/>

This changes regularly, and in ci the size of the download changed between the two runs. My current guess is that the Dev release is broken for this.

@jsturtevant
Copy link
Contributor Author

Looks like Dev builds are being enabled even for pr runs:

- name: Enable development builds for the main branch
if: github.ref == 'refs/heads/main' || github.base_ref == 'main'
shell: bash
run: |
echo "DevBuild=true" >> $GITHUB_ENV

Here is an example of a PR that should be using the matching c-library but is using DEV:

https://github.com/bytecodealliance/wasmtime-dotnet/pull/313/checks

Run dotnet build Wasmtime.sln -c debug --no-restore
  Tool 'dotnet-t4' (version '2.3.1') was restored. Available commands: t4
  
  Restore was successful.
  Downloading from "https://github.com/bytecodealliance/wasmtime/releases/download/dev/wasmtime-dev-x[8](https://github.com/bytecodealliance/wasmtime-dotnet/pull/313/checks#step:7:9)6_64-linux-c-api.tar.xz" to "/home/runner/work/wasmtime-dotnet/wasmtime-dotnet/src/obj/wasmtime-dev-x86_64-linux-c-api.tar.xz" (15,430,[9](https://github.com/bytecodealliance/wasmtime-dotnet/pull/313/checks#step:7:10)20 bytes).

@jsturtevant
Copy link
Contributor Author

Switching the main build to pull the matching version locally (19.0.1) made many more of the tests pass, so might be on to something.

@kpreisser
Copy link
Contributor

kpreisser commented Jun 21, 2024

Hi,
AFAIK using dev builds of the Wasmtime C API when building from main or for PRs that target main is intentional, to detect any incompatibilities with upcoming Wasmtime C API releases.

There have been some changes in the C API recently, but wasmtime-dotnet hasn't yet been updated to it, which is why the build currently fails e.g. due to missing entry points (like wasmtime_val_delete) or due to changed function signatures (like wasmtime_externref_new),

For example, the following commits changed C API functions that wasmtime-dotnet currently uses, which have not yet been incorporated into wasmtime-dotnet:

However, it seems the project is currently missing a maintainer, see: #313 (comment)

@jsturtevant
Copy link
Contributor Author

AFAIK using dev builds of the Wasmtime C API when building from main or for PRs that target main is intentional, to detect any incompatibilities with upcoming Wasmtime C API releases.

This makes sense to have an early signal. Currently it looks like it is running on the PR's too which is blocking any updates for simple things like https://github.com/bytecodealliance/wasmtime-dotnet/pull/313/files.

For example, the following commits changed C API functions that wasmtime-dotnet currently uses, which have not yet been incorporated into wasmtime-dotnet:

bytecodealliance/wasmtime@e55fa3c
bytecodealliance/wasmtime@77405cc

Thanks for the pointers! Do you know enough to be able to update the project for these fixes?

However, it seems the project is currently missing a maintainer, see: #313 (comment)

I can help on this front but new to the project. If you know enough to update to the API changes, I could help review and get it in. Otherwise I'll start digging in.

@martindevans
Copy link
Contributor

@jsturtevant I've fixed the first of those two things you linked in #316.

@jsturtevant
Copy link
Contributor Author

I dug into this a bit more, it was bothering me that I couldn't get the main build properly after making it only pull the dev nightly builds of wasmtime on main and not on the prs (#317). This should have fixed the PR's since we we are on the 19.0.1 and now pulling the 19.01 version of the c-api but it didn't.

So what looks to have happen is:

  1. 19.0.1 is passing in Bump Wasmtime version to 19.0.1. #303. This passed because it wasn't against the main branch.
  Downloading from "https://github.com/bytecodealliance/wasmtime/releases/download/v19.0.1/wasmtime-v19.0.1-x86_64-windows-c-api.zip" to "D:\a\wasmtime-dotnet\wasmtime-dotnet\src\obj\wasmtime-v19.0.1-x86_64-windows-c-api.zip" (18,487,457 bytes).
  1. 19.0.1 is failing in the PR Merge release-19.0.x to main #304. The are two different jobs that pass in that pr (the ones using 19.0.1 and the ones that run against main on dev api.

  2. tests start passing again in Fix CI. #305 but this is using dev branch on april 16. Since the fixes where for the dev branch c-api (not 19.0.1 c-api) it means anything running against 19.0.1 will fail!

  3. Since nightly CI run on dev we started failing again on April 24 becuase the API changed again: https://github.com/bytecodealliance/wasmtime/commits/main/crates/c-api?since=2024-04-24&until=2024-04-24

So we are in a strange state, the version in the code is 19.0.1 but we are closer to using the c-API of version 21.0 and the nightly dev version is way ahead.

We have a few options:

  • go for latest version (22.0) and track down all the c-api changes between ~21 (which was around april 16th) to 22.0
  • Try for 1.21 and track down the changes between april 16 and when the 21.0 branch was cut. Then go from 1.21 to 1.22 after that.

@jsturtevant
Copy link
Contributor Author

There is a third option which is to bump to 1.20. The branch was cut on April 11th and got the backports from mid april, then was but right before april 24th!

bytecodealliance/wasmtime@main...release-20.0.0

@jsturtevant
Copy link
Contributor Author

I took approach 3 in #317 and PR CI is passing!

@kpreisser
Copy link
Contributor

kpreisser commented Jun 22, 2024

Hi @jsturtevant, thanks for your efforts in improving this!

So we are in a strange state, the version in the code is 19.0.1 but we are closer to using the c-API of version 21.0 and the nightly dev version is way ahead.

AFAIK, previously, the workflow in this repo regarding updates from upstream wasmtime was:

  • When there were (breaking) changes in the upstream wasmtime:main (reflected in the dev build), update wasmtime-dotnet:main for that changes (but don't touch the WasmtimeVersion property in Directory.Build.props). For example: Fix failing tests after the changes in bytecodealliance/wasmtime#7285 #280
  • When upstream wasmtime created a new release (e.g. 15.0.0) and wasmtime-dotnet should also make a release for that version:
    • Update the WasmtimeVersion property, either in main branch when the current wasmtime dev build still matches the state/API of the release version, for example: Bump Wasmtime version to 15.0.0. #286, or update it in a release branch (so that CI uses the specific wasmtime version), e.g.: Update Wasmtime to 18.0.0. #300
    • Then, create a tag for the release (when CI passes).

Therefore, it was expected that the wasmtime-dotnet:main branch contained changes for the Wasmtime C API that were in Wasmtime's dev build but not in the version specified by the WasmtimeVersion property, since wasmtime-dotnet:main would reflect the current state of wasmtime:main.

With #317, PRs against the main branch would run with the (probably older) Wasmtime version specified in WasmtimeVersion rather than the dev build, which could mean that even if CI for main passes, CI for PRs might fail since they use an older Wasmtime version.

For example, PR #245 updated the wasmtime-dotnet:main branch for changes done in upstream wasmtime:main. If the PR had been built against the current WasmtimeVersion rather than dev, it would have failed because the previously released wasmtime version didn't contain these changes. (That PR also uncovered a bug in the current wasmtime dev build, where CI was failing at first, but then succeeded after the upstream wasmtime dev build included the fix.)

Therefore, I'm currently not so sure if we really should change the CI behavior to no longer use the Wasmtime dev build for PRs that target main.

You are right that if upstream wasmtime contained breaking changes not yet done in wasmtime-dotnet, CI builds of PRs and the main branch would fail (e.g. like in #313). In such a case, I think the expectation is to first fix the CI failures in main, then retrigger CI for the PRs (by a repo maintainer, or by the PR author via rebasing or pushing an empty commit).

What do you think?
Thanks!

@jsturtevant
Copy link
Contributor Author

I was struggling with this too. I generally agree that is a good approach but comes with a few down sides, such as the one we are in now where we are far behind and it's pretty confusing what needs to be fixed.

I think we can take a hybrid approach atleast temporary to get unstuck in an approach that let's us merge changes with ci being green with out a huge pr

  • Check in Update to Wasmtime 20.0.2 to pass CI #317, so we have passing prs and Main Branch
  • bump cut branch then bump main to 21
  • fix issue for 21 and cut a branch
  • bump to 22 , fix api and cut branch
  • bump main back to dev and keep moving, cutting branches as we go.

I would propose we document the approach once we are back to "normal".

How does that sound for an approach?

kpreisser added a commit to kpreisser/wasmtime-dotnet that referenced this issue Jun 24, 2024
This includes updates for:
- bytecodealliance/wasmtime#8451
- bytecodealliance/wasmtime#8461
- bytecodealliance/wasmtime#8011

TODOs:
- Allocating an `externref` can now fail (by `wasmtime_externref_new` returning `false`). Currently, we throw a `WasmtimeException` in that case. We need to check where that exception can be thrown, and whether we need to do any additional clean-up (e.g. when converting arguments for a function call).
- Check whether it's ok to compare the `__private` field of externs (which has been remaned in the C API, previously it was `index`).
- `anyref` type is not yet supported, but I'm not sure what exactly it is and whether we need to add it.

Fixes bytecodealliance#315
@kpreisser
Copy link
Contributor

kpreisser commented Jun 24, 2024

Hi @jsturtevant,
thanks for your reply!

Yes, I think taking that hybrid approach might be a good idea.

Regarding the C API changes, I have opened #318 to incorporate the changes in the Wasmtime C API (except for WASI, which is done in #316), so that with both PRs merged, CI should succeed again with the current dev release of Wasmtime.

However, #318 is still a draft as some details are still missing (e.g. clean-up in case wasmtime_externref_new returns false).

Thanks!

@jsturtevant
Copy link
Contributor Author

Awesome! Thanks for your help on this.

Is there any value in having release 1.20 and 1.21, if not with your work on #318 we can skip ahead and cut a release for 1.22.

@kpreisser
Copy link
Contributor

Hi,

Is there any value in having release 1.20 and 1.21

I don't know Wasmtime well enough to determine this, but it would probably be easier to skip these versions and proceed e.g. with a v22.0.0 release.

@kpreisser
Copy link
Contributor

kpreisser commented Jun 29, 2024

Hi @jsturtevant,
thank you for for merging the PRs and creating wasmtime-dotnet releases!

However, regarding the releases of wasmtime-dotnet v20.0.2 and v21.0.1, I think unfortunately there is a mismatch between the native wasmtime API and the functions declared in wasmtime-dotnet, meaning the native functions may not be called correctly.

Consider e.g. the wasmtime_val_delete function, which looks like this in Wasmtime v19.0.2:

WASM_API_EXTERN void wasmtime_val_delete(wasmtime_val_t *val);

In Wasmtime v20.0.2, this has been changed to the following:

WASM_API_EXTERN void wasmtime_val_delete(wasmtime_context_t *context,
                                         wasmtime_val_t *val);

Then, in Wasmtime v21.0.1, it has been changed to the following:

WASM_API_EXTERN void wasmtime_val_unroot(wasmtime_context_t *context,
                                         wasmtime_val_t *val);

However, in wasmtime-dotnet tag v20.0.2, the imported function declaration is still the following:

[DllImport(Engine.LibraryName)]
public static extern void wasmtime_val_delete(in Value val);

Notice that it has only one parameter (in Value val, for the wasmtime_val_t *val parameter), but the C API function in Wasmtime v20.0.2 additionally has the store context, which means the function won't be called correctly. It seems this is not caught by the unit tests (as CI passes), but it may cause at least a memory leak because the Value is probably not unrooted correctly, and could also lead to other undefined behavior.

(This is why I mentioned that it would probably be easier to skip versions 20.x and 21.x and proceed with a v22.0 release 😉)

Similarly, for the released wasmtime-dotnet v21.0.1, it doesn't yet contain the changes from #316, so the imported function declaration looks like this:

[DllImport(Engine.LibraryName)]
public unsafe static extern void wasi_config_set_argv(Handle config, int argc, byte** argv);

However, Wasmtime v21.0.1 already contains the changes to that function (bytecodealliance/wasmtime#8578):

WASI_API_EXTERN bool wasi_config_set_argv(wasi_config_t *config, size_t argc,
                                          const char *argv[]);

What do you think about this? Maybe we should deprecate the releases, by unlisting/deprecate the NuGet packages for v20.0.2 and v21.0.1?

Once #316 is merged, I think we can then create a wasmtime-dotnet v22.0.0 release which should then be good to go.

Thanks!

@jsturtevant
Copy link
Contributor Author

Notice that it has only one parameter (in Value val, for the wasmtime_val_t *val parameter), but the C API function in Wasmtime v20.0.2 additionally has the store context, which means the function won't be called correctly. It seems this is not caught by the unit tests (as CI passes), but it may cause at least a memory leak because the Value is probably not unrooted correctly, and could also lead to other undefined behavior.

I guess I incorrectly assumed if CI, was passing we would be mostly in good shape. Thanks for point this out to me.

What do you think about this? Maybe we should deprecate the releases, by unlisting/deprecate the NuGet packages for v20.0.2 and v21.0.1?

Yes, lets get it into a good state and go from there

@jsturtevant
Copy link
Contributor Author

side note: Any thoughts on what can be done to improve CI so these types of things are caught?

@kpreisser
Copy link
Contributor

kpreisser commented Jul 2, 2024

side note: Any thoughts on what can be done to improve CI so these types of things are caught?

Hi @jsturtevant,

I'm not sure if there's an easy way to detect mismatching interop function signatures, as interop with native functions requires correctly declaring the function signature; otherwise, it depends on various factors (such as the CPU architecture, stack layout, parameter types etc.) what will happen.

To detect such issues before creating a new wasmtime-dotnet release, my suggestion would be to use Git or Github to display the differences between a new Wasmtime version and the previous one in the /crates/c-api/include folder, and then review the differences in the C header files.

Maybe there exists some build action that could detect whether two commits/refs of a specific repo (wasmtime), e.g. v19.0.2 and main, contain changes in a specific folder (/crates/c-api/include) and in that case fail the build, so that you have to manually verify the changes and then update the commit reference, but I don't know such a build action.

Thanks!

@jsturtevant
Copy link
Contributor Author

Thanks for the feedback and input. Will look into this a bit more since it would be nice to not have to manually do this to avoid mistakes.

@jsturtevant
Copy link
Contributor Author

I asked a few folks and they suggested looking at LibraryImport (https://learn.microsoft.com/en-us/dotnet/standard/native-interop/pinvoke-source-generation)

I tried it out and it worked for the scenario where function name changed. The tests failed and I was able to catch the function name change to wasmtime_val_unroot. It didn't work when function name stayed the same as in the first and third scenario with wasmtime_val_delete and wasi_config_set_argv. The other downside is it only works with .net7+ and we support netstandard2.1;netstandard2.0; as well but that could be worked around with preprocessor statements.

@jsturtevant
Copy link
Contributor Author

Doing a little more digging, and it seems we could use a tool like https://github.com/dotnet/clangsharp?tab=readme-ov-file#generating-bindings to point at header files and generate c#. This would require a bit of changes to the project but would help catch the changes to files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants