[wasi] wasm trap: out of bounds memory access in sgen_gray_object_queue_trim_free_list #88501

Closed
ericstj opened this issue Jul 7, 2023 · 10 comments · Fixed by #91761
Labels: arch-wasm, area-GC-mono, blocking-clean-ci, Known Build Error, os-wasi

Comments

@ericstj
Member

ericstj commented Jul 7, 2023

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=384730
Build error leg or test failing: System.Buffers.Tests Work Item
Pull request: #91059

Error Blob

{
  "ErrorMessage": "sgen_gray_object_queue_trim_free_list",
  "BuildRetry": false,
  "ErrorPattern": "",
  "ExcludeConsoleLog": false
}

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=384730
Error message validated: sgen_gray_object_queue_trim_free_list
Result validation: ❌ Known issue did not match with the provided build.
Validation performed at: 8/29/2023 5:41:34 PM UTC

Report

Summary

24-Hour Hit Count | 7-Day Hit Count | 1-Month Count
0 | 0 | 0
  • Output:
info: Discovering: managed/System.Globalization.Tests.dll (method display = ClassAndMethod, method display options = None)
info: Discovered:  managed/System.Globalization.Tests.dll (found 470 of 476 test cases)
info: Using random seed for test cases: 1518398705
info: Using random seed for collections: 1518398705
info: Starting:    managed/System.Globalization.Tests.dll
info: Error: failed to run main module `dotnet.wasm`
info: 
info: Caused by:
info:     0: failed to invoke command default
info:     1: error while executing at wasm backtrace:
info:            0: 0x52889 - <unknown>!sgen_gray_object_queue_trim_free_list
info:            1: 0x5292e - <unknown>!sgen_gray_object_queue_dispose
info:            2: 0x4de68 - <unknown>!collect_nursery
info:            3: 0x4d70c - <unknown>!sgen_perform_collection
info:            4: 0x4d53f - <unknown>!sgen_ensure_free_space
info:            5: 0x4c2d6 - <unknown>!sgen_alloc_obj_nolock
info:            6: 0x110ffe - <unknown>!mono_gc_alloc_vector
info:            7: 0xeac8e - <unknown>!mono_array_new_specific_internal
info:            8: 0xeacd3 - <unknown>!mono_array_new_specific_checked
info:            9: 0x10fa2 - <unknown>!mono_interp_exec_method
info:           10: 0x8106 - <unknown>!interp_runtime_invoke
info:           11: 0x12027a - <unknown>!mono_jit_runtime_invoke
info:           12: 0xe388b - <unknown>!do_runtime_invoke
info:           13: 0xe4225 - <unknown>!mono_runtime_try_invoke
info:           14: 0xe8a6c - <unknown>!do_try_exec_main
info:           15: 0xe850b - <unknown>!mono_runtime_run_main
info:           16: 0x6972 - <unknown>!main
info:           17: 0x275713 - <unknown>!__main_void
info:           18: 0x59d7 - <unknown>!_start
info:           19: 0x2888e4 - <unknown>!_start.command_export
info:        note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable may show more debugging information
info:     2: wasm trap: out of bounds memory access
info: Process wasmtime exited with 134
info: Waiting to flush log messages with a timeout of 120 secs ..
fail: Application has finished with exit code 134 but 0 was expected
XHarness exit code: 71 (GENERAL_FAILURE)
/root/helix/work/workitem/e /root/helix/work/workitem/e
----- end Wed 16 Aug 2023 03:40:59 PM UTC ----- exit code 71 ----------------------------------------------------------
@ericstj ericstj added blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' Known Build Error Use this to report build issues in the .NET Helix tab labels Jul 7, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jul 7, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 7, 2023
@ericstj ericstj added arch-wasm WebAssembly architecture area-GC-mono os-wasi Related to WASI variant of arch-wasm labels Jul 7, 2023
@ghost

ghost commented Jul 7, 2023

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=331160
Build error leg or test failing: System.Buffers.Tests Work Item
Pull request: #87857

Error Blob

{
  "ErrorMessage": "sgen_gray_object_queue_trim_free_list",
  "BuildRetry": false,
  "ErrorPattern": "",
  "ExcludeConsoleLog": false
}
Author: ericstj
Assignees: -
Labels:

arch-wasm, blocking-clean-ci, untriaged, area-GC-mono, Known Build Error, os-wasi, needs-area-label

Milestone: -

@ericstj ericstj removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jul 7, 2023
@ericstj ericstj changed the title wasm trap: out of bounds memory access in sgen_gray_object_queue_trim_free_list [wasi] wasm trap: out of bounds memory access in sgen_gray_object_queue_trim_free_list Jul 7, 2023
@radical radical added this to the Future milestone Jul 7, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Jul 7, 2023
@vargaz
Contributor

vargaz commented Jul 10, 2023

Can't reproduce this using:
while true; do XUNIT_RANDOM_ORDER_SEED=785219469 ./dotnet.sh build /t:Test /p:TargetOS=wasi /p:TargetArchitecture=wasm /p:Configuration=Release /p:WasmNativeStrip=false src/libraries/System.Buffers/tests/ || break; done

@carlossanlop
Member

Not sure why Known Build Error is not matching runs hitting this (maybe because it's showing up as a warning and not as an error) but I found another hit today:

info: Discovering: managed/System.Globalization.Tests.dll (method display = ClassAndMethod, method display options = None)
info: Discovered:  managed/System.Globalization.Tests.dll (found 470 of 476 test cases)
info: Using random seed for test cases: 1518398705
info: Using random seed for collections: 1518398705
info: Starting:    managed/System.Globalization.Tests.dll
info: Error: failed to run main module `dotnet.wasm`
info: 
info: Caused by:
info:     0: failed to invoke command default
info:     1: error while executing at wasm backtrace:
info:            0: 0x52889 - <unknown>!sgen_gray_object_queue_trim_free_list
info:            1: 0x5292e - <unknown>!sgen_gray_object_queue_dispose
info:            2: 0x4de68 - <unknown>!collect_nursery
info:            3: 0x4d70c - <unknown>!sgen_perform_collection
info:            4: 0x4d53f - <unknown>!sgen_ensure_free_space
info:            5: 0x4c2d6 - <unknown>!sgen_alloc_obj_nolock
info:            6: 0x110ffe - <unknown>!mono_gc_alloc_vector
info:            7: 0xeac8e - <unknown>!mono_array_new_specific_internal
info:            8: 0xeacd3 - <unknown>!mono_array_new_specific_checked
info:            9: 0x10fa2 - <unknown>!mono_interp_exec_method
info:           10: 0x8106 - <unknown>!interp_runtime_invoke
info:           11: 0x12027a - <unknown>!mono_jit_runtime_invoke
info:           12: 0xe388b - <unknown>!do_runtime_invoke
info:           13: 0xe4225 - <unknown>!mono_runtime_try_invoke
info:           14: 0xe8a6c - <unknown>!do_try_exec_main
info:           15: 0xe850b - <unknown>!mono_runtime_run_main
info:           16: 0x6972 - <unknown>!main
info:           17: 0x275713 - <unknown>!__main_void
info:           18: 0x59d7 - <unknown>!_start
info:           19: 0x2888e4 - <unknown>!_start.command_export
info:        note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable may show more debugging information
info:     2: wasm trap: out of bounds memory access
info: Process wasmtime exited with 134
info: Waiting to flush log messages with a timeout of 120 secs ..
fail: Application has finished with exit code 134 but 0 was expected
XHarness exit code: 71 (GENERAL_FAILURE)
/root/helix/work/workitem/e /root/helix/work/workitem/e
----- end Wed 16 Aug 2023 03:40:59 PM UTC ----- exit code 71 ----------------------------------------------------------

@carlossanlop
Member

carlossanlop commented Aug 18, 2023

Hit this one again in 8.0 (warning, not error).

Output:

info: Discovering: managed/System.Collections.Tests.dll (method display = ClassAndMethod, method display options = None)
info: Discovered:  managed/System.Collections.Tests.dll (found 5666 of 7580 test cases)
info: Using random seed for test cases: 1927835451
info: Using random seed for collections: 1927835451
info: Starting:    managed/System.Collections.Tests.dll
info: Error: failed to run main module `dotnet.wasm`
info: 
info: Caused by:
info:     0: failed to invoke command default
info:     1: error while executing at wasm backtrace:
info:            0: 0x4ff1e - <unknown>!sgen_gray_object_queue_trim_free_list
info:            1: 0x4ffc3 - <unknown>!sgen_gray_object_queue_dispose
info:            2: 0x4b4fd - <unknown>!collect_nursery
info:            3: 0x4ada1 - <unknown>!sgen_perform_collection
info:            4: 0x4abd4 - <unknown>!sgen_ensure_free_space
info:            5: 0x49a0a - <unknown>!sgen_alloc_obj_nolock
info:            6: 0x4a0ea - <unknown>!sgen_alloc_obj
info:            7: 0x10e041 - <unknown>!mono_gc_alloc_obj
info:            8: 0xe6f0 - <unknown>!mono_interp_exec_method
info:            9: 0x8040 - <unknown>!interp_runtime_invoke
info:           10: 0x11d90f - <unknown>!mono_jit_runtime_invoke
info:           11: 0xe0f20 - <unknown>!do_runtime_invoke
info:           12: 0xe18ba - <unknown>!mono_runtime_try_invoke
info:           13: 0xe6101 - <unknown>!do_try_exec_main
info:           14: 0xe5ba0 - <unknown>!mono_runtime_run_main
info:           15: 0x68ac - <unknown>!main
info:           16: 0x272da8 - <unknown>!__main_void
info:           17: 0x5911 - <unknown>!_start
info:           18: 0x285f79 - <unknown>!_start.command_export
info:        note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable may show more debugging information
info:     2: wasm trap: out of bounds memory access
info: Process wasmtime exited with 134

@carlossanlop carlossanlop added Known Build Error Use this to report build issues in the .NET Helix tab and removed Known Build Error Use this to report build issues in the .NET Helix tab labels Aug 28, 2023
@carlossanlop
Member

Unsure why KnownBuildError is unable to keep linking existing hits; the report shows 0 | 0 | 0.

It was hit in an 8.0 PR: #91231

@carlossanlop carlossanlop added Known Build Error Use this to report build issues in the .NET Helix tab and removed Known Build Error Use this to report build issues in the .NET Helix tab labels Aug 29, 2023
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Sep 7, 2023
vargaz added a commit to vargaz/runtime that referenced this issue Sep 8, 2023
…memory.

Some code in sgen like sgen_los_free_object () expects the return address to be aligned.

Hopefully fixes dotnet#88501 and others.
vargaz added a commit that referenced this issue Sep 11, 2023
…memory. (#91761)

Some code in sgen like sgen_los_free_object () expects the return address to be aligned.

Hopefully fixes #88501 and others.
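
The commit message above says that some sgen code expects allocated addresses to be aligned. As a rough illustration only, and not the actual change in #91761 (whose diff is not shown in this thread), the sketch below shows one common way to hand out aligned memory on a platform where the underlying allocator gives no such guarantee: over-allocate, round the address up, and remember the raw pointer for freeing. The names `align_up`, `alloc_aligned`, and `free_aligned` are hypothetical and are not part of the sgen API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not an sgen function): round addr up to the next
   multiple of align. align must be a power of two. */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
    return (addr + (align - 1)) & ~((uintptr_t)align - 1);
}

/* Hypothetical aligned allocator (not an sgen function).
   Over-allocates so that an aligned address always fits inside the block,
   and stores the raw malloc pointer in the slot just before the aligned
   address so that free_aligned can recover it. */
static void *alloc_aligned(size_t size, size_t align)
{
    /* The saved pointer slot must itself be properly aligned, so require
       a power-of-two alignment of at least pointer size. */
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

    void *raw = malloc(size + (align - 1) + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Leave room for the saved pointer, then round up. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    void **aligned = (void **)align_up(start, align);
    aligned[-1] = raw;
    return aligned;
}

/* Frees a block obtained from alloc_aligned. NULL is accepted. */
static void free_aligned(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

Code that masks the low bits of a block address to find its header would read the wrong memory if the allocator returned an unaligned address, which is one plausible way for a failure like the one in this issue to surface far from its cause.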
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Sep 11, 2023
github-actions bot pushed a commit that referenced this issue Sep 14, 2023
…memory.

Some code in sgen like sgen_los_free_object () expects the return address to be aligned.

Hopefully fixes #88501 and others.
@carlossanlop
Member

@vargaz @lewing @lambdageek @radical do we need the fix for this backported to 8.0? I'm still seeing the sgen_gray_object_queue_trim_free_list failure there. Example:

PR: #92374
Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=413445&view=logs&j=f02b8cf9-dd4d-54fc-c292-2bb1d305b019&t=fd9015e5-cd94-57d7-8286-4fe749355863&l=56

@carlossanlop carlossanlop reopened this Sep 21, 2023
@carlossanlop carlossanlop modified the milestones: Future, 8.0.0 Sep 21, 2023
@carlossanlop carlossanlop added the untriaged New issue has not been triaged by the area owner label Sep 21, 2023
@lambdageek
Member

@carlossanlop looks like the backport went in last week #91761

@carlossanlop
Member

The PR that is hitting this failure was opened yesterday: #92374

This is what I'm seeing there:

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=413445&view=logs&j=f02b8cf9-dd4d-54fc-c292-2bb1d305b019&t=fd9015e5-cd94-57d7-8286-4fe749355863&l=56
Log: https://helixre107v0xd1eu3ibi6ka.blob.core.windows.net/dotnet-runtime-refs-heads-release-80-rc1-6a14376058b74ebebf/System.Collections.Tests/1/console.90d8fea5.log?helixlogtype=result
Output:

===========================================================================================================
/root/helix/work/workitem/e /root/helix/work/workitem/e /root/helix/work/workitem/e
[8.0.0-prerelease.23407.2+480b9159eb7e69b182a87581d5a336e97e0b6dae] XHarness command issued: wasi test --app=. --output-directory=/root/helix/work/workitem/uploads/xharness-output --engine-arg=--dir=. --timeout=00:30:00 -- dotnet.wasm WasmTestRunner managed/System.Collections.Tests.dll -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing
info: Using wasm engine WasmTime from path /root/helix/work/correlation/wasmtime/wasmtime
info: wasmtime-cli 5.0.0
info: 
info: Running /root/helix/work/correlation/wasmtime/wasmtime --dir=. dotnet.wasm WasmTestRunner managed/System.Collections.Tests.dll -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing
info: Discovering: managed/System.Collections.Tests.dll (method display = ClassAndMethod, method display options = None)
info: Discovered:  managed/System.Collections.Tests.dll (found 5666 of 7580 test cases)
info: Using random seed for test cases: 1986925510
info: Using random seed for collections: 1986925510
info: Starting:    managed/System.Collections.Tests.dll
info: Error: failed to run main module `dotnet.wasm`
info: 
info: Caused by:
info:     0: failed to invoke command default
info:     1: error while executing at wasm backtrace:
info:            0: 0x4ff1e - <unknown>!sgen_gray_object_queue_trim_free_list
info:            1: 0x4ffc3 - <unknown>!sgen_gray_object_queue_dispose
info:            2: 0x4b4fd - <unknown>!collect_nursery
info:            3: 0x4ada1 - <unknown>!sgen_perform_collection
info:            4: 0x4abd4 - <unknown>!sgen_ensure_free_space
info:            5: 0x49a0a - <unknown>!sgen_alloc_obj_nolock
info:            6: 0x4a0ea - <unknown>!sgen_alloc_obj
info:            7: 0x10e041 - <unknown>!mono_gc_alloc_obj
info:            8: 0xe617 - <unknown>!mono_interp_exec_method
info:            9: 0x8040 - <unknown>!interp_runtime_invoke
info:           10: 0x11d90f - <unknown>!mono_jit_runtime_invoke
info:           11: 0xe0f20 - <unknown>!do_runtime_invoke
info:           12: 0xe18ba - <unknown>!mono_runtime_try_invoke
info:           13: 0xe6101 - <unknown>!do_try_exec_main
info:           14: 0xe5ba0 - <unknown>!mono_runtime_run_main
info:           15: 0x68ac - <unknown>!main
info:           16: 0x272da8 - <unknown>!__main_void
info:           17: 0x5911 - <unknown>!_start
info:           18: 0x285f79 - <unknown>!_start.command_export
info:        note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable may show more debugging information
info:     2: wasm trap: out of bounds memory access
info: Process wasmtime exited with 134
info: Waiting to flush log messages with a timeout of 120 secs ..
fail: Application has finished with exit code 134 but 0 was expected
XHarness exit code: 71 (GENERAL_FAILURE)
/root/helix/work/workitem/e /root/helix/work/workitem/e
----- end Wed 20 Sep 2023 09:20:08 PM UTC ----- exit code 71 ----------------------------------------------------------

@vargaz
Contributor

vargaz commented Sep 21, 2023

Probably another failure with the same symptoms. Will investigate.

@vargaz
Contributor

vargaz commented Sep 26, 2023

Couldn't reproduce it locally.

@jeffschwMSFT jeffschwMSFT removed the untriaged New issue has not been triaged by the area owner label Sep 28, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Nov 17, 2023
6 participants