InlineArray[Valid]/[Invalid] Runtime tests are failing for Mono miniJIT and interpreter #90398

Closed
matouskozak opened this issue Aug 11, 2023 · 11 comments · Fixed by #90583
Labels: area-Infrastructure-mono, disabled-test

Comments

@matouskozak
Member

matouskozak commented Aug 11, 2023

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=369496

Affected pipelines:

  • runtime-extra-platforms

Build error legs:

  • Mono android x64 Release @ Ubuntu.1804.Amd64.Android.29.Open

Failing:

  • android-x64 Release AllSubsets_Mono_RuntimeTests minijit
  • android-x64 Release AllSubsets_Mono_RuntimeTests_Interp monointerpreter
[10:40:25] dbug: Executing command: '/datadisks/disk1/work/9B9908E4/p/microsoft.dotnet.xharness.cli/8.0.0-prerelease.23401.3/runtimes/any/native/adb/linux/adb -s emulator-5556 shell am instrument -e entrypoint:libname InlineArrayValid.dll -w net.dot.Loader_classloader/net.dot.MonoRunner'
        [10:40:29] info: Running instrumentation class net.dot.MonoRunner took 4.5776835 seconds
        [10:40:29] dbug: Exit code: 0
                         Std out:
                         INSTRUMENTATION_RESULT: return-code=101
                         INSTRUMENTATION_CODE: 101
                         
                         
                         
        [10:40:30] info: Instrumentation finished normally with exit code 101
        [10:40:33] dbug: Executing command: '/datadisks/disk1/work/9B9908E4/p/microsoft.dotnet.xharness.cli/8.0.0-prerelease.23401.3/runtimes/any/native/adb/linux/adb -s emulator-5556 logcat -d '
        [10:40:33] info: Wrote current ADB log to /datadisks/disk1/work/9B9908E4/w/AB97096C/uploads/Reports/Loader.classloader/InlineArray/InlineArrayValid/adb-logcat-net.dot.Loader_classloader-net.dot.MonoRunner.log
        [10:40:33] fail: Non-success instrumentation exit code: 101, expected: 100
        [10:40:33] dbug: Saving diagnostics data to '/datadisks/disk1/work/9B9908E4/w/AB97096C/e/diagnostics.json'
        XHarness exit code: 1 (TESTS_FAILED)
[10:39:01] dbug: Executing command: '/datadisks/disk1/work/9B9908E4/p/microsoft.dotnet.xharness.cli/8.0.0-prerelease.23401.3/runtimes/any/native/adb/linux/adb -s emulator-5556 shell am instrument -e entrypoint:libname InlineArrayInvalid.dll -w net.dot.Loader_classloader/net.dot.MonoRunner'
        [10:39:05] info: Running instrumentation class net.dot.MonoRunner took 3.6335327 seconds
        [10:39:05] dbug: Exit code: 0
                         Std out:
                         INSTRUMENTATION_RESULT: shortMsg=Process crashed.
                         INSTRUMENTATION_CODE: 0
                         
                         
                         
        [10:39:05] info: Short message:
                         Process crashed.
        [10:39:05] fail: No value for 'return-code' provided in instrumentation result. This may indicate a crashed test (see log)
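
The two failure modes in the logs above differ in one detail: InlineArrayValid reports `return-code=101` (a regular test failure, since 100 is expected), while InlineArrayInvalid reports no `return-code` at all (a probable crash). A minimal sketch of that classification, assuming only the strings visible in the logs; the names `run_status`, `find_return_code`, and `classify_run` are hypothetical and this is not XHarness code:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; not part of XHarness. */
enum run_status { RUN_PASSED, RUN_FAILED, RUN_CRASHED };

/* The logs above say "expected: 100". */
#define EXPECTED_EXIT_CODE 100

/* Scan instrumentation output for "return-code=<n>".
 * Returns 1 and stores <n> in *code when found, 0 otherwise. */
static int find_return_code(const char *output, int *code)
{
    static const char key[] = "return-code=";
    const char *p = strstr(output, key);
    char *end;
    long value;

    if (p == NULL)
        return 0;
    p += sizeof key - 1;
    value = strtol(p, &end, 10);
    if (end == p)
        return 0; /* key present but no digits follow */
    *code = (int)value;
    return 1;
}

/* No return-code at all is treated as a crash, matching the
 * "This may indicate a crashed test" message in the log. */
static enum run_status classify_run(const char *output)
{
    int code;

    if (!find_return_code(output, &code))
        return RUN_CRASHED;
    return code == EXPECTED_EXIT_CODE ? RUN_PASSED : RUN_FAILED;
}
```

Feeding the `Std out` text of the InlineArrayValid run to `classify_run` yields `RUN_FAILED`, and the InlineArrayInvalid text yields `RUN_CRASHED`.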

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorPattern": "Loader\\/classloader\\/InlineArray\\/[\\w\\/.\\s]+\\[FAIL\\]",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=369496
Error message validated: Loader\/classloader\/InlineArray\/[\w\/.\s]+\[FAIL\]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 8/14/2023 10:41:08 AM UTC
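
Note that the JSON form of the pattern doubles every backslash, so `\\/` in the blob is `\/` in the regex, which is just a literal `/`. A rough sketch of what the pattern matches, translated to POSIX extended syntax because `\w` and `\s` are not portable inside bracket expressions there. The translation (`\w` to `[:alnum:]_`, `\s` to `[:space:]`), the helper name `matches_known_issue`, and the sample log lines in the usage note are assumptions, not taken from the known-issues tooling:

```c
#include <regex.h>
#include <stddef.h>

/* POSIX ERE translation of the ErrorPattern above (assumed equivalent:
 * \w -> [:alnum:]_ and \s -> [:space:]; "\/" is a plain '/'). */
static const char known_issue_pattern[] =
    "Loader/classloader/InlineArray/[[:alnum:]_/.[:space:]]+\\[FAIL\\]";

/* Returns 1 if the line matches, 0 if it does not,
 * and -1 if the pattern fails to compile. */
static int matches_known_issue(const char *line)
{
    regex_t re;
    int rc;

    if (regcomp(&re, known_issue_pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

A hypothetical console line such as `Loader/classloader/InlineArray/InlineArrayValid/InlineArrayValid.sh [FAIL]` would match, while the same line ending in `[PASS]`, or a `[FAIL]` line under a different directory, would not.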

Report

| Build | Definition | Test | Pull Request |
|---|---|---|---|
| 374175 | dotnet/runtime | Loader.classloader.WorkItemExecution | #90583 |
| 373829 | dotnet/runtime | Loader/classloader/regressions/GitHub_82187/GitHub_82187/GitHub_82187.sh | #90412 |
| 372849 | dotnet/runtime | Loader.classloader.WorkItemExecution | #90519 |
| 372550 | dotnet/runtime | Loader.classloader.WorkItemExecution | |
| 372445 | dotnet/runtime | Loader.classloader.WorkItemExecution | #90270 |
| 372307 | dotnet/runtime | Loader.classloader.WorkItemExecution | |
| 372186 | dotnet/runtime | Loader.classloader.WorkItemExecution | |
| 371961 | dotnet/runtime | Loader.classloader.WorkItemExecution | #90023 |
| 369496 | dotnet/runtime | Loader.classloader.WorkItemExecution | |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
|---|---|---|
| 2 | 9 | 9 |

@matouskozak added the blocking-clean-ci and Known Build Error labels Aug 11, 2023
@ghost added the untriaged label Aug 11, 2023
@ghost

ghost commented Aug 11, 2023

Tagging subscribers to this area: @directhex
See info in area-owners.md if you want to be subscribed.

Issue Details

Error Blob

{
  "ErrorMessage": "Non-success instrumentation exit code: 101, expected: 100",
  "ErrorMessage": "No value for 'return-code' provided in instrumentation result. This may indicate a crashed test (see log)",
  "BuildRetry": false,
  "ErrorPattern": "",
  "ExcludeConsoleLog": true
}

Reproduction Steps

android-x64 Release AllSubsets_Mono_RuntimeTests_[minijit]/[Interp monointerpreter] CI pipelines are failing, e.g.: https://dev.azure.com/dnceng-public/public/_build/results?buildId=369496&view=logs&jobId=e847dbc9-1eac-5acf-c55e-fa567d2d051e&j=353cbaf6-57aa-5b05-dc8e-ce12fdf7e230&t=a84f0057-2a70-587b-3a48-3788b239fbbb

Author: matouskozak
Assignees: -
Labels:

blocking-clean-ci, area-Infrastructure-mono, Known Build Error

Milestone: -

@matouskozak
Member Author

The tests were enabled in #90192 and since then started failing on runtime-extra-platforms.

@SamMonoRT
Member

@AaronRobinsonMSFT - If these newly added tests can't be fixed for Mono (which at the moment we don't have cycles for RC1 cutoff) I suggest we disable the failing tests on mono lanes to get extra-platforms JIT/Interpreter lanes running again. The extra-platforms is not green and we are trying to make sure it doesn't regress further.

cc @lambdageek

@SamMonoRT added this to the 8.0.0 milestone Aug 11, 2023
@ghost removed the untriaged label Aug 11, 2023
@AaronRobinsonMSFT
Member

> @AaronRobinsonMSFT - If these newly added tests can't be fixed for Mono (which at the moment we don't have cycles for RC1 cutoff) I suggest we disable the failing tests on mono lanes to get extra-platforms JIT/Interpreter lanes running again.

Sure. I encourage the mono owners to handle that specifically. The disable mechanisms for some of these mono scenarios are unknown to a lot of us. I'm open to reviewing any PR, but simply disabling a test should be relatively simple to handle, no?

@SamMonoRT
Member

> > @AaronRobinsonMSFT - If these newly added tests can't be fixed for Mono (which at the moment we don't have cycles for RC1 cutoff) I suggest we disable the failing tests on mono lanes to get extra-platforms JIT/Interpreter lanes running again.
>
> Sure. I encourage the mono owners to handle that specifically. The disable mechanisms for some of these mono scenarios are unknown to a lot of us. I'm open to reviewing any PR, but simply disabling a test should be relatively simple to handle, no?

Sounds good - @matouskozak will create a PR to disable those and add you for a quick review.

@lambdageek
Member

The tests are working on desktop, right? It's just android-x64 that's busted?

@lambdageek
Member

lambdageek commented Aug 11, 2023

Looking at the CI run for #90192 - it doesn't look like the runtime tests ran on Mono at all??? @steveisok @directhex Are the CI triggers not running tests with Mono if only src/tests changes???

Never mind, the lanes are just named differently. For example, these ran runtime tests:

Build osx-x64 Release AllSubsets_Mono_Interpreter_RuntimeTests monointerpreter
Build osx-x64 Release AllSubsets_Mono_Minijit_RuntimeTests minijit

So it is just mobile that is acting up

@SamMonoRT
Member

SamMonoRT commented Aug 11, 2023

> Looking at the CI run for #90192 - it doesn't look like the runtime tests ran on Mono at all??? @steveisok @directhex Are the CI triggers not running tests with Mono if only src/tests changes???

It's the lanes in extra-platforms, which most devs outside mono don't run

[Sam edit] - Interesting, yeah, seems like no Mono runtime test lanes were triggered at all. But the failures have been in Android-only lanes -- [Sam Edit #2] - Some mono lanes have indeed run. The only failures are in extra-platforms.

@SamMonoRT
Member

> The tests are working on desktop, right? It's just android-x64 that's busted?

Yes, this is accurate

@kotlarmilos
Member

kotlarmilos commented Aug 15, 2023

The test InlineArrayValid is unable to resolve the SpanArr class from the assembly when executed on both minijit and the interpreter. On the other hand, the test InlineArrayInvalid passes on minijit but fails on the interpreter. The failure asserts in:

g_assert (klass != NULL);

I was unable to reproduce the issue locally using an arm64 emulator in any configuration. Here is an example from an interpreter run:

08-15 11:05:08.651  7146  7161 D DOTNET  : Interp Enabled
08-15 11:05:08.652  7146  7161 D DOTNET  : assembly_preload_hook: System.Private.CoreLib (null) /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.687  7146  7161 D DOTNET  : assembly_preload_hook: InlineArrayInvalid.dll (null) /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.687  7146  7161 D DOTNET  : Executable: InlineArrayInvalid.dll
08-15 11:05:08.687  7146  7161 D DOTNET  : assembly_preload_hook: System.Runtime  /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.687  7146  7161 D DOTNET  : assembly_preload_hook: System.Console  /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.687  7146  7161 D DOTNET  : assembly_preload_hook: System.Runtime.InteropServices  /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.687  7146  7161 D DOTNET  : assembly_preload_hook: xunit.core  /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.688  7146  7161 D DOTNET  : assembly_preload_hook: xunit.assert  /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.688  7146  7161 D DOTNET  : assembly_preload_hook: System.Threading  /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.694  7146  7161 D DOTNET  : assembly_preload_hook: System.Memory  /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.695  7146  7161 I DOTNET  : Explicit_Fails...
08-15 11:05:08.695  7146  7161 D DOTNET  : assembly_preload_hook: InvalidCSharp  /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.696  7146  7161 D DOTNET  : assembly_preload_hook: System.Threading.Tasks  /data/user/0/net.dot.Loader_classloader/files
08-15 11:05:08.696  7146  7161 I DOTNET  : ZeroLength_Fails...
08-15 11:05:08.696  7146  7161 I DOTNET  : TooLarge_Fails...
08-15 11:05:08.696  7146  7161 I DOTNET  : NegativeLength_Fails...
08-15 11:05:08.696  7146  7161 I DOTNET  : NoFields_Fails...
08-15 11:05:08.696  7146  7161 I DOTNET  : TwoFields_Fails...
08-15 11:05:08.696  7146  7161 D DOTNET  : Exit code: 100.
08-15 11:05:08.697  7146  7161 I DOTNET  : MonoRunner finished, return-code=100

The failure is Android-specific, since the tests pass on Linux and Apple mobile platforms. Initially I thought ilstrip removed more than it should have, but it is not enabled by default. I can try enabling the android-arm64 CI job to check whether it is arch-specific. @lambdageek Any other ideas I could try?

@lambdageek
Member

lambdageek commented Aug 15, 2023

@kotlarmilos mono_class_has_failure is pretty widely used, unfortunately. It will be hard to figure out why it's getting a null klass without a stack trace.

I guess if normal debugging doesn't work, I would remove mono_class_has_failure from class.c and change its declaration in class-internals.h like this:

static inline gboolean
mono_class_has_failure_inline (MonoClass *klass, const char *func, const char *file, int line)
{
	/* Report the caller's location when klass is NULL. */
	g_assertf (klass != NULL, "klass ptr is null in %s at %s:%d\n", func, file, line);
	return m_class_has_failure ((MonoClass*)klass) != 0;
}

/* No space between the macro name and its parameter list: with a space this becomes an object-like macro. */
#define mono_class_has_failure(klass) mono_class_has_failure_inline (klass, __func__, __FILE__, __LINE__)

and hopefully the immediate caller will make it clear what is happening. If that doesn't work, I would try harder to get debugging working ;-)
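
The technique suggested here (wrap a widely used function in a macro so that a failed check reports the caller's function, file, and line) can be tried outside the Mono tree. A self-contained sketch, assuming a stand-in type: `DemoClass` and the `demo_` names are hypothetical, and plain `fprintf` plus `abort` stands in for Mono's `g_assertf`:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for MonoClass; the real type is much larger. */
typedef struct {
	int has_failure;
} DemoClass;

/* The checking function receives the call site explicitly. */
static inline int
demo_class_has_failure_inline (const DemoClass *klass, const char *func, const char *file, int line)
{
	if (klass == NULL) {
		/* Report the caller's location, not this helper's. */
		fprintf (stderr, "klass ptr is null in %s at %s:%d\n", func, file, line);
		abort ();
	}
	return klass->has_failure != 0;
}

/* Function-like macro: __func__, __FILE__ and __LINE__ expand at each call
 * site. There must be no space between the macro name and '('. */
#define demo_class_has_failure(klass) \
	demo_class_has_failure_inline ((klass), __func__, __FILE__, __LINE__)
```

Existing callers keep writing `demo_class_has_failure (k)` unchanged; only the diagnostic gains the caller's location. If the name were followed by a space before `(klass)`, the preprocessor would define an object-like macro whose replacement text begins with `(klass)`, and every call site would stop compiling.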

@matouskozak removed the blocking-clean-ci and Known Build Error labels Aug 15, 2023
@ghost added the in-pr label Aug 16, 2023
@ghost removed the in-pr label Aug 16, 2023
@ghost locked as resolved and limited conversation to collaborators Sep 16, 2023