Android emulator not booting completely on Helix queue #1448

Open
3 tasks
akoeplinger opened this issue Nov 20, 2023 · 8 comments

Comments

@akoeplinger
Member

akoeplinger commented Nov 20, 2023

Build

https://dev.azure.com/dnceng-public/public/_build/results?buildId=472093

Build leg reported

android-x86 Release AllSubsets_Mono

Pull Request

dotnet/runtime#93220

Known issue core information

Fill out the known issue JSON section by following the step-by-step documentation on how to create a known issue.

 {
    "ErrorMessage" : "Did not detect boot completion variable on device",
    "BuildRetry": false,
    "ErrorPattern": "",
    "ExcludeConsoleLog": false
 }

@dotnet/dnceng

Release Note Category

  • Feature changes/additions
  • Bug fixes
  • Internal Infrastructure Improvements

Release Note Description

Additional information about the issue reported

No response

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=472093
Error message validated: Did not detect boot completion variable on device
Result validation: ❌ Known issue did not match with the provided build.
Validation performed at: 11/20/2023 10:43:09 AM UTC

Report

Build | Definition | Test | Pull Request
963460 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution |
962946 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution |
962727 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112728
961063 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#110472
961015 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112853
960352 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112728
960054 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112632
959687 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112513
959166 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112753
959056 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution |
958954 | dotnet/runtime | Microsoft.Extensions.Logging.Tests.WorkItemExecution |
958865 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112782
958022 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112705
957330 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112594
957297 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112721
957284 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112543
955765 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112667
955572 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112662
955558 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112642
955508 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#111791
955417 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112632
955356 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112639
955192 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112595
954896 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112513
954833 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112461
953761 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112535
953310 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112593
952116 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112543
951650 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112404
951118 | dotnet/runtime | System.Runtime.Caching.Tests.WorkItemExecution | dotnet/runtime#112480
951120 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#111564
951065 | dotnet/runtime | Microsoft.Bcl.Numerics.Tests.WorkItemExecution |
950119 | dotnet/runtime | System.Drawing.Primitives.Tests.WorkItemExecution |
949904 | dotnet/runtime | Android.Device_Emulator.JIT.Test.WorkItemExecution | dotnet/runtime#112352
949878 | dotnet/runtime | System.Diagnostics.DiagnosticSource.Tests.WorkItemExecution |
948533 | dotnet/runtime | System.Net.WebSockets.Client.Tests.WorkItemExecution |
948247 | dotnet/runtime | IntrinsicsInSystemPrivateCoreLib.Tests.WorkItemExecution | dotnet/runtime#111666
947649 | dotnet/runtime | Microsoft.Bcl.Numerics.Tests.WorkItemExecution |
947319 | dotnet/runtime | System.Composition.AttributeModel.Tests.WorkItemExecution |
947329 | dotnet/runtime | System.Diagnostics.Process.Tests.WorkItemExecution |
946847 | dotnet/runtime | IntrinsicsInSystemPrivateCoreLib.Tests.WorkItemExecution |
946432 | dotnet/runtime | IntrinsicsInSystemPrivateCoreLib.Tests.WorkItemExecution |
942068 | dotnet/runtime | System.Collections.Immutable.Tests.WorkItemExecution |
941724 | dotnet/runtime | System.Security.Cryptography.OpenSsl.Tests.WorkItemExecution |
940311 | dotnet/runtime | Microsoft.Extensions.DependencyInjection.ExternalContainers.Tests.WorkItemExecution | dotnet/runtime#111666
939831 | dotnet/runtime | System.Runtime.Caching.Tests.WorkItemExecution | dotnet/runtime#111666

Summary

24-Hour Hit Count | 7-Day Hit Count | 1-Month Count
3 | 16 | 46

@premun
Member

premun commented Nov 20, 2023

FYI @dougbu, this seems to be catching cases where Android emulators are not booted properly.

@akoeplinger
Member Author

Not sure why the result validation doesn't match, do we need to set up something special to monitor the runtime-extra-platforms pipeline?

@dougbu
Member

dougbu commented Nov 20, 2023

This feels very similar to #1383 and #1415. The general theme is that the emulator either isn't starting as quickly as expected (there's a 5-minute loop checking for boot_completed in the XHarness case) or isn't starting at all. We haven't made much progress on either issue, partially because only @premun knows much about the emulators and he's busy elsewhere.

The ubuntu.2204.amd64.android.29.open queue is one of many we've had problems with when deploying in our staging environment.

I can see how dotnet/xharness#1106 could help here, and I suggest we keep an eye on this issue for additional hits.
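
For readers unfamiliar with the boot check discussed in this comment, here is a minimal Python sketch of a bounded polling loop on the sys.boot_completed property. It is not the XHarness implementation (that lives in AdbRunner.cs and is written in C#). The function names, the 10-second poll interval, and the injectable sleep/clock parameters are assumptions made for illustration; only the 5-minute budget and the property name come from the thread.

```python
import subprocess
import time
from typing import Callable, Optional


def read_boot_completed(serial: Optional[str] = None) -> str:
    """Return the emulator's sys.boot_completed value via adb, or "" on failure."""
    cmd = ["adb"]
    if serial:
        cmd += ["-s", serial]
    cmd += ["shell", "getprop", "sys.boot_completed"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # adb missing, not responding, or the device is not reachable yet.
        return ""
    return result.stdout.strip()


def wait_for_boot(
    read_prop: Callable[[], str] = read_boot_completed,
    timeout_s: float = 300.0,  # the 5-minute budget mentioned in the thread
    poll_s: float = 10.0,      # hypothetical poll interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the property reads "1" or the time budget is used up."""
    deadline = clock() + timeout_s
    while True:
        if read_prop() == "1":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

A loop like this returns as soon as the emulator reports ready, so extending the budget only costs time on machines that are slow or stuck. An emulator that never starts still fails after the timeout.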

@akoeplinger
Member Author

#1383 should be different since that is about Android devices, i.e. there's no emulator to start, so if they report not booted the device is usually hosed.

If the emulator issue is really about not starting fast enough I think I'd be happy if you add a sleep 5min into the VM provisioning as a quick workaround.

@AlitzelMendez
Member

> Not sure why the result validation doesn't match, do we need to set up something special to monitor the runtime-extra-platforms pipeline?

For this particular question, the problem is not the runtime-extra-platforms pipeline (we are analyzing it). It is a problem on our side: when Helix work items are retried internally, we do not analyze the logs of all the attempts. I created an issue for this: #1467
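
To illustrate the gap described here, the following Python sketch checks every attempt's log against the known issue fields rather than only the final attempt. It is not the actual known-issue analysis code. It assumes that "ErrorMessage" is matched as a plain, case-sensitive substring and "ErrorPattern" as a regular expression; the function names and sample logs are hypothetical.

```python
import re
from typing import Iterable, List


def matches_known_issue(log_text: str, error_message: str = "", error_pattern: str = "") -> bool:
    """Return True if a single log matches the known issue (assumed semantics)."""
    if error_message and error_message in log_text:
        return True
    if error_pattern and re.search(error_pattern, log_text):
        return True
    return False


def matching_attempts(
    attempt_logs: Iterable[str], error_message: str = "", error_pattern: str = ""
) -> List[int]:
    """Return the 1-based numbers of all attempts whose log matches."""
    return [
        number
        for number, log in enumerate(attempt_logs, start=1)
        if matches_known_issue(log, error_message, error_pattern)
    ]


# Hypothetical logs: the first attempt hit the emulator problem, the retry passed.
logs = [
    "Did not detect boot completion variable on device; exiting",
    "All tests passed",
]
message = "Did not detect boot completion variable on device"
print(matching_attempts(logs, error_message=message))
print(matches_known_issue(logs[-1], error_message=message))
```

With these sample logs, only the first attempt matches. A check limited to the last attempt would therefore report no match, which is consistent with the failed validation above.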

@dougbu
Member

dougbu commented Nov 23, 2023

> I'd be happy if you add a sleep 5min into the VM provisioning as a quick workaround.

There's a 5-minute loop just prior to the failing sys.boot_completed search in the function starting at https://github.com/dotnet/xharness/blob/38841f0f33ca713ca5d6388c681bdd911425b488/src/Microsoft.DotNet.XHarness.Android/AdbRunner.cs#L191

Personally, I'm nervous about adding Thread.Sleep(...) in that code because @premun seemed confident that my similar actions for #1415 (where I extended a loop searching for a different readiness signal) were unhelpful. We found that "fix" only reduced the likelihood of our validation failures; a build soon after my fix went in failed again and we (temporarily❔) gave up.

If someone understands dotnet/xharness better, please chime in❕

@dougbu
Member

dougbu commented Jan 8, 2024

@akoeplinger how are things going w/ your fix attempt(s)❔

@akoeplinger
Member Author

I just came back from vacation, will take another stab at this early next week :)
