System.Threading.Tasks.Tests timed out on net5.0-Linux-Debug-arm64-Mono_release #42024

Closed
danmoseley opened this issue Sep 9, 2020 · 39 comments
@danmoseley
Member

danmoseley commented Sep 9, 2020

net5.0-Linux-Debug-arm64-Mono_release-(Ubuntu.1804.ArmArch.Open)Ubuntu.1804.ArmArch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-16.04-helix-arm64v8-bfcd90a-20200127194925

----- start Wed Sep 9 13:12:33 UTC 2020 =============== To repro directly: =====================================================
pushd .
/root/helix/work/correlation/dotnet exec --runtimeconfig System.Threading.Tasks.Tests.runtimeconfig.json --depsfile System.Threading.Tasks.Tests.deps.json xunit.console.dll System.Threading.Tasks.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing 
popd
===========================================================================================================
/root/helix/work/workitem /root/helix/work/workitem
  Discovering: System.Threading.Tasks.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Threading.Tasks.Tests (found 558 of 738 test cases)
  Starting:    System.Threading.Tasks.Tests (parallel test collections = on, max threads = 6)
    System.Threading.Tasks.Tests.ExecutionContextFlowTest.TaskCompletionSourceDoesntCaptureExecutionContext [SKIP]
      Condition(s) not met: "IsPreciseGcSupported"
    System.Threading.Tasks.Tests.TaskSchedulerTests.GetTaskSchedulersForDebugger_DebuggerAttached_ReturnsAllSchedulers [SKIP]
      Condition(s) not met: "DebuggerIsAttached"
    System.Threading.Tasks.Tests.TaskSchedulerTests.GetScheduledTasksForDebugger_DebuggerAttached_ReturnsTasksFromCustomSchedulers [SKIP]
      Condition(s) not met: "DebuggerIsAttached"
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:02:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:04:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:06:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:08:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:10:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:12:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:14:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:16:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:18:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:20:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:22:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:24:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:26:04
   System.Threading.Tasks.Tests: [Long Running Test] 'System.Threading.Tasks.Tests.TaskContinueWithTests.RunContinueWithParamsTest_IllegalArgs', Elapsed: 00:28:04

...
[EXECUTION TIMED OUT]
Exit Code: -3
Executor timed out after 1800 seconds and was killed
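
As a rough aid for reading a log like the one above, the "[Long Running Test]" lines can be scanned to see which test was still running and for how long. This is only a sketch: the function name and the regular expression are assumptions based on the line format shown here, not part of the Helix or xunit tooling.

```python
import re
from datetime import timedelta

# Matches console lines of the form shown in the log above, for example:
#   ... [Long Running Test] 'Namespace.Class.Method', Elapsed: 00:28:04
LONG_RUNNING = re.compile(
    r"\[Long Running Test\] '(?P<name>[^']+)', "
    r"Elapsed: (?P<h>\d+):(?P<m>\d+):(?P<s>\d+)"
)

def longest_running_tests(log_text):
    """Return {test name: longest elapsed time reported} for a console log."""
    longest = {}
    for match in LONG_RUNNING.finditer(log_text):
        elapsed = timedelta(
            hours=int(match["h"]),
            minutes=int(match["m"]),
            seconds=int(match["s"]),
        )
        name = match["name"]
        # Keep only the largest elapsed value seen for each test.
        if elapsed > longest.get(name, timedelta(0)):
            longest[name] = elapsed
    return longest
```

A test whose longest reported elapsed time is close to the executor timeout is a likely candidate for the hang.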

https://helixre107v0xdeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-42014-merge-47dea998f8584728a3/System.Threading.Tasks.Tests/console.181c257c.log?sv=2019-02-02&se=2020-09-29T13%3A09%3A28Z&sr=c&sp=rl&sig=lQ0DpS2rIbgdmMDvCYSnjAiZcVgh3Mgaf9fv5VNVCog%3D
https://dev.azure.com/dnceng/public/_build/results?buildId=807090&view=ms.vss-test-web.build-test-results-tab&runId=25594170&resultId=178179&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab

Can't say whether it's a Mono issue, or a rare testcase issue that hasn't shown up in other configurations yet.

Note there is no dump, but @ViktorHofer is working to migrate to vstest, at which point we will have dumps for hangs and timeouts like this.

Runfo Tracking Issue: System.Threading.Tasks.Tests timeout Mono

Build Definition Kind Run Name Console Core Dump Test Results Run Client
1067051 runtime PR 50479 net6.0-OSX-Debug-x64-Mono_release-OSX.1014.Amd64.Open console.log core dump runclient.py
1067051 runtime PR 50479 net6.0-OSX-Debug-x64-Mono_release-OSX.1015.Amd64.Open console.log core dump
1067051 runtime PR 50479 net6.0-Linux-Debug-x64-Mono_release-(Centos.8.Amd64.Open)Ubuntu.1604.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:centos-8-helix-20201229003624-c1bf759 console.log core dump
1067051 runtime PR 50479 net6.0-Linux-Debug-x64-Mono_release-RedHat.7.Amd64.Open console.log core dump
1067051 runtime PR 50479 net6.0-Linux-Debug-x64-Mono_release-(Debian.10.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 console.log core dump
1067051 runtime PR 50479 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1604.Amd64.Open console.log core dump
1067051 runtime PR 50479 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1804.Amd64.Open console.log core dump
1067051 runtime PR 50479 net6.0-Linux-Debug-x64-Mono_release-SLES.15.Amd64.Open console.log core dump
1067051 runtime PR 50479 net6.0-Linux-Debug-x64-Mono_release-(Fedora.30.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-30-helix-20200512010621-4f8cef7 console.log core dump
1067051 runtime PR 50479 net6.0-Linux-Debug-arm64-Mono_release-(Ubuntu.1804.ArmArch.Open)Ubuntu.1804.ArmArch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-16.04-helix-arm64v8-20210106155927-56c6673 console.log core dump
1067051 runtime PR 50479 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1057897 runtime PR 49762 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log runclient.py
1050243 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1050243 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1047146 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1047146 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1047146 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1047146 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1046026 runtime PR 49740 net6.0-Linux-Debug-x64-mono_interpreter_release-Debian.9.Amd64.Open console.log core dump
1046026 runtime PR 49740 net6.0-Linux-Debug-x64-Mono_release-(Centos.8.Amd64.Open)Ubuntu.1604.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:centos-8-helix-20201229003624-c1bf759 console.log core dump
1046026 runtime PR 49740 net6.0-Linux-Debug-x64-Mono_release-RedHat.7.Amd64.Open console.log core dump runclient.py
1046026 runtime PR 49740 net6.0-Linux-Debug-x64-Mono_release-(Debian.10.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 console.log core dump
1046026 runtime PR 49740 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1604.Amd64.Open console.log core dump
1046026 runtime PR 49740 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1804.Amd64.Open console.log core dump
1046026 runtime PR 49740 net6.0-Linux-Debug-x64-Mono_release-SLES.15.Amd64.Open console.log core dump
1046026 runtime PR 49740 net6.0-Linux-Debug-x64-Mono_release-(Fedora.30.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-30-helix-20200512010621-4f8cef7 console.log core dump
1046026 runtime PR 49740 net6.0-Linux-Debug-arm64-Mono_release-(Ubuntu.1804.ArmArch.Open)Ubuntu.1804.ArmArch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-16.04-helix-arm64v8-20210106155927-56c6673 console.log core dump
1046026 runtime PR 49740 net6.0-OSX-Debug-x64-Mono_release-OSX.1014.Amd64.Open console.log core dump
1046026 runtime PR 49740 net6.0-OSX-Debug-x64-Mono_release-OSX.1015.Amd64.Open console.log core dump
1045420 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1045420 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1041040 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1041040 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1041040 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1041040 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1041040 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1041040 runtime PR 49635 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1038213 runtime PR 49511 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log core dump test results
1038213 runtime PR 49511 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log core dump test results
1027303 runtime PR 49072 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1027303 runtime PR 49072 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1027001 runtime PR 49072 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1025834 runtime PR 49260 net6.0-OSX-Debug-x64-Mono_release-OSX.1014.Amd64.Open console.log core dump
1025834 runtime PR 49260 net6.0-OSX-Debug-x64-Mono_release-OSX.1015.Amd64.Open console.log core dump
1025834 runtime PR 49260 net6.0-Linux-Debug-x64-Mono_release-(Centos.8.Amd64.Open)Ubuntu.1604.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:centos-8-helix-20201229003624-c1bf759 console.log core dump
1025834 runtime PR 49260 net6.0-Linux-Debug-x64-Mono_release-RedHat.7.Amd64.Open console.log core dump
1025834 runtime PR 49260 net6.0-Linux-Debug-x64-Mono_release-(Debian.10.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 console.log core dump
1025834 runtime PR 49260 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1604.Amd64.Open console.log core dump
1025834 runtime PR 49260 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1804.Amd64.Open console.log core dump
1025834 runtime PR 49260 net6.0-Linux-Debug-x64-Mono_release-SLES.15.Amd64.Open console.log core dump
1025834 runtime PR 49260 net6.0-Linux-Debug-x64-Mono_release-(Fedora.30.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-30-helix-20200512010621-4f8cef7 console.log core dump
1022669 runtime PR 49072 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
1022669 runtime PR 49072 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
1019817 runtime PR 47864 net6.0-OSX-Debug-arm64-Mono_release-OSX.1100.ARM64.Open console.log core dump
1016780 runtime PR 48908 net6.0-OSX-Debug-x64-Mono_release-OSX.1014.Amd64.Open console.log core dump
1016780 runtime PR 48908 net6.0-OSX-Debug-x64-Mono_release-OSX.1015.Amd64.Open console.log core dump
1016780 runtime PR 48908 net6.0-Linux-Debug-x64-Mono_release-(Centos.8.Amd64.Open)Ubuntu.1604.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:centos-8-helix-20201229003624-c1bf759 console.log core dump
1016780 runtime PR 48908 net6.0-Linux-Debug-x64-Mono_release-RedHat.7.Amd64.Open console.log core dump
1016780 runtime PR 48908 net6.0-Linux-Debug-x64-Mono_release-(Debian.10.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 console.log core dump
1016780 runtime PR 48908 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1604.Amd64.Open console.log core dump
1016780 runtime PR 48908 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1804.Amd64.Open console.log core dump runclient.py
1016780 runtime PR 48908 net6.0-Linux-Debug-x64-Mono_release-SLES.15.Amd64.Open console.log core dump
1016780 runtime PR 48908 net6.0-Linux-Debug-x64-Mono_release-(Fedora.30.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-30-helix-20200512010621-4f8cef7 console.log core dump
1016780 runtime PR 48908 net6.0-Linux-Debug-x64-mono_interpreter_release-Debian.9.Amd64.Open console.log core dump
1016780 runtime PR 48908 net6.0-Linux-Debug-arm64-Mono_release-(Ubuntu.1804.ArmArch.Open)Ubuntu.1804.ArmArch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-16.04-helix-arm64v8-20210106155927-56c6673 console.log core dump
1012729 runtime PR 48788 net5.0-Linux-Debug-x64-Mono_release-(Ubuntu.1910.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-19.10-helix-amd64-cfcfd50-20191030180623 console.log runclient.py
997608 runtime PR 48254 net6.0-Linux-Debug-x64-Mono_release-SLES.15.Amd64.Open console.log runclient.py
996581 runtime PR 48348 net6.0-Linux-Debug-x64-Mono_release-(Fedora.30.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-30-helix-20200512010621-4f8cef7 console.log runclient.py
995855 runtime Rolling net6.0-Linux-Release-x64-Mono_release-(Fedora.30.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-30-helix-20200512010621-4f8cef7 console.log runclient.py
995655 runtime PR 48322 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1804.Amd64.Open console.log runclient.py
995438 runtime PR 48254 net6.0-Linux-Debug-x64-Mono_release-(Centos.8.Amd64.Open)Ubuntu.1604.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:centos-8-helix-20201229003624-c1bf759 console.log runclient.py
994830 runtime Rolling net6.0-Linux-Release-x64-Mono_release-(Debian.10.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 console.log runclient.py
994178 runtime PR 48276 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1604.Amd64.Open console.log runclient.py
991878 runtime PR 48160 net6.0-Linux-Debug-x64-Mono_release-(Debian.10.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 console.log runclient.py
985761 runtime Rolling net6.0-Linux-Release-x64-Mono_release-RedHat.7.Amd64.Open console.log runclient.py
985190 runtime PR 47589 net6.0-Linux-Debug-x64-Mono_release-SLES.15.Amd64.Open console.log runclient.py
984199 runtime PR 48023 net6.0-OSX-Debug-x64-Mono_release-OSX.1014.Amd64.Open console.log
984199 runtime PR 48023 net6.0-OSX-Debug-x64-Mono_release-OSX.1015.Amd64.Open console.log core dump
984199 runtime PR 48023 net6.0-Linux-Debug-x64-mono_interpreter_release-Debian.9.Amd64.Open console.log core dump
984199 runtime PR 48023 net6.0-Linux-Debug-x64-Mono_release-(Centos.8.Amd64.Open)Ubuntu.1604.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:centos-8-helix-20201229003624-c1bf759 console.log core dump
984199 runtime PR 48023 net6.0-Linux-Debug-x64-Mono_release-RedHat.7.Amd64.Open console.log core dump
984199 runtime PR 48023 net6.0-Linux-Debug-x64-Mono_release-(Debian.10.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 console.log core dump
984199 runtime PR 48023 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1604.Amd64.Open console.log core dump
984199 runtime PR 48023 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1804.Amd64.Open console.log core dump
984199 runtime PR 48023 net6.0-Linux-Debug-x64-Mono_release-SLES.15.Amd64.Open console.log core dump
984199 runtime PR 48023 net6.0-Linux-Debug-x64-Mono_release-(Fedora.30.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-30-helix-20200512010621-4f8cef7 console.log core dump
984199 runtime PR 48023 net6.0-Linux-Debug-arm64-Mono_release-(Ubuntu.1804.ArmArch.Open)Ubuntu.1804.ArmArch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-16.04-helix-arm64v8-20210106155927-56c6673 console.log core dump
983983 runtime PR 47589 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1804.Amd64.Open console.log runclient.py
982990 runtime PR 47991 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1804.Amd64.Open console.log runclient.py
982935 runtime PR 44608 net6.0-Linux-Debug-x64-Mono_release-(Centos.8.Amd64.Open)Ubuntu.1604.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:centos-8-helix-20201229003624-c1bf759 console.log test results runclient.py
981976 runtime PR 47958 net6.0-Linux-Debug-x64-Mono_release-(Debian.10.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:debian-10-helix-amd64-bfcd90a-20200121150006 console.log runclient.py
981438 runtime PR 47945 net6.0-Linux-Debug-x64-Mono_release-SLES.15.Amd64.Open console.log runclient.py
981438 runtime PR 47945 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1804.Amd64.Open console.log runclient.py
980574 runtime PR 47918 net6.0-Linux-Debug-x64-Mono_release-(Fedora.30.Amd64.Open)ubuntu.1604.amd64.open@mcr.microsoft.com/dotnet-buildtools/prereqs:fedora-30-helix-20200512010621-4f8cef7 console.log runclient.py
980524 runtime PR 47864 net6.0-OSX-Debug-arm64-Mono_release-OSX.1100.ARM64.Open console.log core dump
980524 runtime PR 47864 net6.0-OSX-Debug-arm64-Mono_release-OSX.1100.ARM64.Open console.log core dump
980524 runtime PR 47864 net6.0-OSX-Debug-arm64-Mono_release-OSX.1100.ARM64.Open console.log core dump
980442 runtime PR 47857 net6.0-Browser-Release-wasm-Mono_Release-normal-Ubuntu.1804.Amd64.Open console.log
980442 runtime PR 47857 net6.0-Browser-Release-wasm-Mono_Release-wasmtestonbrowser-Ubuntu.1804.Amd64.Open console.log
973640 runtime PR 47519 net6.0-Linux-Debug-x64-Mono_release-Ubuntu.1604.Amd64.Open console.log runclient.py

Displaying 100 of 129 results

Build Result Summary

Day Hit Count: 0
Week Hit Count: 0
Month Hit Count: 8
Dotnet-GitSync-Bot added the area-System.Threading and untriaged labels Sep 9, 2020
@danmoseley
Member Author

#42014

@steveisok
Member

According to Kusto, there are only 13 records of this method taking longer than 1 second.

Is there a good way to query how many hangs?

@danmoseley
Member Author

@MattGal

@MattGal
Member

MattGal commented Sep 9, 2020

taking a look.

@MattGal
Member

MattGal commented Sep 9, 2020

Given it's only timed out 5x in mono and 3x in not-mono for the history of everything in the DB (less than .1%) I'd go with "rare testcase issue that hasn't shown up in other configurations yet."

// Collect the IDs of Mono jobs that ran on the arm64 Ubuntu 18.04 queue.
let monoJobs = Jobs
| extend props=parse_json(Properties)
| where props["runtimeFlavor"] == "mono" and QueueName == "ubuntu.1804.armarch.open"
| project JobId;
// Count this suite's work items in those jobs, grouped by final status.
WorkItems
| where JobId in (monoJobs)
| where FriendlyName == "System.Threading.Tasks.Tests"
| summarize count() by Status

// Same query for CoreCLR jobs (=~ is a case-insensitive comparison).
let librariesJobs = Jobs
| extend props=parse_json(Properties)
| where props["runtimeFlavor"] =~ "coreclr" and QueueName == "ubuntu.1804.armarch.open"
| project JobId;
WorkItems
| where JobId in (librariesJobs)
| where FriendlyName == "System.Threading.Tasks.Tests"
| summarize count() by Status

@danmoseley
Member Author

According to Kusto, there are only 13 records of this method taking longer than 1 second.

Sounds like this isn't a case of getting unlucky at the tail end of a distribution of test durations, then. If the tests and product work correctly, it should take no more than a second or two, even on a heavily loaded machine. In this case, it did not finish in several minutes. That suggests to me that either the tests, the library, or the runtime has a flaw that occasionally causes a genuine hang. The key thing we need is a dump file. So +1 for migrating to vstest, and I guess we wait in the meantime.

@mangod9
Member

mangod9 commented Sep 10, 2020

Moving to area-Infrastructure, since there is no System.Threading action at the moment.

@ghost

ghost commented Sep 10, 2020

Tagging subscribers to this area: @ViktorHofer
See info in area-owners.md if you want to be subscribed.

@ericstj
Member

ericstj commented Sep 11, 2020

This bug is tracking a test failure in Threading not an infrastructure issue. Infrastructure can help debug but this isn’t tracking any infra work.

@ghost

ghost commented Sep 11, 2020

Tagging subscribers to this area: @tarekgh
See info in area-owners.md if you want to be subscribed.

ericstj added this to the 6.0.0 milestone Sep 11, 2020
jeffschwMSFT removed the untriaged label Sep 11, 2020
tarekgh added the arch-arm64 and runtime-mono labels Sep 11, 2020
@danmoseley
Member Author

Hit again in #42014

@jaredpar
Member

@ericstj
Member

ericstj commented Sep 29, 2020

cc @SamMonoRT @lambdageek since this is mono.

@steveisok
Member

I ran a Kusto query for both mono and coreclr over the last week (to get the coreclr numbers, substitute =~ "coreclr" for == "mono" in the runtimeFlavor filter):

// Collect the IDs of Mono jobs that ran on the arm64 Ubuntu 18.04 queue.
let monoJobs = Jobs
| extend props=parse_json(Properties)
| where props["runtimeFlavor"] == "mono" and QueueName == "ubuntu.1804.armarch.open"
| project JobId;
// Count this suite's work items started after 2020-09-22, grouped by status.
WorkItems
| where JobId in (monoJobs)
| where FriendlyName == "System.Threading.Tasks.Tests"
| where Started > datetime("2020-09-22")
| summarize count() by Status

Mono:

	Pass	 414
	BadExit	 4
	InfraRetry 2
	Timeout	 1

CoreCLR:

	Pass	811
	BadExit	18

What does BadExit mean? And assuming the query is correct, there doesn't seem to be a significant amount of timeouts.

@MattGal
Member

MattGal commented Sep 29, 2020

What does BadExit mean? And assuming the query is correct, there doesn't seem to be a significant amount of timeouts.

non-zero exit codes. Seg faults, script failures, that sort of thing.

@ViktorHofer
Member

This still seems very much active

@tarekgh @stephentoub can you please take a look? Note that the timeout might be mono specific.

@jaredpar
Member

Added a live table of failures to the issue.

@tarekgh
Member

tarekgh commented Sep 30, 2020

can you please take a look? Note that the timeout might be mono specific.

I think Mono's team can take a look as this is happening on Mono's runtime only.

@marek-safar could you please get someone to look at this one?

@SamMonoRT
Member

@steveisok - any bandwidth to take an initial look?

@marek-safar
Contributor

Is it a Mono issue only? @steveisok's query in #42024 (comment) indicates that CoreCLR is crashing even more often.

@tarekgh
Member

tarekgh commented Sep 30, 2020

@marek-safar Mono is the one timing out, coreclr is not.

@jaredpar
Member

Agree with @tarekgh about the CoreCLR version of the failure. Clicking through ~5 failures they aren't timing out. Also all the failures are on PRs so it's not clear if that is a real failure or failure because the change being tested was bad.

CoreCLR specific failures

CoreCLR specific failures excluding PRs. Note that this data set is empty over last seven days.

@steveisok
Member

Feels like there's a bit to unravel here as the original issue was specific to net5.0-Linux-Debug-arm64-Mono_release. If you look at that queue alone, there does not seem to be that big of an issue. If I expand the Kusto query out to all queues, we have the following numbers (all runs in the last week):

Mono:

	Pass	        7804
	Timeout	        44
	BadExit	        115
	Error	        39
	InfraRetry	6
	DeadLetter	20
	Fail	                 1

CoreCLR:

	Pass	        10756
	BadExit	        208
	Timeout	        4
	Error	        2
	InfraRetry	3
	DeadLetter	17
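
To put the two sets of numbers on the same footing, the timeout rate can be computed as timeouts divided by all work items. The sketch below simply copies the counts listed above; the function name is made up for illustration.

```python
# Work item counts copied from the all-queues query results above.
mono = {"Pass": 7804, "Timeout": 44, "BadExit": 115, "Error": 39,
        "InfraRetry": 6, "DeadLetter": 20, "Fail": 1}
coreclr = {"Pass": 10756, "BadExit": 208, "Timeout": 4, "Error": 2,
           "InfraRetry": 3, "DeadLetter": 17}

def timeout_rate(counts):
    """Fraction of all work items whose status was Timeout."""
    return counts.get("Timeout", 0) / sum(counts.values())

print(f"Mono:    {timeout_rate(mono):.2%}")
print(f"CoreCLR: {timeout_rate(coreclr):.2%}")
```

With these counts, Mono times out in roughly 0.55% of work items and CoreCLR in roughly 0.04%, so timeouts are rare on both but noticeably more frequent on Mono.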

A cursory glance at some of the builds that @jaredpar linked in the issue seems to indicate infrastructure-induced timeouts/cancellations like:

https://dev.azure.com/dnceng/public/_build/results?buildId=836799&view=results https://dev.azure.com/dnceng/public/_build/results?buildId=836302&view=results

And a weird error of "API rate limit exceeded. Maximum allowed 90 per 1m." in https://dev.azure.com/dnceng/public/_build/results?buildId=835053&view=logs&j=3f95ca23-483b-5dc9-4b05-6bd24fb925c2&t=86f401ca-968f-52b7-9834-4ba449ed1b68

More analysis is definitely necessary before drawing further conclusions.

Should we change the title of the issue to be something more expansive? Or close it and link to a new one? Thoughts?

@MattGal
Member

MattGal commented Sep 30, 2020

And a weird error of "API rate limit exceeded. Maximum allowed 90 per 1m." in https://dev.azure.com/dnceng/public/_build/results?buildId=835053&view=logs&j=3f95ca23-483b-5dc9-4b05-6bd24fb925c2&t=86f401ca-968f-52b7-9834-4ba449ed1b68

https://github.com/dotnet/core-eng/issues/11021

This seems to mostly be the fact that all the agents in buildpool scale sets are in the same network topology behind a NAT, so our throttling limits may be too low currently.

@steveisok
Member

steveisok commented Sep 30, 2020

I took a closer look at each mono item that @jaredpar put in the issue. I'm trying to get to a point where we can determine if there is a significant issue w/ the suite or not. I've bucketed them as follows:

Timeouts on a Single Arch (Legit)

https://dev.azure.com/dnceng/public/_build/results?buildId=836007&view=results
https://dev.azure.com/dnceng/public/_build/results?buildId=828912&view=results

Timeouts Across Many Legs / Arch (Potentially infra driven)

https://dev.azure.com/dnceng/public/_build/results?buildId=836799
https://dev.azure.com/dnceng/public/_build/results?buildId=836302&view=results
https://dev.azure.com/dnceng/public/_build/results?buildId=835053&view=results
https://dev.azure.com/dnceng/public/_build/results?buildId=834855&view=results
https://dev.azure.com/dnceng/public/_build/results?buildId=834366&view=results
https://dev.azure.com/dnceng/public/_build/results?buildId=833333&view=results
https://dev.azure.com/dnceng/public/_build/results?buildId=832448&view=results
https://dev.azure.com/dnceng/public/_build/results?buildId=827350&view=results

No Problems

https://dev.azure.com/dnceng/public/_build/results?buildId=832690&view=results
https://dev.azure.com/dnceng/public/_build/results?buildId=831748&view=results

Not Sure

https://dev.azure.com/dnceng/public/_build/results?buildId=828501&view=results

Configuration Related / Experimental PR (Likely Not a Problem)

https://dev.azure.com/dnceng/public/_build/results?buildId=835790&view=results
https://dev.azure.com/dnceng/public/_build/results?buildId=829731&view=results

@jaredpar
Member

jaredpar commented Sep 30, 2020

@MattGal

That API rate limit error happens a lot

https://runfo.azurewebsites.net/search/timelines/?bq=definition%3Aruntime+started%3A%7E7&tq=API+rate+limit+exceeded.+Maximum+allowed+90

500+ occurrences in the last week on the runtime build definition

@MattGal
Member

MattGal commented Sep 30, 2020

@MattGal

That API rate limit error happens a lot

https://runfo.azurewebsites.net/search/timelines/?bq=definition%3Aruntime+started%3A%7E7&tq=API+rate+limit+exceeded.+Maximum+allowed+90

500+ occurrences in the last week on the runtime build definition

I've already merged a quota increase that would help here, and am discussing whether we can hotfix this (it's just changing #s in a JSON file, so maybe?) today. The tricky part about your runfo stuff is that many of those runs didn't change their fail-y-ness (e.g. https://dev.azure.com/dnceng/public/_build/results?buildId=836918&view=logs&j=d5c01a48-52b8-51d9-fe3a-6804ba4b63f8&t=215202f9-e149-511e-645c-558c2532aa74 ) because of throttling, so it's hard to say how many runs are actually broken by it, but I share your concerns and am trying to expedite it if possible.

@jaredpar
Member

The tricky part about your runfo stuff is that many of those runs didn't change their fail-y-ness (e

Not sure what you mean here. These are all timeline issue errors so that will default to failing the build.

@MattGal
Member

MattGal commented Sep 30, 2020

The tricky part about your runfo stuff is that many of those runs didn't change their fail-y-ness (e

Not sure what you mean here. These are all timeline issue errors so that will default to failing the build.

I mean runfo caught this but it failed exactly the same as it was going to fail without any 429s: https://dev.azure.com/dnceng/public/_build/results?buildId=836918&view=logs&j=d5c01a48-52b8-51d9-fe3a-6804ba4b63f8&t=215202f9-e149-511e-645c-558c2532aa74

@jaredpar
Member

So essentially you want to know when the only source of errors in a build were these 429s?

@MattGal
Member

MattGal commented Sep 30, 2020

So essentially you want to know when the only source of errors in a build were these 429s?

If you can figure out how to do that, that'd be rad. I'm not sure how easy it is or whether it's worth it. A hotfix to more than double the quota per minute is rolling out presently and should be live in the next 30 minutes.
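
One way the filter discussed here might look, as a sketch only: given the error messages recorded for each build, keep the builds in which every error is a rate limit error. The input shape (a dict of build id to a list of messages) and the function name are hypothetical; this does not use the runfo or Azure DevOps APIs.

```python
# Marker text taken from the error message quoted earlier in the thread.
RATE_LIMIT_MARKER = "API rate limit exceeded"

def builds_failed_only_by_rate_limit(errors_by_build):
    """Return ids of builds whose recorded errors are all rate limit errors.

    errors_by_build: {build id: [error message, ...]} (hypothetical shape).
    Builds with no recorded errors are skipped.
    """
    return [
        build_id
        for build_id, errors in errors_by_build.items()
        if errors and all(RATE_LIMIT_MARKER in message for message in errors)
    ]
```

Builds that also contain other errors would have failed anyway, which is the case described above where throttling did not change the outcome.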

@lambdageek
Member

A couple of days ago, I ran this test suite locally about 300 times in a loop and couldn't get it to hang on OSX. (Runtime and libs built with ./build.sh -s Mono+Libs+Libs.Test -c Release -rc Debug.)
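
A loop like the one described could be sketched as follows. This is not the script that was actually used; the function name and return shape are assumptions, and the command to run would be whatever launches the suite locally (for example, the `dotnet exec ... xunit.console.dll` line from the original log).

```python
import subprocess

def run_repeatedly(command, iterations, timeout_seconds):
    """Run `command` up to `iterations` times, stopping at the first problem.

    Returns (iteration, "timeout") if a run exceeds `timeout_seconds`,
    (iteration, "failed") if a run exits non-zero, or None if all runs pass.
    """
    for i in range(1, iterations + 1):
        try:
            # subprocess.run kills the child and raises if the timeout expires.
            completed = subprocess.run(command, timeout=timeout_seconds)
        except subprocess.TimeoutExpired:
            return (i, "timeout")
        if completed.returncode != 0:
            return (i, "failed")
    return None
```

Because the suite normally finishes quickly, a per-run timeout well below the 1800-second executor limit would flag a hang much sooner.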

stephentoub removed the blocking-clean-ci label Jul 26, 2021
@stephentoub
Member

Looks like this test failed frequently for a few weeks and then hasn't failed in over six months. I'm going to close this for now.

stephentoub reopened this Aug 4, 2021
@stephentoub
Member

stephentoub commented Aug 4, 2021

Ok, my runfo search skills are apparently quite lacking. Not sure what I searched for that yielded an empty set.

@stephentoub
Member

stephentoub commented Aug 4, 2021

Actually, no, I was right the first time. None of those failures are this test. A couple of them are RunContinueWithStressTestsNoState, which is also in the System.Threading.Tasks test suite... but it's supposed to be disabled:

[ActiveIssue("https://github.com/dotnet/runtime/issues/2271")]

so I don't know why it's showing up in test results from the last few days. And that one is tracked by a different issue, #2271.

@jaredpar
Member

jaredpar commented Aug 4, 2021

Actually, no, I was right the first time. None of those failures are this test

My bad. I was looking at the title of the issue which had the test group.

@akoeplinger
Member

akoeplinger commented Aug 4, 2021

@stephentoub the only instance of the RunContinueWithStressTestsNoState I can find is from a release/5.0 build and the test doesn't have ActiveIssue there.

@stephentoub
Member

Ah, that explains that then.

ghost locked as resolved and limited conversation to collaborators Sep 3, 2021