OOM on outerloop-linux runs of runtime-libraries-coreclr pipeline on System.Runtime.Tests and System.Runtime.Numerics.Tests #89173

Open
akoeplinger opened this issue Jul 19, 2023 · 2 comments
Labels
area-System.Numerics blocking-clean-ci-optional Blocking optional rolling runs blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs os-linux Linux OS (any supported distro) runtime-coreclr specific to the CoreCLR runtime test-bug Problem in test source code (most likely)

@akoeplinger
Member

akoeplinger commented Jul 19, 2023

The System.Runtime and System.Runtime.Numerics work items consistently fail on the linux outerloop pipeline:

/root/helix/work/workitem/e /root/helix/work/workitem/e
  Discovering: System.Runtime.Numerics.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Runtime.Numerics.Tests (found 7 of 562 test cases)
  Starting:    System.Runtime.Numerics.Tests (parallel test collections = on, max threads = 2)
./RunTests.sh: line 168:    22 Killed                  "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Runtime.Numerics.Tests.runtimeconfig.json --depsfile System.Runtime.Numerics.Tests.deps.json xunit.console.dll System.Runtime.Numerics.Tests.dll -xml testResults.xml -nologo -nocolor -trait category=OuterLoop -notrait category=IgnoreForCI -notrait category=failing $RSP_FILE
/root/helix/work/workitem/e
----- end Fri 07 Jul 2023 11:41:50 AM UTC ----- exit code 137 ----------------------------------------------------------
exit code 137 means SIGKILL Killed eg by kill
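For readers unfamiliar with the convention behind the last log line: a shell reports a process killed by a signal as 128 plus the signal number, so 137 is 128 + 9 (SIGKILL on Linux). A minimal sketch of that decoding, where `describe_exit_code` is a hypothetical helper name and not part of the Helix tooling:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a shell-style exit status.

    Values above 128 conventionally mean "terminated by signal (code - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            # Map the number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # Not a named signal on this platform; fall back to the number.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))
```

SIGKILL alone does not say who sent it, which is why the kernel log further down is needed to confirm the OOM killer.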

Looking further down in the log we can see that they've been killed by the Out of Memory (OOM) killer:

[ 2661.445879] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/docker/6a407bb075622c6a7f6a9a7b88ddd30217d8294911afde258ca18d342721bf3b,task=dotnet,pid=25681,uid=1000
[ 2661.445924] Out of memory: Killed process 25681 (dotnet) total-vm:273925828kB, anon-rss:7218652kB, file-rss:0kB, shmem-rss:14728kB, UID:1000 pgtables:14564kB oom_score_adj:0
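To put the numbers in perspective: the killed process had roughly 6.9 GiB of anonymous resident memory (`anon-rss`), and `global_oom` with `CONSTRAINT_NONE` suggests the whole machine ran out of memory rather than a cgroup limit being hit. A rough sketch for pulling these fields out of such a line follows; `parse_oom_kill` and `kb_to_gib` are hypothetical helpers, and the regular expressions assume the line format shown above.

```python
import re

# The "Out of memory" line from the log above, without the timestamp prefix.
OOM_LINE = (
    "Out of memory: Killed process 25681 (dotnet) total-vm:273925828kB, "
    "anon-rss:7218652kB, file-rss:0kB, shmem-rss:14728kB, UID:1000 "
    "pgtables:14564kB oom_score_adj:0"
)


def parse_oom_kill(line):
    """Extract pid, task name and the kB-sized counters from an OOM kill line.

    Returns None if the line does not look like an OOM kill report.
    """
    match = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if match is None:
        return None
    # Collect every "name:<digits>kB" pair, e.g. anon-rss:7218652kB.
    sizes_kb = {name: int(value) for name, value in re.findall(r"([\w-]+):(\d+)kB", line)}
    return {"pid": int(match.group(1)), "task": match.group(2), "sizes_kb": sizes_kb}


def kb_to_gib(kb):
    """Convert kibibytes (the kernel's 'kB') to GiB."""
    return kb / (1024 * 1024)


if __name__ == "__main__":
    info = parse_oom_kill(OOM_LINE)
    for name, kb in info["sizes_kb"].items():
        print(f"{name}: {kb_to_gib(kb):.2f} GiB")
```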

This has been failing for as long as AzDO has logs, which go back to June 19th.

@akoeplinger akoeplinger added os-linux Linux OS (any supported distro) area-Infrastructure runtime-coreclr specific to the CoreCLR runtime blocking-clean-ci-optional Blocking optional rolling runs blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs labels Jul 19, 2023
@akoeplinger akoeplinger added this to the 8.0.0 milestone Jul 19, 2023
@ghost

ghost commented Jul 19, 2023

Tagging subscribers to this area: @dotnet/runtime-infrastructure
See info in area-owners.md if you want to be subscribed.

@ericstj
Member

ericstj commented Aug 14, 2023

This is unlikely to be a product bug. More likely it is an issue with a test that needs to be better conditioned for its environment.
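One way to read "better conditioned" is to gate memory-hungry tests on the memory actually available on the machine. The sketch below only illustrates that idea in Python. It is not the fix applied in the repo, where the real change would live in the C# test source. The helper names and the 8 GiB threshold are assumptions, and it relies on the `MemAvailable:` field of Linux's `/proc/meminfo`.

```python
def mem_available_kb(meminfo_text):
    """Return the MemAvailable value (in kB) from /proc/meminfo text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line looks like: "MemAvailable:    8123456 kB"
            return int(line.split()[1])
    return None


def should_run_memory_heavy_test(required_kb, meminfo_text):
    """Decide whether a memory-heavy test should run on this machine.

    If the available memory cannot be determined, run the test anyway so
    coverage is not silently lost (a design choice, not a requirement).
    """
    available = mem_available_kb(meminfo_text)
    return available is None or available >= required_kb


def read_meminfo(path="/proc/meminfo"):
    """Read the meminfo file; Linux only."""
    with open(path, encoding="ascii") as f:
        return f.read()


if __name__ == "__main__":
    required_kb = 8 * 1024 * 1024  # hypothetical requirement: 8 GiB
    print(should_run_memory_heavy_test(required_kb, read_meminfo()))
```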

@ericstj ericstj modified the milestones: 8.0.0, 9.0.0 Aug 14, 2023
@ericstj ericstj added the test-bug Problem in test source code (most likely) label Aug 14, 2023
@jeffhandley jeffhandley modified the milestones: 9.0.0, Future Jul 28, 2024