roachtest: tpccbench/nodes=6/cpu=16/multi-az failed #61189
(roachtest).tpccbench/nodes=6/cpu=16/multi-az failed on master@9595a158f0233e1c3d86786ec4462dd39c7beb20:
Artifacts: /tpccbench/nodes=6/cpu=16/multi-az
See this test on roachdash |
(roachtest).tpccbench/nodes=6/cpu=16/multi-az failed on master@7d1324fa42732f482329a524b0166db8dd7365e6:
Artifacts: /tpccbench/nodes=6/cpu=16/multi-az
See this test on roachdash |
(roachtest).tpccbench/nodes=6/cpu=16/multi-az failed on master@9ba48738bc511ad6954682cab41e23b8492facd8:
Artifacts: /tpccbench/nodes=6/cpu=16/multi-az
See this test on roachdash |
(roachtest).tpccbench/nodes=6/cpu=16/multi-az failed on master@6de4313ec216161c79fe725fcc31fc87ef1804ea:
Artifacts: /tpccbench/nodes=6/cpu=16/multi-az
See this test on roachdash |
(roachtest).tpccbench/nodes=6/cpu=16/multi-az failed on master@b703e663da8ededaee2e28fc39a24e3880ae54cf:
Artifacts: /tpccbench/nodes=6/cpu=16/multi-az
See this test on roachdash |
(roachtest).tpccbench/nodes=6/cpu=16/multi-az failed on master@15a185606d5e80b47d9fdd0ed4f54cfe29c527c6:
Artifacts: /tpccbench/nodes=6/cpu=16/multi-az
See this test on roachdash |
(roachtest).tpccbench/nodes=6/cpu=16/multi-az failed on master@a69e6549a71f5a0e83eb13509001f4d7351050fb:
Artifacts: /tpccbench/nodes=6/cpu=16/multi-az
See this test on roachdash |
Logs look pretty unhappy: slow latches, waiting for pushees for >60s, and ultimately an ill-timed death that I can't attribute to anything specific but will assume is related to an OOM. @nvanbenschoten is this a beta/release blocker? I know that these tests have been unstable for a long time. If the issue is that they are overloading nodes, can we change the line search algorithm so that it approaches the target value more conservatively, from below? |
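To make the "approach from below" suggestion concrete, here is a minimal sketch in Go. It is not the search code roachtest actually uses; `searchFromBelow`, its parameters, and the stand-in `passes` predicate are all hypothetical names introduced for illustration. The idea is to step the load upward in fixed increments and stop at the first failing run, so the cluster is never driven far past what it can sustain.

```go
package main

import "fmt"

// searchFromBelow is a hypothetical, simplified bottom-up search.
// It starts at `start`, raises the load by `step` on every passing run,
// and stops at the first failure or once `limit` would be exceeded.
// It returns the highest load that passed, or 0 if none did.
func searchFromBelow(start, step, limit int, passes func(load int) bool) int {
	if step <= 0 {
		// A non-positive step would never terminate; treat it as "no result".
		return 0
	}
	best := 0
	for load := start; load <= limit; load += step {
		if !passes(load) {
			// Stop immediately: never probe above a known failing load.
			break
		}
		best = load
	}
	return best
}

func main() {
	// Stand-in predicate (assumption): pretend the cluster sustains
	// up to 2300 units of load. A real run would execute the workload.
	passes := func(load int) bool { return load <= 2300 }
	fmt.Println(searchFromBelow(1000, 250, 5000, passes)) // prints 2250
}
```

The trade-off is more iterations than a search that jumps ahead and backs off, in exchange for at most one overloaded run per search.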
Dup #59424. |
For this build:
n5 disappears around 14:44:22. Other nodes pick up on it soon enough.
n5's allocated memory is pretty close to the 13G limit that precipitates the OOM killer I've seen elsewhere (#59424).
So I'm guessing it's an OOM? What's confusing to me is that there's no mention of the oom-killer in n5's dmesg.txt. @tbg, @nvanbenschoten: Is that possible? Memstats pre-crash (?) here. Heap-profiles here and here (unfortunately the most recent one is 5m before we crashed). Last goroutine dump here (6m before crash). |
(roachtest).tpccbench/nodes=6/cpu=16/multi-az failed on master@6601d827b814d4e85a1081b03bf2562d8ac2a4ab:
Artifacts: /tpccbench/nodes=6/cpu=16/multi-az
Related:
roachtest: tpccbench/nodes=6/cpu=16/multi-az failed #59044 (C-test-failure, O-roachtest, O-robot, branch-release-20.1)
roachtest: tpccbench/nodes=6/cpu=16/multi-az failed #58641 (C-test-failure, O-roachtest, O-robot, branch-release-20.2)
roachtest: tpccbench/nodes=6/cpu=16/multi-az failed #55544 (C-test-failure, O-roachtest, O-robot, branch-release-19.2)
See this test on roachdash
powered by pkg/cmd/internal/issues