roachtest: tpccbench/nodes=3/cpu=16 failed [overload] #62244

Closed
cockroach-teamcity opened this issue Mar 19, 2021 · 5 comments
Labels:
C-test-failure: Broken test (automatically or manually discovered).
O-roachtest
O-robot: Originated from a bot.

Comments

@cockroach-teamcity
Member

(roachtest).tpccbench/nodes=3/cpu=16 failed on release-21.1@cba235b11752e2f1e1c6dfee7ac55c514f4ea930:

		  |   | main.main
		  |   | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1852
		  |   | runtime.main
		  |   | 	/usr/local/go/src/runtime/proc.go:204
		  |   | runtime.goexit
		  |   | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		  | Wraps: (2) 3: dead
		  | Error types: (1) *withstack.withStack (2) *errutil.leafError
		Wraps: (3) secondary error attachment
		  | 1: dead
		  | (1) attached stack trace
		  |   -- stack trace:
		  |   | main.glob..func14
		  |   | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1147
		  |   | main.wrap.func1
		  |   | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:271
		  |   | github.com/spf13/cobra.(*Command).execute
		  |   | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:830
		  |   | github.com/spf13/cobra.(*Command).ExecuteC
		  |   | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:914
		  |   | github.com/spf13/cobra.(*Command).Execute
		  |   | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:864
		  |   | main.main
		  |   | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1852
		  |   | runtime.main
		  |   | 	/usr/local/go/src/runtime/proc.go:204
		  |   | runtime.goexit
		  |   | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		  | Wraps: (2) 1: dead
		  | Error types: (1) *withstack.withStack (2) *errutil.leafError
		Wraps: (4) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1147
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:271
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1852
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:204
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (5) 2: dead
		Error types: (1) errors.Unclassified (2) *secondary.withSecondaryError (3) *secondary.withSecondaryError (4) *withstack.withStack (5) *errutil.leafError
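
The `N: dead` entries in the trace above appear to be per-node statuses reported by roachprod, which would suggest that nodes 1, 2, and 3 were all down when the error was assembled. Below is a minimal Go sketch for pulling those node numbers out of captured output. The `deadNodes` helper and the sample string are hypothetical and exist only for illustration; they are not part of roachprod or roachtest.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// deadRE matches entries such as "3: dead" anywhere in a line.
// Assumption: the number before the colon is the node index.
var deadRE = regexp.MustCompile(`(\d+): dead`)

// deadNodes returns the sorted, de-duplicated node indexes that
// appear in "N: dead" entries of the given output.
func deadNodes(output string) []int {
	seen := map[int]bool{}
	for _, m := range deadRE.FindAllStringSubmatch(output, -1) {
		n, err := strconv.Atoi(m[1])
		if err != nil {
			continue // skip values that do not fit in an int
		}
		seen[n] = true
	}
	nodes := make([]int, 0, len(seen))
	for n := range seen {
		nodes = append(nodes, n)
	}
	sort.Ints(nodes)
	return nodes
}

func main() {
	// Lines excerpted from the wrapped error in this report.
	trace := "Wraps: (2) 3: dead\n  | 1: dead\nWraps: (2) 1: dead\nWraps: (5) 2: dead\n"
	fmt.Println(deadNodes(trace)) // prints [1 2 3]
}
```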


Artifacts: /tpccbench/nodes=3/cpu=16
Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity cockroach-teamcity added the branch-release-21.1, C-test-failure, O-roachtest, O-robot, and release-blocker labels Mar 19, 2021
@cockroach-teamcity
Member Author

(roachtest).tpccbench/nodes=3/cpu=16 failed on release-21.1@2bdb62260a178e5bb63cf15f704944c5384f4347:

The test failed on branch=release-21.1, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpccbench/nodes=3/cpu=16/run_1
	cluster.go:2220,tpcc.go:807,search.go:43,search.go:173,tpcc.go:803,tpcc.go:617,test_runner.go:767: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod stop teamcity-2798738-1616393020-27-n4cpu16:1-3 returned: exit status 1
		(1) /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod stop teamcity-2798738-1616393020-27-n4cpu16:1-3 returned
		  | stderr:
		  |
		  | stdout:
		  | teamcity-2798738-1616393020-27-n4cpu16: stopping and waiting....................................................................................................................................................................................................................................................................................................................................................................
		  | 2: exit status 255: 
		  | I210322 08:34:36.428445 1 (gostd) cluster_synced.go:1732  [-] 1  command failed
		Wraps: (2) exit status 1
		Error types: (1) *main.withCommandDetails (2) *exec.ExitError


Artifacts: /tpccbench/nodes=3/cpu=16
Related:

See this test on roachdash
powered by pkg/cmd/internal/issues

@nvanbenschoten
Member

In the first failure, we see that Cockroach OOMed:

[12721.799898] Out of memory: Kill process 16607 (cockroach) score 956 or sacrifice child
[12721.808178] Killed process 16607 (cockroach) total-vm:21204016kB, anon-rss:14081048kB, file-rss:0kB, shmem-rss:0kB
[12722.734336] oom_reaper: reaped process 16607 (cockroach), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

The second looks more similar to what we were seeing in #61973.

@irfansharif can I ask you to continue investigating this? Something still looks off here and we'll need to stabilize this roachtest before a release can go out the door.
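
To put a number on the OOM in the first failure, the kernel's `Killed process` line can be parsed and its kB figures converted to GiB. The following is a minimal Go sketch, assuming the log format matches the excerpt quoted in this comment; `parseOOMKill`, `oomKill`, and `kbToGiB` are hypothetical names introduced for illustration and are not part of CockroachDB or roachtest.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// oomKill holds the fields of interest from a kernel "Killed process" line.
type oomKill struct {
	PID       int
	Command   string
	TotalVMKB int64
	AnonRSSKB int64
}

// killedRE matches lines shaped like the dmesg excerpt in this issue:
// "Killed process <pid> (<cmd>) total-vm:<n>kB, anon-rss:<n>kB, ..."
var killedRE = regexp.MustCompile(
	`Killed process (\d+) \(([^)]+)\) total-vm:(\d+)kB, anon-rss:(\d+)kB`)

// parseOOMKill extracts the PID, command name, and memory figures from
// one log line. The boolean is false when the line is not a
// "Killed process" record or a number fails to parse.
func parseOOMKill(line string) (oomKill, bool) {
	m := killedRE.FindStringSubmatch(line)
	if m == nil {
		return oomKill{}, false
	}
	pid, err := strconv.Atoi(m[1])
	if err != nil {
		return oomKill{}, false
	}
	vm, err := strconv.ParseInt(m[3], 10, 64)
	if err != nil {
		return oomKill{}, false
	}
	rss, err := strconv.ParseInt(m[4], 10, 64)
	if err != nil {
		return oomKill{}, false
	}
	return oomKill{PID: pid, Command: m[2], TotalVMKB: vm, AnonRSSKB: rss}, true
}

// kbToGiB converts kernel-reported kB (units of 1024 bytes) to GiB.
func kbToGiB(kb int64) float64 {
	return float64(kb) / (1024 * 1024)
}

func main() {
	line := "[12721.808178] Killed process 16607 (cockroach) " +
		"total-vm:21204016kB, anon-rss:14081048kB, file-rss:0kB, shmem-rss:0kB"
	if k, ok := parseOOMKill(line); ok {
		// prints: pid=16607 cmd=cockroach anon-rss=13.4 GiB total-vm=20.2 GiB
		fmt.Printf("pid=%d cmd=%s anon-rss=%.1f GiB total-vm=%.1f GiB\n",
			k.PID, k.Command, kbToGiB(k.AnonRSSKB), kbToGiB(k.TotalVMKB))
	}
}
```

For the line quoted above, this works out to roughly 13.4 GiB of anonymous resident memory held by the cockroach process at the time it was killed.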

@tbg
Member

tbg commented Mar 22, 2021

@irfansharif is looking into the tracing perf gap, which is also valuable, so I can take a stab at this one instead.

@tbg tbg assigned tbg and unassigned irfansharif Mar 22, 2021
@tbg
Member

tbg commented Mar 22, 2021

> The second looks more similar to what we were seeing in #61973.

What do you mean by that exactly?

@irfansharif
Contributor

#62145 (comment): yeah, unfortunately #62039 was still not good enough.

@JuanLeon1 JuanLeon1 removed the release-blocker label Mar 22, 2021
@tbg tbg assigned tbg and unassigned tbg Mar 24, 2021
@tbg tbg changed the title from "roachtest: tpccbench/nodes=3/cpu=16 failed" to "roachtest: tpccbench/nodes=3/cpu=16 failed [overload]" Mar 29, 2021
@tbg tbg removed the GA-blocker label Mar 29, 2021
@tbg tbg closed this as completed Apr 20, 2021