roachtest: tpcdsvec failed #62301

Closed
cockroach-teamcity opened this issue Mar 20, 2021 · 7 comments · Fixed by #63900
Labels: branch-master (Failures and bugs on the master branch), C-test-failure (Broken test, automatically or manually discovered), O-roachtest, O-robot (Originated from a bot)

@cockroach-teamcity
Member

(roachtest).tpcdsvec failed on master@3d19b2cf6b290a152b23722fc32e995eed3b437b:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpcdsvec/run_1
	tpcdsvec.go:89,tpcdsvec.go:133,tpcdsvec.go:174,test_runner.go:768: pgx conn: dial tcp 34.70.106.7:26257: connect: connection refused
		(1) attached stack trace
		  -- stack trace:
		  | github.com/cockroachdb/cockroach/pkg/cmd/cmpconn.NewConn
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/cmpconn/conn.go:114
		  | main.registerTPCDSVec.func1.1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:85
		  | main.registerTPCDSVec.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:133
		  | main.registerTPCDSVec.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:174
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:768
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (2) pgx conn
		Wraps: (3) dial tcp 34.70.106.7:26257
		Wraps: (4) connect
		Wraps: (5) connection refused
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *net.OpError (4) *os.SyscallError (5) syscall.Errno

	cluster.go:1667,context.go:140,cluster.go:1656,test_runner.go:849: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2795641-1616219975-22-n3cpu4 --oneshot --ignore-empty-nodes: exit status 1 2: 20509
		3: 20839
		1: dead
		Error: UNCLASSIFIED_PROBLEM: 1: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1147
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:271
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1852
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:204
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (3) 1: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError

Artifacts: /tpcdsvec

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity added the branch-master, C-test-failure, O-roachtest, O-robot, and release-blocker labels on Mar 20, 2021
@yuzefovich
Member

This is an OOM on node 1 while executing Q39 with stats collected. It needs more investigation, but a cursory look at the heap profile points to a known limitation: when executing subqueries, we fully buffer their results in memory.
[Screenshot: heap profile, 2021-03-22 10:24:50 PM]

@cockroach-teamcity
Member Author

(roachtest).tpcdsvec failed on master@53bf501e233c337b9863755914d9c00010517329:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpcdsvec/run_1
	tpcdsvec.go:89,tpcdsvec.go:133,tpcdsvec.go:174,test_runner.go:768: pgx conn: dial tcp 34.70.105.87:26257: connect: connection refused
		(1) attached stack trace
		  -- stack trace:
		  | github.com/cockroachdb/cockroach/pkg/cmd/cmpconn.NewConn
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/cmpconn/conn.go:114
		  | main.registerTPCDSVec.func1.1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:85
		  | main.registerTPCDSVec.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:133
		  | main.registerTPCDSVec.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:174
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:768
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (2) pgx conn
		Wraps: (3) dial tcp 34.70.105.87:26257
		Wraps: (4) connect
		Wraps: (5) connection refused
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *net.OpError (4) *os.SyscallError (5) syscall.Errno

	cluster.go:1667,context.go:140,cluster.go:1656,test_runner.go:849: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2802936-1616478847-22-n3cpu4 --oneshot --ignore-empty-nodes: exit status 1 2: 15863
		3: 16394
		1: dead
		Error: UNCLASSIFIED_PROBLEM: 1: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1147
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:271
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1852
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:204
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (3) 1: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
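
The "dead node detection" output above lists one status per node, either a process ID (`2: 15863`) or the word `dead` (`1: dead`). As a rough sketch, assuming only the line format visible in this report, the following Go function extracts the dead node numbers from such text. `deadNodes` is a hypothetical helper, not a roachprod or roachtest API, and the real tooling may parse its output differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// deadNodes scans lines of the form "<node>: <pid>" or "<node>: dead" and
// returns the node numbers reported as dead. Lines that do not match that
// shape, such as the trailing "Error: ..." summary, are ignored.
func deadNodes(output string) []int {
	var dead []int
	for _, line := range strings.Split(output, "\n") {
		node, status, ok := strings.Cut(strings.TrimSpace(line), ": ")
		if !ok || status != "dead" {
			continue
		}
		n, err := strconv.Atoi(node)
		if err != nil {
			continue // the part before the colon is not a node number
		}
		dead = append(dead, n)
	}
	return dead
}

func main() {
	// Sample text copied from the monitor output in the report above.
	sample := "2: 15863\n3: 16394\n1: dead\nError: UNCLASSIFIED_PROBLEM: 1: dead"
	fmt.Println(deadNodes(sample))
}
```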

Artifacts: /tpcdsvec

@yuzefovich removed the release-blocker label on Mar 25, 2021
@cockroach-teamcity
Member Author

(roachtest).tpcdsvec failed on master@8b137b4f068a0d590a3e86ae91fd60eb84f2750a:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpcdsvec/run_1
	tpcdsvec.go:89,tpcdsvec.go:133,tpcdsvec.go:174,test_runner.go:768: pgx conn: dial tcp 34.121.237.224:26257: connect: connection refused
		(1) attached stack trace
		  -- stack trace:
		  | github.com/cockroachdb/cockroach/pkg/cmd/cmpconn.NewConn
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/cmpconn/conn.go:114
		  | main.registerTPCDSVec.func1.1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:85
		  | main.registerTPCDSVec.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:133
		  | main.registerTPCDSVec.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:174
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:768
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (2) pgx conn
		Wraps: (3) dial tcp 34.121.237.224:26257
		Wraps: (4) connect
		Wraps: (5) connection refused
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *net.OpError (4) *os.SyscallError (5) syscall.Errno

	cluster.go:1667,context.go:140,cluster.go:1656,test_runner.go:849: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2816340-1616738703-23-n3cpu4 --oneshot --ignore-empty-nodes: exit status 1 2: 18520
		3: 18738
		1: dead
		Error: UNCLASSIFIED_PROBLEM: 1: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1147
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:271
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1852
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:204
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (3) 1: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError

Artifacts: /tpcdsvec

@cockroach-teamcity
Member Author

(roachtest).tpcdsvec failed on master@d891594d3c998f153b88f631e3c89ac7d12c2a6e:

						JOIN date_dim ON ss_sold_date_sk = d_date_sk
					WHERE
						sr_ticket_number IS NULL
					GROUP BY
						d_year, ss_item_sk, ss_customer_sk
				)
		SELECT
			ss_customer_sk,
			round(
				ss_qty
				/ (COALESCE(ws_qty, 0) + COALESCE(cs_qty, 0)),
				2
			)
				AS ratio,
			ss_qty AS store_qty,
			ss_wc AS store_wholesale_cost,
			ss_sp AS store_sales_price,
			COALESCE(ws_qty, 0) + COALESCE(cs_qty, 0)
				AS other_chan_qty,
			COALESCE(ws_wc, 0) + COALESCE(cs_wc, 0)
				AS other_chan_wholesale_cost,
			COALESCE(ws_sp, 0) + COALESCE(cs_sp, 0)
				AS other_chan_sales_price
		FROM
			ss
			LEFT JOIN ws ON
					ws_sold_year = ss_sold_year
					AND ws_item_sk = ss_item_sk
					AND ws_customer_sk = ss_customer_sk
			LEFT JOIN cs ON
					cs_sold_year = ss_sold_year
					AND cs_item_sk = ss_item_sk
					AND cs_customer_sk = ss_customer_sk
		WHERE
			(COALESCE(ws_qty, 0) > 0 OR COALESCE(cs_qty, 0) > 0)
			AND ss_sold_year = 1998
		ORDER BY
			ss_customer_sk,
			ss_qty DESC,
			ss_wc DESC,
			ss_sp DESC,
			other_chan_qty,
			other_chan_wholesale_cost,
			other_chan_sales_price,
			ratio
		LIMIT
			100;
		;
		
		vectorize=ON: [same as previous]

Artifacts: /tpcdsvec

@yuzefovich self-assigned this on Mar 30, 2021
@yuzefovich
Member

The last instance is a duplicate of #62520.

I want to get back to this issue but am pretty busy at the moment. I believe it is caused by a known limitation (#62674). So far it has occurred only three times, all on master, so it may be due to changes that have not been backported to the release-21.1 branch. Putting it into the backlog for now.

@cockroach-teamcity
Member Author

(roachtest).tpcdsvec failed on master@ed698aecdf0715c4edb91a9617bcc5df45f7ccde:

						JOIN date_dim ON ss_sold_date_sk = d_date_sk
					WHERE
						sr_ticket_number IS NULL
					GROUP BY
						d_year, ss_item_sk, ss_customer_sk
				)
		SELECT
			ss_customer_sk,
			round(
				ss_qty
				/ (COALESCE(ws_qty, 0) + COALESCE(cs_qty, 0)),
				2
			)
				AS ratio,
			ss_qty AS store_qty,
			ss_wc AS store_wholesale_cost,
			ss_sp AS store_sales_price,
			COALESCE(ws_qty, 0) + COALESCE(cs_qty, 0)
				AS other_chan_qty,
			COALESCE(ws_wc, 0) + COALESCE(cs_wc, 0)
				AS other_chan_wholesale_cost,
			COALESCE(ws_sp, 0) + COALESCE(cs_sp, 0)
				AS other_chan_sales_price
		FROM
			ss
			LEFT JOIN ws ON
					ws_sold_year = ss_sold_year
					AND ws_item_sk = ss_item_sk
					AND ws_customer_sk = ss_customer_sk
			LEFT JOIN cs ON
					cs_sold_year = ss_sold_year
					AND cs_item_sk = ss_item_sk
					AND cs_customer_sk = ss_customer_sk
		WHERE
			(COALESCE(ws_qty, 0) > 0 OR COALESCE(cs_qty, 0) > 0)
			AND ss_sold_year = 1998
		ORDER BY
			ss_customer_sk,
			ss_qty DESC,
			ss_wc DESC,
			ss_sp DESC,
			other_chan_qty,
			other_chan_wholesale_cost,
			other_chan_sales_price,
			ratio
		LIMIT
			100;
		;
		
		vectorize=ON: [same as previous]

Artifacts: /tpcdsvec

@cockroach-teamcity
Member Author

roachtest.tpcdsvec failed with artifacts on master @ 4dc05cceea254854317a9374a4b88df06c0946a6:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpcdsvec/run_1
	tpcdsvec.go:89,tpcdsvec.go:133,tpcdsvec.go:174,test_runner.go:777: pgx conn: dial tcp 34.71.14.111:26257: connect: connection refused
		(1) attached stack trace
		  -- stack trace:
		  | github.com/cockroachdb/cockroach/pkg/cmd/cmpconn.NewConn
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/cmpconn/conn.go:113
		  | main.registerTPCDSVec.func1.1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:85
		  | main.registerTPCDSVec.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:133
		  | main.registerTPCDSVec.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tpcdsvec.go:174
		  | main.(*testRunner).runTest.func2
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachtest/test_runner.go:777
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (2) pgx conn
		Wraps: (3) dial tcp 34.71.14.111:26257
		Wraps: (4) connect
		Wraps: (5) connection refused
		Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *net.OpError (4) *os.SyscallError (5) syscall.Errno

	cluster.go:1688,context.go:140,cluster.go:1677,test_runner.go:858: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2898497-1618858605-22-n3cpu4 --oneshot --ignore-empty-nodes: exit status 1 3: 20944
		2: 20499
		1: dead
		Error: UNCLASSIFIED_PROBLEM: 1: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1159
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:283
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:2054
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:204
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1374
		Wraps: (3) 1: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError
Reproduce

To reproduce, try:

# From https://go.crdb.dev/p/roachstress, perhaps edited lightly.
caffeinate ./roachstress.sh tpcdsvec

/cc @cockroachdb/sql-queries

This test on roachdash | Improve this report!
