roachtest: sqlsmith/setup=empty/setting=default failed #47541

Closed
cockroach-teamcity opened this issue Apr 15, 2020 · 8 comments

@cockroach-teamcity
Member

(roachtest).sqlsmith/setup=empty/setting=default failed on release-20.1@575a5bd7a8714464a74f52f337dc5af33558ebb0:

The test failed on branch=release-20.1, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/sqlsmith/setup=empty/setting=default/run_1
	sqlsmith.go:185,sqlsmith.go:199,test_runner.go:753: ping: dial tcp 34.71.101.82:26257: connect: connection refused
		previous sql:
		DELETE FROM defaultdb.public.tab_220 AS tab_471;

	cluster.go:1420,context.go:135,cluster.go:1409,test_runner.go:825: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-1878596-1586986323-08-n4cpu4 --oneshot --ignore-empty-nodes: exit status 1 3: 5718
		4: 5024
		1: dead
		2: 5737
		Error: UNCLASSIFIED_PROBLEM:
		  - 1: dead
		    main.glob..func13
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1129
		    main.wrap.func1
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:272
		    github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).execute
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:766
		    github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).ExecuteC
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:852
		    github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra.(*Command).Execute
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/vendor/github.com/spf13/cobra/command.go:800
		    main.main
		    	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1793
		    runtime.main
		    	/usr/local/go/src/runtime/proc.go:203
		    runtime.goexit
		    	/usr/local/go/src/runtime/asm_amd64.s:1357

Artifacts: /sqlsmith/setup=empty/setting=default

See this test on roachdash
powered by pkg/cmd/internal/issues

@cockroach-teamcity cockroach-teamcity added branch-release-20.1 C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Apr 15, 2020
@cockroach-teamcity cockroach-teamcity added this to the 20.1 milestone Apr 15, 2020
@yuzefovich
Member

E200415 21:38:59.393341 1865 sql/conn_executor.go:796  [n1,client=34.73.146.245:48220,hostnossl,user=root] a SQL panic has occurred while executing "DELETE FROM defaultdb.public.tab_220 AS tab_471": runtime error: index out of range [6] with length 6
E200415 21:38:59.393420 1865 util/log/crash_reporting.go:208  [n1,client=34.73.146.245:48220,hostnossl,user=root] a panic has occurred!
panic: runtime error: index out of range [6] with length 6 [recovered]
	panic: panic while executing 1 statements: DELETE FROM _._._ AS _; caused by runtime error: index out of range [6] with length 6

goroutine 1865 [running]:
panic(0x40fb780, 0xc0044d3110)
	/usr/local/go/src/runtime/panic.go:722 +0x2c2 fp=0xc003034bb0 sp=0xc003034b20 pc=0x79a4b2
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).closeWrapper(0xc000a60000, 0x4bb7960, 0xc00381ba40, 0x40e1f60, 0xc00035db80)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:810 +0x3f9 fp=0xc003034c58 sp=0xc003034bb0 pc=0x2c1fb69
github.com/cockroachdb/cockroach/pkg/sql.(*Server).ServeConn.func1(0xc000a60000, 0x4bb7960, 0xc00381ba40)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:474 +0x61 fp=0xc003034c90 sp=0xc003034c58 pc=0x2d77361
runtime.call32(0x0, 0x44a86f0, 0xc003037e08, 0x1800000018)
	/usr/local/go/src/runtime/asm_amd64.s:539 +0x3b fp=0xc003034cc0 sp=0xc003034c90 pc=0x7c8e0b
panic(0x40e1f60, 0xc00035db80)
	/usr/local/go/src/runtime/panic.go:679 +0x1b2 fp=0xc003034d50 sp=0xc003034cc0 pc=0x79a3a2
runtime.goPanicIndex(0x6, 0x6)
	/usr/local/go/src/runtime/panic.go:75 +0xa3 fp=0xc003034d98 sp=0xc003034d50 pc=0x798bd3
github.com/cockroachdb/cockroach/pkg/sql/sqlbase.findColumnValue(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/index_encoding.go:472
github.com/cockroachdb/cockroach/pkg/sql/sqlbase.EncodeColumns(0xc0052b4598, 0x1, 0x2, 0x0, 0x0, 0x0, 0xc0044d2870, 0xc00361f5c0, 0x6, 0x6, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/table.go:247 +0x347 fp=0xc003034e38 sp=0xc003034d98 pc=0x158bd47
github.com/cockroachdb/cockroach/pkg/sql/sqlbase.EncodeSecondaryIndex(0xc004b0e000, 0xc00381f000, 0xc0044d2870, 0xc00361f5c0, 0x6, 0x6, 0x4bb7901, 0xc004108380, 0xc0000f0f20, 0x7, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/index_encoding.go:1029 +0x1a6 fp=0xc003035208 sp=0xc003034e38 pc=0x151ef06
github.com/cockroachdb/cockroach/pkg/sql/row.(*Deleter).DeleteRow(0xc00414c7c8, 0x4bb7960, 0xc004108380, 0xc003d7b180, 0xc00361f5c0, 0x6, 0x6, 0xc003670001, 0x4, 0x9f)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/deleter.go:161 +0x105 fp=0xc003035358 sp=0xc003035208 pc=0x20161f5
github.com/cockroachdb/cockroach/pkg/sql.(*tableDeleter).row(...)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/tablewriter_delete.go:66
github.com/cockroachdb/cockroach/pkg/sql.(*deleteNode).processSourceRow(0xc00414c780, 0x4bb7960, 0xc004108380, 0xc0041e5180, 0xc000a603d0, 0xc00361f5c0, 0x6, 0x6, 0x4092940, 0x3f13401)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/delete.go:173 +0x8e fp=0xc0030353c8 sp=0xc003035358 pc=0x2c6007e
github.com/cockroachdb/cockroach/pkg/sql.(*deleteNode).BatchedNext(0xc00414c780, 0x4bb7960, 0xc004108380, 0xc0041e5180, 0xc000a603d0, 0x3f13400, 0xc003035401, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/sql/delete.go:126 +0x150 fp=0xc003035430 sp=0xc0030353c8 pc=0x2c5fd90
...
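
The frames above pass a row slice of length 6 (`0xc00361f5c0, 0x6, 0x6`) down to `findColumnValue`, which then indexes position 6. A minimal sketch of that failure pattern follows, assuming the lookup takes an ordinal from a column map and indexes the row with it; the thread does not confirm the actual implementation. `ColumnID`, `Datum`, `findValueUnchecked`, and `findValueChecked` are hypothetical stand-ins, not CockroachDB APIs.

```go
package main

import "fmt"

// ColumnID and Datum are hypothetical stand-ins for the real column and
// datum types; they exist only to keep this sketch self-contained.
type ColumnID int
type Datum interface{}

// findValueUnchecked trusts that every ordinal stored in colMap is a valid
// index into values. If the map and the row disagree, it panics.
func findValueUnchecked(col ColumnID, colMap map[ColumnID]int, values []Datum) Datum {
	if i, ok := colMap[col]; ok {
		return values[i] // panics when i >= len(values)
	}
	return nil
}

// findValueChecked performs the same lookup but reports a mismatch between
// the column map and the row as an error instead of panicking.
func findValueChecked(col ColumnID, colMap map[ColumnID]int, values []Datum) (Datum, error) {
	i, ok := colMap[col]
	if !ok {
		return nil, nil
	}
	if i < 0 || i >= len(values) {
		return nil, fmt.Errorf("column %d maps to ordinal %d, but the row has only %d values",
			col, i, len(values))
	}
	return values[i], nil
}

func main() {
	values := make([]Datum, 6)       // row with 6 values, as in the trace
	colMap := map[ColumnID]int{7: 6} // assumed: an index column mapped past the end

	if _, err := findValueChecked(7, colMap, values); err != nil {
		fmt.Println("checked lookup:", err)
	}

	// The unchecked lookup panics; recover so the message can be printed.
	defer func() { fmt.Println("unchecked lookup:", recover()) }()
	findValueUnchecked(7, colMap, values)
}
```

If this assumption holds, the question for a repro is how a secondary index ends up referencing a column that the DELETE's source row does not carry.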

@yuzefovich
Member

yuzefovich commented Apr 15, 2020

I tried to reproduce this problem by replaying the sqlsmith log on the given commit (with sql_safe_updates set to both false and true), but I could not trigger the crash.

The logs of the crashed node contain a bunch of messages like

E200415 21:37:57.672332 3978 storage/cloud/http_storage.go:199  [n1,client=34.73.146.245:48220,hostnossl,user=root] HTTP:Req error: err=retryable http error: Get http://127.0.0.1:43059/backup_11/BACKUP: dial tcp 127.0.0.1:43059: connect: connection refused (attempt 0)

and the first one occurs about 62 seconds before the crash. Not sure whether it matters though.

@rohany could you take a quick look at the stack trace and this issue since you updated some code in here recently?

@rohany rohany self-assigned this Apr 16, 2020
@rohany
Contributor

rohany commented Apr 16, 2020

I can't repro it either, but the stack in the logs looks like the culprit. I'll take a closer look.

@dt
Member

dt commented Sep 3, 2020

Inactive.

@dt dt closed this as completed Sep 3, 2020
@dpkirchner

This crash doesn't seem related to the referenced issues. Was it actually fixed in some version, or is it tracked in another issue?

@yuzefovich
Member

@dpkirchner I agree that the referenced issues seem unrelated. I'm assuming that you hit a problem on a DELETE query with a similar stack trace? If so, please feel free to open another issue with reproduction steps if possible.

@dpkirchner

@yuzefovich Yeah, similar stack trace. I'll need to upgrade to the latest version before I open an issue, but if I hit it again I definitely will and will reference this one.

@dpkirchner

@yuzefovich The bug is still present in the latest version. I'll try to come up with a repro case so I can submit it as an issue. The stack trace I get matches the one above (at least the part that's visible here), so I guess that's a good place to start.

I was able to "fix" this by removing some indices from the table, so at least production nodes aren't crashing any more. Unfortunately this may mean I can't provide any information at all.
