roachtest: decommission/nodes=4/duration=1h0m0s failed #55576

Closed
cockroach-teamcity opened this issue Oct 15, 2020 · 5 comments · Fixed by #55734

Comments

@cockroach-teamcity
Member

(roachtest).decommission/nodes=4/duration=1h0m0s failed on master@80e7127197f76ef35c1f6ec3984c4d49d4afde7f:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/decommission/nodes=4/duration=1h0m0s/run_1
	decommission.go:288,decommission.go:48,test_runner.go:755: read tcp 172.17.0.3:59240->35.202.16.50:26257: read: connection reset by peer

	cluster.go:1657,context.go:135,cluster.go:1646,test_runner.go:836: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2366018-1602742581-03-n4cpu4 --oneshot --ignore-empty-nodes: exit status 1 4: dead
		3: 4285
		2: 4547
		1: 4539
		Error: UNCLASSIFIED_PROBLEM: 4: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1143
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:267
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1839
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:203
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1357
		Wraps: (3) 4: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError

Artifacts: /decommission/nodes=4/duration=1h0m0s

See this test on roachdash
powered by pkg/cmd/internal/issues

cockroach-teamcity added the branch-master, C-test-failure, O-roachtest, O-robot, and release-blocker labels Oct 15, 2020
cockroach-teamcity added this to the 20.2 milestone Oct 15, 2020
@cockroach-teamcity
Member Author

(roachtest).decommission/nodes=4/duration=1h0m0s failed on master@47044feed11ec0c0390989bf8f44e777ec3eb00d:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/decommission/nodes=4/duration=1h0m0s/run_1
	decommission.go:288,decommission.go:48,test_runner.go:755: read tcp 172.17.0.3:59912->35.226.254.141:26257: read: connection reset by peer

	cluster.go:1657,context.go:135,cluster.go:1646,test_runner.go:836: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2368807-1602828776-13-n4cpu4 --oneshot --ignore-empty-nodes: exit status 1 2: 4500
		4: dead
		3: 4216
		1: 4593
		Error: UNCLASSIFIED_PROBLEM: 4: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1143
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:267
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1839
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:203
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1357
		Wraps: (3) 4: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError

Artifacts: /decommission/nodes=4/duration=1h0m0s

@cockroach-teamcity
Member Author

(roachtest).decommission/nodes=4/duration=1h0m0s failed on master@b1abf9c8dfb5880fce69dfc7240e593f077bf77c:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/decommission/nodes=4/duration=1h0m0s/run_1
	decommission.go:288,decommission.go:48,test_runner.go:755: read tcp 172.17.0.3:48804->34.121.89.140:26257: read: connection reset by peer

	cluster.go:1657,context.go:135,cluster.go:1646,test_runner.go:836: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2371613-1602915236-14-n4cpu4 --oneshot --ignore-empty-nodes: exit status 1 4: dead
		3: 4595
		1: 4805
		2: 4306
		Error: UNCLASSIFIED_PROBLEM: 4: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1143
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:267
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1839
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:203
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1357
		Wraps: (3) 4: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError

Artifacts: /decommission/nodes=4/duration=1h0m0s

@cockroach-teamcity
Member Author

(roachtest).decommission/nodes=4/duration=1h0m0s failed on master@d752fa2bd9afad255e8c655de9c7edc6dad14486:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/decommission/nodes=4/duration=1h0m0s/run_1
	decommission.go:288,decommission.go:48,test_runner.go:755: read tcp 172.17.0.3:52568->35.238.74.210:26257: read: connection reset by peer

	cluster.go:1657,context.go:135,cluster.go:1646,test_runner.go:836: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2372996-1603001519-06-n4cpu4 --oneshot --ignore-empty-nodes: exit status 1 4: dead
		3: 4145
		2: 4329
		1: 5129
		Error: UNCLASSIFIED_PROBLEM: 4: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1143
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:267
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1839
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:203
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1357
		Wraps: (3) 4: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError

Artifacts: /decommission/nodes=4/duration=1h0m0s

@cockroach-teamcity
Member Author

(roachtest).decommission/nodes=4/duration=1h0m0s failed on master@ab503e2fd708541e5e9ebb9a6f2651eda506f2ef:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/decommission/nodes=4/duration=1h0m0s/run_1
	decommission.go:288,decommission.go:48,test_runner.go:755: read tcp 172.17.0.3:58678->34.121.107.159:26257: read: connection reset by peer

	cluster.go:1657,context.go:135,cluster.go:1646,test_runner.go:836: dead node detection: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod monitor teamcity-2374262-1603088202-14-n4cpu4 --oneshot --ignore-empty-nodes: exit status 1 4: dead
		2: 4478
		1: 5014
		3: 4231
		Error: UNCLASSIFIED_PROBLEM: 4: dead
		(1) UNCLASSIFIED_PROBLEM
		Wraps: (2) attached stack trace
		  -- stack trace:
		  | main.glob..func14
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1143
		  | main.wrap.func1
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:267
		  | github.com/spf13/cobra.(*Command).execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:830
		  | github.com/spf13/cobra.(*Command).ExecuteC
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:914
		  | github.com/spf13/cobra.(*Command).Execute
		  | 	/home/agent/work/.go/pkg/mod/github.com/spf13/cobra@v0.0.5/command.go:864
		  | main.main
		  | 	/home/agent/work/.go/src/github.com/cockroachdb/cockroach/pkg/cmd/roachprod/main.go:1839
		  | runtime.main
		  | 	/usr/local/go/src/runtime/proc.go:203
		  | runtime.goexit
		  | 	/usr/local/go/src/runtime/asm_amd64.s:1357
		Wraps: (3) 4: dead
		Error types: (1) errors.Unclassified (2) *withstack.withStack (3) *errutil.leafError

Artifacts: /decommission/nodes=4/duration=1h0m0s

@tbg tbg self-assigned this Oct 20, 2020
@tbg
Member

tbg commented Oct 20, 2020

This one's on me:

E201019 06:31:50.278378 1 cli/error.go:398 ⋮ ‹ERROR: cockroach server exited with error: while trying to initialize store: engine cannot be bootstrapped, contains:›
‹[/Local/Store/nodeTombstone/4/0,0: "\x12\x04\b\x00\x10\x00\x18\x00 \x00(\x002\x0f\x00\x00\x00\x00\x03\b\x83\xbd\x84\xb4\x9a\x90ԟ\x16"]›

I shouldn't be putting node tombstones on stores that are not bootstrapped.

craig bot pushed a commit that referenced this issue Oct 20, 2020
55708: storage: Use batches for direct RocksDB mutations r=itsbilal a=itsbilal

Currently, doing direct mutations on a RocksDB instance bypasses
custom batching / syncing logic that we've built on top of it.
This, or something internal to RocksDB, started leading to some bugs
when all direct mutations started passing in WriteOptions.sync = true
(see #55240 for when this change went in).

In this change, direct mutations still commit the batch with sync=true
to guarantee WAL syncing, but they go through the batch commit pipeline
too, just like the vast majority of operations already do.

Fixes #55362.

Release note: None.

55734: server: skip uninit'ed engines in tombstone storage r=irfansharif a=tbg

Empty (i.e. uninitialized) engines could receive tombstones before
being initialized. Initialization checks that the engine is empty
(save for the cluster version) and thus failed. The fix is to simply
not write tombstones to uninitialized engines, which is fine since by
the time the callbacks fire, at least one engine is initialized anyway,
and besides, this mechanism is best effort.

The alternatives would have been to either allow tombstones to
be present on an engine that is being bootstrapped, or to give
the storage the option to defer writing to the engine until it's
bootstrapped. Neither seemed worth the extra work.

Fixes #55576.

Release note: None


55739: opt: fix normalization of st_distance when use_spheroid parameter used r=rytaft a=rytaft

This commit fixes the normalization rule that converts `st_distance` to
`st_dwithin` or `st_dwithinexclusive`, which was broken in the case when
the `use_spheroid` parameter was used. Prior to this commit, the rule was
assigning the `use_spheroid` parameter as the 3rd parameter to `st_dwithin`
or `st_dwithinexclusive` and the `distance` parameter as the 4th, but that
order does not match the function signatures. This commit fixes the issue
by assigning `distance` as the 3rd parameter and `use_spheroid` as the 4th
if it exists.

Fixes #55675

Release note (bug fix): Fixed an internal error that could occur during
query planning when the use_spheroid parameter was used in the ST_Distance
function as part of a filter predicate. For example, `SELECT ... WHERE
ST_Distance(geog1, geog2, false) < 10` previously caused an error. This
has now been fixed.

Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
@craig craig bot closed this as completed in 6523c22 Oct 20, 2020