kv/kvnemesis: TestKVNemesisMultiNode failed #119681
The failed operation was:
and the trace is:
Slack discussion here: https://cockroachlabs.slack.com/archives/C04AFSL6E3C/p1709219305351869
It looks like we're getting an unexpected cockroach/pkg/storage/pebble_mvcc_scanner.go Lines 1531 to 1552 in bac5c82
We only attempt a server side retry while holding latches once. Server side retries can happen for both We seem to be breaking the invariants described in: cockroach/pkg/kv/kvserver/replica_send.go Line 640 in 338aa9c
However, in this specific case, we're just missing a potential opportunity to perform a server-side refresh, so I don't think this qualifies as a release blocker. That being said, let's discuss this a bit more once you're back @nvanbenschoten, as you've got the most context here.
Thanks for helping debug this btw @miraradeva!
We have marked this test failure issue as stale because it has been |
Fixes cockroachdb#119681. Fixes cockroachdb#131005. Epic: none Release note: None
131093: storage: disable checkUncertainty on failOnMoreRecent in scanner r=tbg a=tbg

It was possible for reads with failOnMoreRecent to hit a ReadWithinUncertaintyIntervalError instead of the desired WriteTooOldError. This commit disables uncertainty checks when failOnMoreRecent is active, as the latter is a stronger check anyway.

Fixes #119681.
Fixes #131005.

Epic: none
Release note: None

131384: roachtest: admission-control/disk-bandwidth-limiter test improvements r=sumeerbhola a=aadityasondhi

This patch fixes a few things in this test:
- Runs the first step longer to have a fuller LSM to induce block and page cache misses to have some disk reads.
- Reduces the throughput of the foreground workload since it was causing saturation on its own.
- Assert on total bandwidth since the disk bandwidth limiter should be accounting for reads when determining tokens.

Fixes #129534.

Release note: None

131395: crosscluster/producer: modify lastEmitWait and lastProduceWait computation r=dt a=msbutler

This patch modifies the lastEmitWait and lastProduceWait in the crdb_internal.cluster_replication_node streams vtable to be either the current wait or previous wait, if the event stream is currently waiting on that given state.

Epic: none
Release note: none

Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Co-authored-by: Aaditya Sondhi <20070511+aadityasondhi@users.noreply.github.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
It was possible for reads with failOnMoreRecent to hit a ReadWithinUncertaintyIntervalError instead of the desired WriteTooOldError. This commit disables uncertainty checks when failOnMoreRecent is active, as the latter is a stronger check anyway. Fixes cockroachdb#119681. Fixes cockroachdb#131005. Epic: none Release note: None
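To make the behavior change in this commit message concrete, here is a minimal, self-contained Go sketch. It is not the code in pebble_mvcc_scanner.go: the types `scanOpts` and `outcome`, the function `classify`, and the use of plain integers as timestamps are all hypothetical stand-ins, and the sketch assumes that only versions strictly above the read timestamp count as "more recent" (the real scanner's boundary handling may differ). It only illustrates the ordering issue: if the uncertainty check runs first, a read with failOnMoreRecent can surface ReadWithinUncertaintyIntervalError, whereas skipping that check lets the stronger WriteTooOldError win.

```go
package main

import "fmt"

// outcome is a hypothetical stand-in for the error a scan would return.
type outcome string

const (
	ok                    outcome = "ok"
	writeTooOld           outcome = "WriteTooOldError"
	readWithinUncertainty outcome = "ReadWithinUncertaintyIntervalError"
)

// scanOpts is a simplified, assumed model of the scanner settings discussed
// above. Timestamps are plain integers for illustration only.
type scanOpts struct {
	readTS           int64 // timestamp the read is performed at
	uncertaintyLimit int64 // upper bound of the uncertainty interval
	failOnMoreRecent bool  // fail if a newer version exists
}

// classify decides what a read observes for a single version at versionTS.
func classify(o scanOpts, versionTS int64) outcome {
	if versionTS <= o.readTS {
		// The version is visible to the read; nothing to report.
		return ok
	}
	// Sketch of the fix: uncertainty checks are disabled whenever
	// failOnMoreRecent is set, because that check is stronger anyway.
	checkUncertainty := !o.failOnMoreRecent
	if checkUncertainty && versionTS <= o.uncertaintyLimit {
		return readWithinUncertainty
	}
	if o.failOnMoreRecent {
		return writeTooOld
	}
	// Newer than the uncertainty limit and no failOnMoreRecent: ignored.
	return ok
}

func main() {
	opts := scanOpts{readTS: 10, uncertaintyLimit: 20}
	fmt.Println(classify(opts, 15)) // inside the uncertainty interval

	opts.failOnMoreRecent = true
	fmt.Println(classify(opts, 15)) // same version, failOnMoreRecent set
}
```

In this model, a version inside the uncertainty interval yields an uncertainty error for an ordinary read, but a write-too-old error once failOnMoreRecent is set, which matches the outcome the commit message describes as desired.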
Based on the specified backports for linked PR #131093, I applied the following new label(s) to this issue: branch-release-24.2. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Based on the specified backports for linked PR #131093, I applied the following new label(s) to this issue: branch-release-24.1. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches.
Based on the specified backports for linked PR #131093, I applied the following new label(s) to this issue: branch-release-23.2. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches.
kv/kvnemesis.TestKVNemesisMultiNode failed with artifacts on release-23.1 @ 5d02fdf2851038279f7544e6271a08e1036a2966:
Parameters:
TAGS=bazel,gss
Help
See also: How To Investigate a Go Test Failure (internal)
Same failure on other branches
Jira issue: CRDB-36279