db: FormatPrePebblev1Marked panics if sstable does not exist #2019
Fix the FormatPrePebblev1Marked migration to tolerate concurrent file deletions by disabling physical deletion of files removed from the LSM until the migration completes. Fix cockroachdb#2019. Informs cockroachdb/cockroach#89755. Informs cockroachdb/cockroach#83079.
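The approach described in this fix can be sketched roughly as below. This is a minimal illustration, not Pebble's actual code: `deletionGate`, `disable`, `enable`, `obsolete`, and `simulateMigration` are hypothetical names chosen for the example, and the file name is made up. The idea is that files dropped from the LSM while deletions are disabled are queued, and only physically removed once the last holder re-enables deletions.

```go
package main

import (
	"fmt"
	"sync"
)

// deletionGate is a hypothetical helper (not Pebble's API) that defers the
// physical removal of obsolete files while deletions are disabled.
type deletionGate struct {
	mu       sync.Mutex
	disabled int          // number of outstanding disable() calls
	pending  []string     // obsolete files waiting for physical deletion
	remove   func(string) // performs the physical deletion
}

// disable prevents physical deletions until a matching enable() call.
func (g *deletionGate) disable() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.disabled++
}

// enable releases one disable() call. When the last holder releases,
// every queued file is physically removed.
func (g *deletionGate) enable() {
	g.mu.Lock()
	if g.disabled == 0 {
		g.mu.Unlock()
		panic("enable called without a matching disable")
	}
	g.disabled--
	var toDelete []string
	if g.disabled == 0 {
		toDelete, g.pending = g.pending, nil
	}
	g.mu.Unlock()
	for _, name := range toDelete {
		g.remove(name)
	}
}

// obsolete is called when a file is removed from the LSM, for example by a
// compaction. The file is queued if deletions are currently disabled.
func (g *deletionGate) obsolete(name string) {
	g.mu.Lock()
	if g.disabled > 0 {
		g.pending = append(g.pending, name)
		g.mu.Unlock()
		return
	}
	g.mu.Unlock()
	g.remove(name)
}

// simulateMigration reports which files were physically removed while the
// simulated migration was running, and which were removed once it finished.
func simulateMigration() (during, after []string) {
	var removed []string
	g := &deletionGate{remove: func(name string) { removed = append(removed, name) }}

	g.disable()              // migration starts scanning sstables
	g.obsolete("000007.sst") // a concurrent compaction drops a file
	during = append([]string(nil), removed...)
	g.enable() // migration completes; deferred deletions run
	after = append([]string(nil), removed...)
	return during, after
}

func main() {
	during, after := simulateMigration()
	fmt.Println("removed during migration:", during)
	fmt.Println("removed after migration:", after)
}
```

With this shape, a migration that opens each sstable it saw in its snapshot of the LSM should still find the file on disk, even if a compaction made it obsolete in the meantime.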
Convo at https://cockroachlabs.slack.com/archives/C01CDD4HRC5/p1666329387677329?thread_ts=1664295784.890119&cid=C01CDD4HRC5 about how there was nothing in logs about the crash reason, requiring @renatolabs to do this:
We will open some follow-up issues soon. It seems there are naked calls to
The details given above are not exactly right, but there is certainly a bug of some kind, given the lack of a crash reason in the logs. See #2039 for the current understanding.
Summary
We've been observing a Pebble crash in CRDB's upgrade tests. Specifically, while the cluster is upgrading from the 22.1 release to the current version (either master or the `22.2.0` release branch), a crash is non-deterministically observed in Pebble during the `RatchetFormatMajorVersion` call. The actual panic happens inside `markFilesPrePebblev1`.
Stack trace:
CRDB test failures:
Notes
`tpcc/mixed-headroom/n5cpu16` as is on master
`tpcc` workload to ramp up before attempting a cluster version upgrade increases the probability (I don't have a good estimate for the probability of failure, but I did notice it happens more frequently).