@snaury snaury commented Feb 26, 2025

Changelog entry

Fixed a rare assertion (process crash) when followers attached to leaders with an inconsistent snapshot. Fixes #15042.

Changelog category

  • Bugfix

Description for reviewers

Followers periodically crashed in production with an assertion about log reordering; the error message suggested they had tried to apply a duplicate redo log entry, which should not have been possible. It turns out that snapshots created within read-only transactions that used QueueScan (e.g. ReadTable and ScanQuery) persisted an incorrect Serial field (a monotonically increasing change number) equal to that of the next transaction. When a follower attached at just the right time, it could bootstrap from such a snapshot and then discover that the next commit had the same Serial, which it treated as a duplicate or reordered change.

Thankfully this didn't affect leaders, since they apply pre-snapshot and post-snapshot redo log entries together and use the snapshot Serial only as a hint about previously compacted changes. So even though the snapshot technically had an inconsistent value, it was self-healing and couldn't produce any externally visible inconsistencies.
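
The follower-side failure can be illustrated with a minimal C++ sketch. It is a toy model, not YDB's code: `TSnapshot`, `TFollower`, `MakeSnapshot` and `FollowerSurvivesNextCommit` are hypothetical names, and the sketch assumes that a correct snapshot records the Serial of the last committed change while the buggy one recorded the Serial of the next transaction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a persisted snapshot (not YDB's actual type).
struct TSnapshot {
    uint64_t Serial; // change number the snapshot claims to include
};

// Hypothetical follower that requires strictly increasing serials.
struct TFollower {
    uint64_t LastSerial = 0;

    // Start from a snapshot and trust its Serial.
    void Bootstrap(const TSnapshot& snapshot) {
        LastSerial = snapshot.Serial;
    }

    // Returns false where the real process would hit the assertion:
    // the commit looks like a duplicate or reordered change.
    bool Apply(uint64_t commitSerial) {
        if (commitSerial <= LastSerial) {
            return false;
        }
        LastSerial = commitSerial;
        return true;
    }
};

// Assumption: the correct value is the last committed serial; the bug
// described above persisted the serial of the next transaction instead.
inline TSnapshot MakeSnapshot(uint64_t lastCommitted, bool buggy) {
    return TSnapshot{buggy ? lastCommitted + 1 : lastCommitted};
}

// Bootstraps a follower from the snapshot, then applies the next commit.
inline bool FollowerSurvivesNextCommit(uint64_t lastCommitted, bool buggy) {
    TFollower follower;
    follower.Bootstrap(MakeSnapshot(lastCommitted, buggy));
    return follower.Apply(lastCommitted + 1);
}
```

With `lastCommitted = 41`, the correct snapshot lets the follower apply commit 42, while the buggy snapshot already claims Serial 42, so the same commit is rejected as a duplicate.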

@snaury snaury self-assigned this Feb 26, 2025

🟢 2025-02-26 10:01:25 UTC The validation of the Pull Request description is successful.


github-actions bot commented Feb 26, 2025

2025-02-26 10:02:48 UTC Pre-commit check linux-x86_64-relwithdebinfo for a6dd13c has started.
2025-02-26 10:03:13 UTC Artifacts will be uploaded here
2025-02-26 10:06:15 UTC ya make is running...
🟡 2025-02-26 11:10:09 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 26437 | 23833 | 0 | 1 | 2464 | 139 |

2025-02-26 11:12:32 UTC ya make is running... (failed tests rerun, try 2)
🟢 2025-02-26 11:31:22 UTC Tests successful.

Test history | Ya make output | Test bloat | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 211 (only retried tests) | 79 | 0 | 0 | 3 | 129 |

🟢 2025-02-26 11:31:29 UTC Build successful.
🟢 2025-02-26 11:31:47 UTC ydbd size 2.1 GiB changed* by +208 Bytes, which is < 100.0 KiB vs main: OK

| ydbd size dash | main: d2f7432 | merge: a6dd13c | diff | diff % |
| --- | --- | --- | --- | --- |
| ydbd size | 2 263 764 888 Bytes | 2 263 765 096 Bytes | +208 Bytes | +0.000% |
| ydbd stripped size | 477 143 000 Bytes | 477 143 064 Bytes | +64 Bytes | +0.000% |

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.


github-actions bot commented Feb 26, 2025

2025-02-26 10:02:51 UTC Pre-commit check linux-x86_64-release-asan for a6dd13c has started.
2025-02-26 10:03:04 UTC Artifacts will be uploaded here
2025-02-26 10:05:54 UTC ya make is running...
🟡 2025-02-26 11:18:49 UTC Some tests failed, follow the links below. This failure is not in the blocking policy yet. Going to retry failed tests...

Test history | Ya make output | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 12100 | 12030 | 0 | 22 | 12 | 36 |

2025-02-26 11:19:57 UTC ya make is running... (failed tests rerun, try 2)
🟡 2025-02-26 11:44:20 UTC Some tests failed, follow the links below. This failure is not in the blocking policy yet. Going to retry failed tests...

Test history | Ya make output | Test bloat | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 110 (only retried tests) | 70 | 0 | 1 | 9 | 30 |

2025-02-26 11:44:29 UTC ya make is running... (failed tests rerun, try 3)
🟡 2025-02-26 11:56:24 UTC Some tests failed, follow the links below. This failure is not in the blocking policy yet.

Test history | Ya make output | Test bloat | Test bloat | Test bloat

| TESTS | PASSED | ERRORS | FAILED | SKIPPED | MUTED? |
| --- | --- | --- | --- | --- | --- |
| 67 (only retried tests) | 34 | 0 | 1 | 1 | 31 |

🟢 2025-02-26 11:56:31 UTC Build successful.
🟢 2025-02-26 11:57:00 UTC ydbd size 3.7 GiB changed* by +320 Bytes, which is < 100.0 KiB vs main: OK

| ydbd size dash | main: d2f7432 | merge: a6dd13c | diff | diff % |
| --- | --- | --- | --- | --- |
| ydbd size | 3 943 200 208 Bytes | 3 943 200 528 Bytes | +320 Bytes | +0.000% |
| ydbd stripped size | 1 374 169 712 Bytes | 1 374 169 840 Bytes | +128 Bytes | +0.000% |

*Please be aware that the difference is based on comparing your commit with the last completed post-commit build; check the comparison.

@snaury snaury requested a review from kungasc February 26, 2025 12:04
@snaury snaury marked this pull request as ready for review February 26, 2025 12:04
@snaury snaury enabled auto-merge (squash) February 26, 2025 12:06
@snaury snaury merged commit 57173d5 into ydb-platform:main Feb 26, 2025
14 checks passed
snaury added a commit to snaury/ydb that referenced this pull request Feb 26, 2025
snaury added a commit to snaury/ydb that referenced this pull request Feb 26, 2025
lberserq pushed a commit to lberserq/ydb that referenced this pull request Mar 3, 2025
@liruoko liruoko added the changelog/f25-3 (PR is included in the changelog) label Nov 19, 2025

Labels

bugfix, changelog/f25-3 (PR is included in the changelog)

Development

Successfully merging this pull request may close these issues.

Follower assertion about log reordering
