Changelog entry
Fix stale read of some acknowledged writes after a table split.
Changelog category
Bugfix
Additional information
Jepsen detected some G-single-item-realtime anomalies, which corresponded to a stale read immediately after a table split. Investigation showed several cases where a single-shard write could be acknowledged to clients by the source shard while destination shards would consider those versions unacknowledged because of lagging mediator time. The underlying races caused several unintended side effects:
- Destination shards could attach to mediator time before they had all the necessary information:
  - Non-repeatable reads: destination shards could select a new write version that was supposed to be frozen by a repeatable snapshot read at their source shard during the split.
  - Stale reads: destination shards could select a new read version that was acknowledged by a single-shard write at their source shard during the split.
- Source shards could reply to writes after destination shards had fully initialized, which could cause stale reads due to mediator time lagging at their corresponding nodes.
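The stale read case above can be illustrated with a deliberately simplified model. It assumes that a shard reads at a version equal to its local mediator time and that a write is visible only if its version does not exceed the read version; the function names and the integer versions are hypothetical and do not come from the actual code.

```python
def pick_read_version(local_mediator_time: int) -> int:
    """Assumption: a shard reads at its current local mediator time."""
    return local_mediator_time


def is_visible(write_version: int, read_version: int) -> bool:
    """Assumption: a write is visible only at or below the read version."""
    return write_version <= read_version


def is_stale_read(acked_write_version: int, destination_mediator_time: int) -> bool:
    """True if a write already acknowledged by the source shard is
    invisible to a read served by a destination shard."""
    read_version = pick_read_version(destination_mediator_time)
    return not is_visible(acked_write_version, read_version)


# The source shard acknowledged a write at version 105, but the destination
# shard's node still sees mediator time 100, so its read misses the write.
lagging = is_stale_read(acked_write_version=105, destination_mediator_time=100)

# Once the destination's mediator time catches up, the write becomes visible.
caught_up = is_stale_read(acked_write_version=105, destination_mediator_time=105)
```

In this toy model the anomaly is purely a matter of the destination acting on a mediator time older than the one the source used when acknowledging the write.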
These issues are fixed in two ways. First, mediator time restore does not start until destination shards have received all snapshots; this ensures destination shards wait for a mediator time that is not less than the latest one source shards could theoretically have observed when they sent their snapshots. Second, source shards no longer send delayed replies after a snapshot is prepared; this ensures destination shards may trust their local mediator time to determine write visibility.
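The two rules can be sketched as a toy model. This is only an illustration under stated assumptions, not the actual datashard code: the class names, fields, and integer timestamps are hypothetical, and the sketch simply withholds delayed replies after snapshot preparation, since the description does not say what happens to them instead.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Hypothetical snapshot sent by a source shard during a split."""
    source_id: int
    # Latest mediator time the source could have observed when sending.
    observed_mediator_time: int


@dataclass
class DestinationShard:
    """Toy destination shard (hypothetical, for illustration only)."""
    expected_sources: set
    received: dict = field(default_factory=dict)
    restore_started: bool = False
    min_mediator_time: int = 0

    def on_snapshot(self, snap: Snapshot) -> None:
        self.received[snap.source_id] = snap
        # Rule 1: start mediator time restore only after every expected
        # snapshot has arrived, and wait for the largest observed time.
        if not self.restore_started and set(self.received) == self.expected_sources:
            self.min_mediator_time = max(
                s.observed_mediator_time for s in self.received.values()
            )
            self.restore_started = True

    def can_serve(self, local_mediator_time: int) -> bool:
        return self.restore_started and local_mediator_time >= self.min_mediator_time


@dataclass
class SourceShard:
    """Toy source shard (hypothetical, for illustration only)."""
    snapshot_prepared: bool = False
    delayed_replies: list = field(default_factory=list)

    def delay_reply(self, reply: str) -> None:
        self.delayed_replies.append(reply)

    def prepare_snapshot(self) -> None:
        self.snapshot_prepared = True

    def flush_delayed_replies(self) -> list:
        """Return the replies actually sent to clients."""
        pending, self.delayed_replies = self.delayed_replies, []
        # Rule 2: after the snapshot is prepared, delayed replies are not sent.
        return [] if self.snapshot_prepared else pending


dst = DestinationShard(expected_sources={1, 2})
dst.on_snapshot(Snapshot(source_id=1, observed_mediator_time=100))
started_early = dst.restore_started  # still waiting for source 2
dst.on_snapshot(Snapshot(source_id=2, observed_mediator_time=120))
```

With both rules in place, a destination in this model never serves requests at a mediator time older than anything its sources could have used to acknowledge a write.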
⚪ 2024-02-27 15:57:09 UTC Pre-commit check for 1d8796c has started.
⚪ 2024-02-27 15:57:10 UTC Build linux-x86_64-release-cmake14 is running...
🟢 2024-02-27 15:59:33 UTC Build successful.
⚪ 2024-02-27 15:59:36 UTC Pre-commit check for 1d8796c has started.
⚪ 2024-02-27 15:59:40 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-27 16:02:10 UTC Build successful.
⚪ 2024-02-27 16:02:23 UTC Tests are running...
🔴 2024-02-27 17:23:17 UTC Some tests failed, follow the links below.
⚪ 2024-02-27 16:04:16 UTC Pre-commit check for 1d8796c has started.
⚪ 2024-02-27 16:04:18 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-27 16:06:50 UTC Build successful.
⚪ 2024-02-27 16:06:59 UTC Tests are running...
🔴 2024-02-27 17:48:43 UTC Some tests failed, follow the links below.
⚪ 2024-02-28 13:56:49 UTC Pre-commit check for 73264da has started.
⚪ 2024-02-28 13:56:51 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-28 14:12:23 UTC Build successful.
⚪ 2024-02-28 14:12:35 UTC Tests are running...
🔴 2024-02-28 15:54:23 UTC Some tests failed, follow the links below.
⚪ 2024-02-28 13:56:50 UTC Pre-commit check for 73264da has started.
⚪ 2024-02-28 13:56:52 UTC Build linux-x86_64-release-cmake14 is running...
🟢 2024-02-28 14:20:46 UTC Build successful.
⚪ 2024-02-28 13:57:52 UTC Pre-commit check for 73264da has started.
⚪ 2024-02-28 13:57:54 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-28 14:10:26 UTC Build successful.
⚪ 2024-02-28 14:10:38 UTC Tests are running...
🔴 2024-02-28 15:42:12 UTC Some tests failed, follow the links below.
Partially fixes KIKIMR-21065.