Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix stale read of some acknowledged writes after a table split #2286

Merged
merged 2 commits into from
Feb 29, 2024

Conversation

snaury
Copy link
Member

@snaury snaury commented Feb 27, 2024

Changelog entry

Fix stale read of some acknowledged writes after a table split.

Changelog category

  • Bugfix

Additional information

Some G-single-item-realtime anomalies were detected with Jepsen, which corresponded to a stale read immediately after a table split. Investigation showed several cases where a single-shard write could be acknowledged to clients by source shard, when destination shards would consider those versions unacknowledged due to lagging mediator time. The underlying races caused several unintended side-effects:

  • Destination shards could attach to mediator time before they had all the necessary information:
    • Non-repeatable reads: destination shards could select a new write version which was supposed to be frozen by a repeatable snapshot read at their source shard during split
    • Stale reads: destination shards could select a new read version which was acknowledged by a single-shard write at their source shard during split
  • Source shards could reply to writes after destination shards have fully initialized, which could cause stale reads due to mediator time lagging at their corresponding nodes

These issues are fixed by not starting mediator time restore until all snapshots are received by destination shards (this ensures destination shards await mediator time which is not less than the last theoretically observed by source shards at the time they sent their snapshots), and not sending delayed replies after a snapshot is prepared by source shards (this ensures destination shards may trust their local mediator time to determine write visibility).

Partially fixes KIKIMR-21065.

Copy link

github-actions bot commented Feb 27, 2024

2024-02-27 15:57:09 UTC Pre-commit check for 1d8796c has started.
2024-02-27 15:57:10 UTC Build linux-x86_64-release-cmake14 is running...
🟢 2024-02-27 15:59:33 UTC Build successful.

@snaury snaury requested a review from CyberROFL February 27, 2024 15:57
@snaury snaury marked this pull request as ready for review February 27, 2024 15:57
Copy link

github-actions bot commented Feb 27, 2024

2024-02-27 15:59:36 UTC Pre-commit check for 1d8796c has started.
2024-02-27 15:59:40 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-27 16:02:10 UTC Build successful.
2024-02-27 16:02:23 UTC Tests are running...
🔴 2024-02-27 17:23:17 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
67969 57061 0 3 10883 22

Copy link

github-actions bot commented Feb 27, 2024

2024-02-27 16:04:16 UTC Pre-commit check for 1d8796c has started.
2024-02-27 16:04:18 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-27 16:06:50 UTC Build successful.
2024-02-27 16:06:59 UTC Tests are running...
🔴 2024-02-27 17:48:43 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14877 14715 0 35 101 26

CyberROFL
CyberROFL previously approved these changes Feb 27, 2024
@snaury snaury marked this pull request as draft February 28, 2024 09:02
@snaury
Copy link
Member Author

snaury commented Feb 28, 2024

The Cdc.ResolvedTimestamps test exposed a TxInit race at split dst, need to find a proper non-racy way to initialize post split.

Copy link

github-actions bot commented Feb 28, 2024

2024-02-28 13:56:49 UTC Pre-commit check for 73264da has started.
2024-02-28 13:56:51 UTC Build linux-x86_64-release-asan is running...
🟢 2024-02-28 14:12:23 UTC Build successful.
2024-02-28 14:12:35 UTC Tests are running...
🔴 2024-02-28 15:54:23 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
14879 14676 0 37 120 46

Copy link

github-actions bot commented Feb 28, 2024

2024-02-28 13:56:50 UTC Pre-commit check for 73264da has started.
2024-02-28 13:56:52 UTC Build linux-x86_64-release-cmake14 is running...
🟢 2024-02-28 14:20:46 UTC Build successful.

Copy link

github-actions bot commented Feb 28, 2024

2024-02-28 13:57:52 UTC Pre-commit check for 73264da has started.
2024-02-28 13:57:54 UTC Build linux-x86_64-relwithdebinfo is running...
🟢 2024-02-28 14:10:26 UTC Build successful.
2024-02-28 14:10:38 UTC Tests are running...
🔴 2024-02-28 15:42:12 UTC Some tests failed, follow the links below.

Test history

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
68022 57071 0 9 10890 52

@snaury snaury marked this pull request as ready for review February 29, 2024 08:26
@snaury
Copy link
Member Author

snaury commented Feb 29, 2024

Failing tests seem to be all KV/PQ, unrelated to datashards.

@snaury snaury requested a review from CyberROFL February 29, 2024 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants