branch-3.1: [fix](move-memtable) fix segment number mismatch for erroneously skipped segments #55092 #55471
…ped segments (#55092)

### What problem does this PR solve?

Fix the segment number mismatch caused by erroneously skipped segments during concurrent incremental open on an auto-partitioned table.

#### Problem

During concurrent incremental open on an auto-partitioned table, one sink may incorrectly assume that streams being opened by another sink are already open, and begin writing data while those streams are still being opened. As a result, some segments are silently skipped, which produces a segment number mismatch.

For example (two instances, 4 BEs: a, b, c, d):

| Time | Event |
| ---- | ----- |
| t0 | `sink1` and `sink2` start incremental open for BEs **a, b, c, d**. |
| t1 | `sink1` adds **a, b, c** to `_load_stream_map` and initiates open. |
| t2 | `sink2` adds **d** to `_load_stream_map` and initiates open. |
| t3 | `sink1` completes open for **a** and **b**; **c** is still in progress. |
| t4 | `sink2` successfully opens **d**, assumes **a, b, c** are **all** ready, and starts writing. Because **c** is not yet fully open, its segments are skipped, causing the mismatch. |

#### Expected behavior

A sink must wait until all streams it depends on are fully opened before starting any write.

#### Proposed fix

All sinks open the full set of streams (a, b, c, d) instead of a partial subset. A lock on each stream guarantees that:

- Duplicate open attempts are prevented: only the first sink performs the actual open; subsequent sinks wait until the open is complete.
- Expected behavior is preserved: every sink waits until all streams are fully opened before starting any write, eliminating skipped segments and the resulting segment number mismatch.

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
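
For readers who want the gist of the proposed fix without reading the diff, here is a minimal sketch of the locking idea from the description above. It is not the Doris implementation: `StreamStub`, `sink_open_all`, `run_two_sinks`, and the simulated 10 ms open delay are all hypothetical names and values chosen for illustration. Each sink opens the full set of streams; a per-stream mutex makes `open()` idempotent, so the first caller does the real work and any concurrent caller blocks until the stream is actually open.

```cpp
#include <cassert>
#include <chrono>
#include <initializer_list>
#include <map>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a load stream to one BE (not the real Doris class).
class StreamStub {
public:
    // Idempotent open: the first caller performs the open while holding the
    // lock; concurrent callers block on the lock and return once it is open.
    void open() {
        std::lock_guard<std::mutex> guard(_mutex);
        if (_is_open) return;
        // Simulated slow open (assumed delay, stands in for the real RPC).
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        ++_open_count;
        _is_open = true;
    }
    bool is_open() const {
        std::lock_guard<std::mutex> guard(_mutex);
        return _is_open;
    }
    int open_count() const {
        std::lock_guard<std::mutex> guard(_mutex);
        return _open_count;
    }

private:
    mutable std::mutex _mutex;
    bool _is_open = false;
    int _open_count = 0;
};

using StreamMap = std::map<char, std::shared_ptr<StreamStub>>;

// Every sink opens ALL streams, not just the ones it added itself.
// Returns true if every stream is open at the point the sink would start writing.
bool sink_open_all(const StreamMap& streams) {
    for (const auto& entry : streams) entry.second->open();
    for (const auto& entry : streams) {
        if (!entry.second->is_open()) return false;
    }
    return true;
}

struct DemoResult {
    bool sink1_ready;
    bool sink2_ready;
    int total_opens;
};

// Two sinks race to open the same four streams (BEs a, b, c, d).
DemoResult run_two_sinks() {
    StreamMap streams;
    for (char be : {'a', 'b', 'c', 'd'}) streams[be] = std::make_shared<StreamStub>();
    assert(streams.size() == 4);

    bool ready1 = false, ready2 = false;
    std::thread sink1([&] { ready1 = sink_open_all(streams); });
    std::thread sink2([&] { ready2 = sink_open_all(streams); });
    sink1.join();
    sink2.join();

    int total = 0;
    for (const auto& entry : streams) total += entry.second->open_count();
    return {ready1, ready2, total};
}
```

Calling `run_two_sinks()` should report both sinks as ready and exactly four actual opens, one per stream, regardless of how the two threads interleave. Holding the lock for the whole open is the simplest way to make late callers wait; a real implementation may prefer a condition variable or a future so the lock is not held across the network call.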
run buildall
TPC-H: Total hot run time: 32730 ms
TPC-DS: Total hot run time: 192710 ms
ClickBench: Total hot run time: 28.28 s
BE UT Coverage Report: Increment line coverage, Increment coverage report
BE Regression && UT Coverage Report: Increment line coverage, Increment coverage report
Cherry-picked from #55092