[bugfix](compaction) Fix the issue where input rowsets are prematurely evicted after compaction, causing query failures #55382
run buildall
TPC-H: Total hot run time: 34744 ms
TPC-DS: Total hot run time: 187983 ms
ClickBench: Total hot run time: 32.43 s

run buildall
TPC-H: Total hot run time: 34198 ms
TPC-DS: Total hot run time: 186779 ms
ClickBench: Total hot run time: 33.87 s

run buildall
TPC-H: Total hot run time: 34145 ms
TPC-DS: Total hot run time: 188399 ms
ClickBench: Total hot run time: 33.94 s

Force-pushed: ebc93a2 to 01c045f
run buildall

Force-pushed: cb13aa4 to 913b264
run buildall

Force-pushed: 6f7e262 to 98498b9
run buildall
TPC-H: Total hot run time: 34780 ms
TPC-DS: Total hot run time: 189354 ms
ClickBench: Total hot run time: 30.29 s
dataroaring
left a comment
LGTM
PR approved by at least one committer and no changes requested.
…y evicted after compaction, causing query failures (#55382)
…y evicted after compaction, causing query failures (apache#55382) (apache#5870) pick:apache#55382 Co-authored-by: lw112 <131352377+felixwluo@users.noreply.github.com>
…y evicted after compaction, causing query failures (apache#55382) (apache#5871) pick:apache#55382 Co-authored-by: lw112 <131352377+felixwluo@users.noreply.github.com>
What problem does this PR solve?
Problem Summary:
1. Problem background
There is a critical bug in Doris's compaction: after input rowsets participate in a compaction, their expiration time is computed from the rowset's creation time (creation_time) instead of the compaction completion time.
2. Scenario
a. After compaction completes, the input rowsets should be kept for another tablet_rowset_stale_sweep_time_sec before being discarded.
b. Because the expiration check uses the creation time, the rowsets are evicted immediately.
c. A query that is still executing then fails with: [E-230]fail to find path in version_graph. spec_version: 0-1789 versions are already compacted
3. Cause
a. In the current implementation, TimestampedVersion is created using rs->creation_time().
b. The eviction check is: `rowset_creation_time <= (current_time - tablet_rowset_stale_sweep_time_sec)`
c. A rowset that was created long ago satisfies this condition as soon as it becomes stale, so it is discarded immediately even if it has only just participated in compaction.
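The difference between the two timestamps can be illustrated with a minimal sketch. It is not the Doris implementation: `StaleRowset`, `is_expired`, `expired_by_creation_time`, and `expired_by_compaction_time` are hypothetical names introduced here, and only the comparison itself is taken from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a stale rowset entry; not the actual Doris type.
struct StaleRowset {
    int64_t creation_time;   // seconds since epoch when the rowset was written
    int64_t compacted_time;  // seconds since epoch when compaction made it stale
};

// Sweep rule from the description: an entry is expired when
// timestamp <= current_time - sweep_sec.
inline bool is_expired(int64_t timestamp, int64_t now, int64_t sweep_sec) {
    assert(sweep_sec >= 0);
    return timestamp <= now - sweep_sec;
}

// Behaviour described as the bug: the check is driven by the creation time,
// so an old rowset expires the moment it becomes stale.
inline bool expired_by_creation_time(const StaleRowset& rs, int64_t now, int64_t sweep_sec) {
    return is_expired(rs.creation_time, now, sweep_sec);
}

// Intended behaviour (assumed from the description): the check is driven by
// the compaction completion time, so the rowset survives one full sweep interval.
inline bool expired_by_compaction_time(const StaleRowset& rs, int64_t now, int64_t sweep_sec) {
    return is_expired(rs.compacted_time, now, sweep_sec);
}
```

With a rowset created at t=1000 and compacted at t=10000, and a sweep interval of 300 seconds, the creation-time check already reports it as expired at t=10001, while the compaction-time check keeps it until t=10300.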
Release note
None