Skip to content

Conversation

@bobhan1
Copy link
Contributor

@bobhan1 bobhan1 commented Sep 5, 2025

pick #55630

…ache#55630)

Currently, `MetaServiceImpl::get_rowset` use `calc_sync_versions` to
eliminate unnecessary version ranges when BE sync rowset metas. One of
the optimizations is as the following:
```cpp
std::vector<std::pair<int64_t, int64_t>> calc_sync_versions(int64_t req_bc_cnt, int64_t bc_cnt,
                                                            int64_t req_cc_cnt, int64_t cc_cnt,
                                                            int64_t req_cp, int64_t cp,
                                                            int64_t req_start, int64_t req_end) {
    // ...
    if (req_cc_cnt < cc_cnt) {
        Version cc_version;
        if (req_cp < cp && req_cc_cnt + 1 == cc_cnt) {
            // * only one CC happened and CP changed
            // BE  [=][=][=][=][=====][=][=]
            //                  ^~~~~ req_cp
            // MS  [=][=][=][=][xxxxxxxxxxxxxx][=======][=][=]
            //                                  ^~~~~~~ ms_cp
            //                  ^____________^ related_versions: [req_cp, ms_cp - 1]
            //
            cc_version = {req_cp, cp - 1};
        } else {
    // ...
}
```
This optimization replies on the assumption that only cumulative
compaction will change the cumulative point. However, full compaction
can also change the cumulative point, which breaks the above replied
assumption. This will cause data correctness problem in multi-cluster
environment because it will make the tablet failed to sync some rowset
metas forever.

A data correctness problem has been observed in the following
situaitions:

1. For a certain tablet, base_compaction_cnt=14,
cumulative_compaction_cnt=804, cumu_point=7458.
On node A of the write cluster (cluster 0), a full compaction of
[2-7464] and a cumulative compaction of [7465-7486] were performed. The
stats then became base_compaction_cnt=15, cumulative_compaction_cnt=805,
cumu_point=7465.
2. On node B of the read cluster (cluster 1), during sync_rowset, we
have:
req_base_compaction_cnt=14, base_compaction_cnt=15,
req_cumulative_compaction_cnt=804, cumulative_compaction_cnt=805,
req_cp=7458, cp=7465,
req_start=7487, req_end=int_max.
3. calc_sync_version computes that the rowsets to be pulled are [0-7464]
and [7487-int_max], but it misses the rowset [7465-7486] produced by
cumulative compaction.
4. Moreover, since the max_version of the tablet on cluster 1 node B has
been updated, subsequent sync_rowset operations will also not pull the
rowset [7465-7486].
5. This causes duplicate keys problem on MOW table because new rowset
will generate delete bitmap marks on [7465-7486].

---
This PR forbids the above optimization when full compaction cnt is
changed.

None

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
- [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
@bobhan1 bobhan1 requested a review from morrySnow as a code owner September 5, 2025 06:16
@Thearas
Copy link
Contributor

Thearas commented Sep 5, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@bobhan1
Copy link
Contributor Author

bobhan1 commented Sep 5, 2025

run buildall

@doris-robot
Copy link

Cloud UT Coverage Report

Increment line coverage 66.67% (6/9) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 82.02% (1218/1485)
Line Coverage 65.97% (21831/33094)
Region Coverage 67.44% (10984/16286)
Branch Coverage 57.01% (5798/10170)

@hello-stephen
Copy link
Contributor

FE UT Coverage Report

Increment line coverage `` 🎉
Increment coverage report
Complete coverage report

@doris-robot
Copy link

TPC-H: Total hot run time: 32827 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a74ee19aa84f6e22176ec1a0ed2cf1ba4955e70a, data reload: false

------ Round 1 ----------------------------------
q1	17695	5586	5890	5586
q2	2039	404	291	291
q3	11938	1277	736	736
q4	10298	893	448	448
q5	9117	2436	2131	2131
q6	182	171	136	136
q7	909	757	615	615
q8	9349	1483	1176	1176
q9	5368	5015	4940	4940
q10	6783	2266	1802	1802
q11	463	279	260	260
q12	359	357	209	209
q13	17781	3609	3009	3009
q14	220	228	211	211
q15	523	469	471	469
q16	424	435	388	388
q17	572	876	363	363
q18	7211	6362	6399	6362
q19	2095	966	539	539
q20	323	328	199	199
q21	2773	2129	1950	1950
q22	1064	1035	1007	1007
Total cold run time: 107486 ms
Total hot run time: 32827 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5655	5486	5530	5486
q2	237	333	240	240
q3	2260	2590	2316	2316
q4	1378	1774	1380	1380
q5	4380	5103	5035	5035
q6	173	164	127	127
q7	2073	1983	1818	1818
q8	2623	2852	2696	2696
q9	7276	7231	7248	7231
q10	3012	3194	2733	2733
q11	575	501	491	491
q12	682	749	591	591
q13	3465	3787	3152	3152
q14	285	288	279	279
q15	516	470	465	465
q16	429	495	432	432
q17	1224	1760	1262	1262
q18	7723	7671	7388	7388
q19	782	1149	1230	1149
q20	2045	2051	1903	1903
q21	5203	4810	4417	4417
q22	1113	1079	1027	1027
Total cold run time: 53109 ms
Total hot run time: 51618 ms

@doris-robot
Copy link

BE UT Coverage Report

Increment line coverage 42.31% (11/26) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.55% (12756/28005)
Line Coverage 36.40% (113743/312502)
Region Coverage 34.02% (65056/191231)
Branch Coverage 31.04% (34133/109956)

@doris-robot
Copy link

TPC-DS: Total hot run time: 192921 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a74ee19aa84f6e22176ec1a0ed2cf1ba4955e70a, data reload: false

query1	963	385	382	382
query2	6220	1926	1869	1869
query3	8690	202	205	202
query4	33758	23522	23514	23514
query5	3701	600	455	455
query6	291	194	180	180
query7	4208	479	306	306
query8	305	235	239	235
query9	9537	2604	2605	2604
query10	453	329	257	257
query11	18186	15439	15250	15250
query12	166	114	108	108
query13	1556	549	425	425
query14	9669	7140	6707	6707
query15	233	200	193	193
query16	8068	626	452	452
query17	1587	761	585	585
query18	2130	427	335	335
query19	243	190	178	178
query20	137	126	119	119
query21	205	128	113	113
query22	4693	4767	4568	4568
query23	35795	34147	34291	34147
query24	7812	2647	2716	2647
query25	488	489	434	434
query26	844	267	177	177
query27	2249	483	363	363
query28	5318	2275	2238	2238
query29	597	585	477	477
query30	254	189	170	170
query31	1011	931	841	841
query32	86	61	57	57
query33	511	395	309	309
query34	725	867	532	532
query35	779	825	757	757
query36	1047	1056	977	977
query37	103	93	70	70
query38	4006	4077	3951	3951
query39	1536	1507	1542	1507
query40	207	114	104	104
query41	49	52	47	47
query42	123	106	106	106
query43	504	523	483	483
query44	1373	849	839	839
query45	189	178	168	168
query46	876	1072	682	682
query47	2028	2048	1998	1998
query48	434	424	349	349
query49	755	477	403	403
query50	669	707	441	441
query51	7340	7292	7293	7292
query52	98	104	92	92
query53	231	267	195	195
query54	547	539	479	479
query55	77	81	80	80
query56	277	294	263	263
query57	1275	1291	1222	1222
query58	224	219	236	219
query59	3045	3178	3088	3088
query60	302	301	259	259
query61	137	116	112	112
query62	808	744	694	694
query63	235	192	190	190
query64	3648	1011	672	672
query65	3403	3282	3328	3282
query66	893	411	313	313
query67	16846	16053	15731	15731
query68	7528	821	559	559
query69	497	302	273	273
query70	1163	1163	1103	1103
query71	432	310	266	266
query72	5108	3740	3849	3740
query73	648	746	358	358
query74	10658	9477	9187	9187
query75	3564	3173	2683	2683
query76	3555	1209	760	760
query77	761	375	276	276
query78	10427	10461	9590	9590
query79	2887	916	612	612
query80	743	511	429	429
query81	489	258	224	224
query82	352	123	92	92
query83	166	165	141	141
query84	276	100	81	81
query85	732	352	293	293
query86	358	328	295	295
query87	4330	4302	4273	4273
query88	3468	2472	2423	2423
query89	411	338	304	304
query90	1934	190	189	189
query91	138	144	126	126
query92	65	56	49	49
query93	1983	864	557	557
query94	639	379	305	305
query95	350	280	271	271
query96	490	623	286	286
query97	3203	3289	3149	3149
query98	216	210	200	200
query99	1506	1416	1333	1333
Total cold run time: 293830 ms
Total hot run time: 192921 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 29.23 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a74ee19aa84f6e22176ec1a0ed2cf1ba4955e70a, data reload: false

query1	0.03	0.04	0.03
query2	0.08	0.04	0.04
query3	0.23	0.06	0.05
query4	1.64	0.08	0.08
query5	0.53	0.49	0.50
query6	1.13	0.74	0.74
query7	0.03	0.02	0.01
query8	0.05	0.05	0.04
query9	0.60	0.51	0.50
query10	0.56	0.55	0.55
query11	0.15	0.12	0.12
query12	0.15	0.13	0.13
query13	0.63	0.60	0.59
query14	0.80	0.80	0.78
query15	0.86	0.83	0.85
query16	0.38	0.37	0.38
query17	1.06	1.08	1.06
query18	0.18	0.19	0.21
query19	1.92	1.88	1.85
query20	0.02	0.02	0.01
query21	15.36	0.96	0.67
query22	0.78	0.76	0.68
query23	14.80	1.51	0.70
query24	2.24	0.34	0.23
query25	0.15	0.09	0.09
query26	0.28	0.19	0.18
query27	0.08	0.08	0.08
query28	13.44	1.22	0.55
query29	12.67	4.05	3.30
query30	0.25	0.08	0.06
query31	2.83	0.60	0.41
query32	3.23	0.58	0.49
query33	3.01	3.06	3.06
query34	16.37	5.29	4.57
query35	4.66	4.61	4.60
query36	0.61	0.48	0.48
query37	0.20	0.16	0.17
query38	0.17	0.15	0.15
query39	0.06	0.04	0.05
query40	0.16	0.14	0.13
query41	0.11	0.06	0.05
query42	0.07	0.05	0.05
query43	0.05	0.04	0.04
Total cold run time: 102.61 s
Total hot run time: 29.23 s

Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (26/26) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 76.47% (21062/27544)
Line Coverage 69.80% (217440/311528)
Region Coverage 67.71% (130010/192008)
Branch Coverage 61.20% (67603/110470)

@bobhan1
Copy link
Contributor Author

bobhan1 commented Sep 5, 2025

run p0

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (26/26) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 76.47% (21062/27544)
Line Coverage 69.79% (217420/311528)
Region Coverage 67.70% (129989/192008)
Branch Coverage 61.18% (67585/110470)

@morrySnow morrySnow changed the title branch-3.1: [Fix](cloud) calc_sync_versions should consider full compaction (#55630) branch-3.1: [Fix](cloud) calc_sync_versions should consider full compaction #55630 Sep 5, 2025
@morrySnow morrySnow merged commit b5602c6 into apache:branch-3.1 Sep 5, 2025
22 of 23 checks passed
@morrySnow morrySnow mentioned this pull request Sep 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants