Conversation

@github-actions
Contributor

Cherry-picked from #45206

…ck be able to be in different fdb txns (#45206)

#37670 made the meta service return the tablet compaction stats along with
the `getDeleteBitmapUpdateLockResponse` to FE, so that the BE knows whether
it should `sync_rowsets()` because compaction succeeded on other BEs for the
same tablet. That PR reads the tablets' stats and writes the delete bitmap
update lock KV in a single fdb txn to make the two operations atomic.
However, when a load involves a large number of tablets, reading the
tablets' stats may take longer than the fdb txn's 5-second limit and cause
a `TXN_TOO_OLD` error.

This PR rearranges the process so that the tablet stats no longer have to
be read in the same fdb txn as the one that updates `lock_info.lock_id`.
In detail, the steps are:
1. Acquire the delete bitmap update lock in MS (write the delete bitmap
update lock KV).
2. Read the tablets' stats to get the compaction counts.
3. Check whether the delete bitmap update lock is still held by the current
load.
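
The three steps can be sketched as follows. This is a minimal, illustrative model in Python, not the meta service code: `FakeMetaStore`, `acquire_lock_and_read_stats`, the status strings, the `batch_size` parameter, and the `after_each_batch` hook are all hypothetical names, and the in-memory dictionaries only stand in for the KVs that the real implementation keeps in fdb. Each numbered step is assumed to run in its own short transaction.

```python
from typing import Callable, Dict, List, Optional, Tuple


class FakeMetaStore:
    """Hypothetical in-memory stand-in for the meta service KV store."""

    def __init__(self) -> None:
        self.lock_holder: Dict[int, int] = {}       # table_id -> lock_id
        self.compaction_count: Dict[int, int] = {}  # tablet_id -> compaction count


def acquire_lock_and_read_stats(
    store: FakeMetaStore,
    table_id: int,
    lock_id: int,
    tablet_ids: List[int],
    batch_size: int = 2,
    after_each_batch: Optional[Callable[[FakeMetaStore], None]] = None,
) -> Tuple[str, Optional[Dict[int, int]]]:
    # Step 1 (own txn): write the delete bitmap update lock KV.
    holder = store.lock_holder.get(table_id)
    if holder is not None and holder != lock_id:
        return "LOCK_CONFLICT", None
    store.lock_holder[table_id] = lock_id

    # Step 2 (separate txn(s)): read the tablets' stats in batches, so that
    # no single transaction has to cover every tablet of a large load.
    stats: Dict[int, int] = {}
    for start in range(0, len(tablet_ids), batch_size):
        for tablet_id in tablet_ids[start:start + batch_size]:
            stats[tablet_id] = store.compaction_count.get(tablet_id, 0)
        if after_each_batch is not None:
            # Test hook: lets a caller simulate concurrent activity.
            after_each_batch(store)

    # Step 3 (own txn): the stats are only returned if the lock is still
    # held by this load; otherwise they may be stale and are discarded.
    if store.lock_holder.get(table_id) != lock_id:
        return "LOCK_EXPIRED", None
    return "OK", stats
```

In this sketch the re-check in step 3 is what replaces the atomicity of the original single transaction: if the lock changed hands while the stats were being read, the caller gets an error instead of possibly outdated compaction counts.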
@Thearas
Contributor

Thearas commented Dec 17, 2024

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Dec 17, 2024
@Thearas
Contributor

Thearas commented Dec 17, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 40329 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 91fb7d8ebdf7e48580df14c6c9e713348c6ae587, data reload: false

------ Round 1 ----------------------------------
q1	17577	7433	7264	7264
q2	2054	187	160	160
q3	10701	1089	1177	1089
q4	10557	744	714	714
q5	7720	2824	2779	2779
q6	236	145	141	141
q7	972	605	601	601
q8	9567	1869	1999	1869
q9	6931	6365	6387	6365
q10	6996	2330	2351	2330
q11	469	254	256	254
q12	398	210	207	207
q13	17780	2947	2946	2946
q14	240	224	223	223
q15	546	520	528	520
q16	672	619	601	601
q17	962	588	531	531
q18	7214	6523	6491	6491
q19	1395	975	1045	975
q20	494	202	197	197
q21	3872	3105	3190	3105
q22	1063	967	987	967
Total cold run time: 108416 ms
Total hot run time: 40329 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7344	7299	7197	7197
q2	327	235	245	235
q3	2857	2816	2875	2816
q4	1990	1819	1785	1785
q5	5633	5681	5720	5681
q6	218	141	140	140
q7	2172	1782	1809	1782
q8	3321	3497	3489	3489
q9	8697	8841	8816	8816
q10	3527	3458	3501	3458
q11	604	517	495	495
q12	804	589	602	589
q13	16064	3138	3143	3138
q14	311	272	283	272
q15	583	537	511	511
q16	734	652	664	652
q17	1852	1602	1587	1587
q18	8238	7808	7598	7598
q19	8639	1590	1597	1590
q20	2106	1835	1821	1821
q21	5448	5307	5258	5258
q22	1063	1000	994	994
Total cold run time: 82532 ms
Total hot run time: 59904 ms

@dataroaring dataroaring merged commit aa49691 into branch-3.0 Dec 26, 2024
21 of 24 checks passed
@github-actions github-actions bot deleted the auto-pick-45206-branch-3.0 branch December 26, 2024 01:26
suxiaogang223 pushed a commit to suxiaogang223/doris that referenced this pull request Jul 15, 2025
…pdate delete bitmap update lock be able to be in different fdb txns apache#45206 (apache#45559) (apache#3910)

5 participants