@kaka11chen
Contributor

What problem does this PR solve?

Problem Summary:

Release note

[opt] (multi-catalog) Optimize remote scan concurrency.

  1. Use ScannerScheduler::get_remote_scan_thread_num() instead of config::doris_scanner_thread_pool_thread_num when calculating the maximum number of scanners for external tables.
  2. Remove the parallel_scan_max_scanners_count calculation logic.
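
The first item can be pictured with a minimal sketch. It is not the code from this PR: `ScanThreadSettings`, `scan_thread_basis`, and `max_scanners` are hypothetical names, the two integer fields merely stand in for `config::doris_scanner_thread_pool_thread_num` and the value returned by `ScannerScheduler::get_remote_scan_thread_num()`, and the cap by scan range count is an assumption made only so the example has something to compute. It also assumes the non-external case keeps using the config value.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-ins for the two values named in the PR description.
struct ScanThreadSettings {
    int local_scan_thread_num;   // stands in for config::doris_scanner_thread_pool_thread_num
    int remote_scan_thread_num;  // stands in for ScannerScheduler::get_remote_scan_thread_num()
};

// Pick the thread count that the scanner limit is based on.
// External (remote) tables use the remote scan thread count;
// everything else is assumed to keep using the local config value.
int scan_thread_basis(const ScanThreadSettings& s, bool is_external_table) {
    assert(s.local_scan_thread_num > 0 && s.remote_scan_thread_num > 0);
    return is_external_table ? s.remote_scan_thread_num : s.local_scan_thread_num;
}

// Hypothetical cap (an assumption, not taken from the PR): do not start
// more scanners than there are scan ranges or available threads.
int max_scanners(const ScanThreadSettings& s, bool is_external_table, int scan_range_count) {
    assert(scan_range_count >= 0);
    return std::min(scan_thread_basis(s, is_external_table), scan_range_count);
}
```

With illustrative settings of 48 local and 512 remote threads and 100 scan ranges, the external-table case would be limited by the 100 ranges, while the local case would be limited by the 48 threads.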

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    1. set enable_profile=true; set profile_level=2;
    2. Run the SQL and check MaxScanConcurrency in the query profile.
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34374 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 797a633f654132000cc21fa75d6b2a426223cc13, data reload: false

------ Round 1 ----------------------------------
q1	27214	5134	5115	5115
q2	2006	283	184	184
q3	10379	1230	689	689
q4	10230	1000	544	544
q5	7555	2387	2371	2371
q6	185	165	131	131
q7	930	741	636	636
q8	9325	1308	1101	1101
q9	6941	5287	5142	5142
q10	6874	2363	1895	1895
q11	504	291	278	278
q12	361	353	220	220
q13	18158	3774	3139	3139
q14	230	234	214	214
q15	568	479	488	479
q16	428	444	382	382
q17	617	864	373	373
q18	7494	7282	7171	7171
q19	1641	964	569	569
q20	339	345	238	238
q21	3938	3208	2439	2439
q22	1064	1109	1064	1064
Total cold run time: 116981 ms
Total hot run time: 34374 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5205	5118	5074	5074
q2	252	335	223	223
q3	2303	2782	2473	2473
q4	1449	2003	1511	1511
q5	4826	4659	4617	4617
q6	219	170	126	126
q7	2083	1983	1828	1828
q8	2637	2687	2530	2530
q9	7480	7171	6952	6952
q10	3042	3180	2765	2765
q11	578	525	482	482
q12	661	772	613	613
q13	3554	3925	3303	3303
q14	294	306	287	287
q15	528	484	494	484
q16	432	472	444	444
q17	1138	1581	1341	1341
q18	7750	7614	7403	7403
q19	813	845	927	845
q20	2020	1997	1817	1817
q21	4824	4533	4438	4438
q22	1092	1111	1051	1051
Total cold run time: 53180 ms
Total hot run time: 50607 ms
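
The bot does not label the columns in the TPC-H tables above. Judging from the numbers alone, the first column appears to be the cold run, the next two are hot runs, and the last is the smaller of the two hot runs; the first and last columns add up to the reported cold and hot totals. The sketch below reproduces that arithmetic under this reading. `QueryTiming`, `Totals`, and `summarize` are hypothetical names and are not part of the tpch-tools scripts.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One row of the result table, in milliseconds (column meaning is inferred).
struct QueryTiming {
    int cold_ms;  // first column: cold run
    int hot1_ms;  // second column: first hot run
    int hot2_ms;  // third column: second hot run
};

struct Totals {
    long cold_ms;
    long hot_ms;
};

// Sum the cold runs, and sum the better (smaller) of the two hot runs per query.
Totals summarize(const std::vector<QueryTiming>& rows) {
    Totals t{0, 0};
    for (const QueryTiming& r : rows) {
        assert(r.cold_ms >= 0 && r.hot1_ms >= 0 && r.hot2_ms >= 0);
        t.cold_ms += r.cold_ms;
        t.hot_ms += std::min(r.hot1_ms, r.hot2_ms);
    }
    return t;
}
```

For example, feeding in only the Round 1 rows for q6, q14, and q15 (185/165/131, 230/234/214, 568/479/488) gives a cold total of 983 ms and a hot total of 824 ms.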

@doris-robot

TPC-DS: Total hot run time: 194476 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 797a633f654132000cc21fa75d6b2a426223cc13, data reload: false

query1	1416	1123	1081	1081
query2	6199	1933	1916	1916
query3	10980	4504	4515	4504
query4	54816	24741	23605	23605
query5	5179	551	495	495
query6	367	219	204	204
query7	4923	534	311	311
query8	304	235	231	231
query9	5752	2778	2797	2778
query10	453	363	287	287
query11	15055	15404	14922	14922
query12	164	111	107	107
query13	1066	548	426	426
query14	10244	6371	6410	6371
query15	222	206	184	184
query16	7106	669	526	526
query17	1088	767	606	606
query18	1568	416	331	331
query19	223	215	182	182
query20	142	141	131	131
query21	213	129	109	109
query22	4424	4343	4299	4299
query23	34603	33605	33711	33605
query24	6542	2467	2510	2467
query25	475	476	413	413
query26	726	275	149	149
query27	2233	523	354	354
query28	2987	2293	2258	2258
query29	583	581	450	450
query30	287	224	192	192
query31	862	854	783	783
query32	74	65	64	64
query33	452	383	326	326
query34	814	889	567	567
query35	790	847	754	754
query36	999	1039	932	932
query37	120	98	79	79
query38	4280	4333	4224	4224
query39	1546	1481	1468	1468
query40	223	124	109	109
query41	66	59	66	59
query42	130	110	113	110
query43	530	526	504	504
query44	1385	887	897	887
query45	191	177	175	175
query46	883	1050	648	648
query47	1838	1860	1785	1785
query48	431	458	353	353
query49	671	521	389	389
query50	708	701	422	422
query51	4217	4284	4236	4236
query52	117	120	119	119
query53	235	260	193	193
query54	610	596	542	542
query55	91	91	93	91
query56	311	318	316	316
query57	1188	1183	1102	1102
query58	280	284	277	277
query59	2839	2972	2834	2834
query60	342	345	340	340
query61	130	126	138	126
query62	749	733	673	673
query63	234	204	199	199
query64	1775	1045	724	724
query65	4276	4154	4110	4110
query66	728	400	303	303
query67	16038	15656	15407	15407
query68	6961	934	546	546
query69	550	308	278	278
query70	1215	1151	1116	1116
query71	517	366	314	314
query72	6004	4862	4854	4854
query73	1411	706	379	379
query74	8913	8878	8820	8820
query75	3872	3198	2751	2751
query76	4264	1206	771	771
query77	628	383	295	295
query78	10178	10212	9382	9382
query79	3404	814	574	574
query80	658	514	460	460
query81	492	256	220	220
query82	440	127	99	99
query83	361	256	242	242
query84	355	116	94	94
query85	800	355	311	311
query86	372	319	280	280
query87	4379	4452	4353	4353
query88	3505	2327	2308	2308
query89	421	315	294	294
query90	1915	206	209	206
query91	146	152	117	117
query92	75	63	60	60
query93	1979	982	614	614
query94	678	411	316	316
query95	387	310	298	298
query96	522	578	288	288
query97	2706	2759	2615	2615
query98	236	208	207	207
query99	1439	1435	1290	1290
Total cold run time: 301276 ms
Total hot run time: 194476 ms

@doris-robot

ClickBench: Total hot run time: 29.22 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 797a633f654132000cc21fa75d6b2a426223cc13, data reload: false

query1	0.04	0.04	0.04
query2	0.12	0.11	0.12
query3	0.25	0.20	0.19
query4	1.60	0.20	0.11
query5	0.44	0.42	0.43
query6	1.17	0.67	0.67
query7	0.02	0.01	0.02
query8	0.04	0.04	0.04
query9	0.58	0.52	0.54
query10	0.60	0.59	0.57
query11	0.16	0.11	0.11
query12	0.16	0.12	0.12
query13	0.62	0.60	0.60
query14	0.81	0.80	0.83
query15	0.88	0.87	0.87
query16	0.40	0.40	0.38
query17	1.04	1.04	1.06
query18	0.23	0.22	0.22
query19	1.90	1.83	1.84
query20	0.02	0.01	0.01
query21	15.39	0.93	0.56
query22	0.77	1.16	0.69
query23	14.92	1.39	0.62
query24	7.00	2.02	0.77
query25	0.46	0.14	0.07
query26	0.61	0.19	0.16
query27	0.05	0.05	0.05
query28	9.53	0.93	0.47
query29	12.61	4.04	3.35
query30	0.26	0.10	0.08
query31	2.82	0.62	0.38
query32	3.22	0.56	0.48
query33	3.03	3.11	3.20
query34	15.81	5.13	4.45
query35	4.50	4.52	4.51
query36	0.65	0.51	0.49
query37	0.09	0.07	0.06
query38	0.05	0.04	0.03
query39	0.04	0.03	0.03
query40	0.17	0.13	0.13
query41	0.09	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 103.22 s
Total hot run time: 29.22 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 72.73% (8/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 56.05% (15023/26801)
Line Coverage 44.93% (134128/298553)
Region Coverage 44.04% (67482/153213)
Branch Coverage 38.60% (34567/89552)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (11/11) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.55% (20988/26384)
Line Coverage 72.64% (216835/298521)
Region Coverage 70.83% (127590/180127)
Branch Coverage 64.54% (66070/102372)

@github-actions github-actions bot added the approved label Jun 4, 2025
@github-actions
Contributor

github-actions bot commented Jun 4, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Jun 4, 2025

PR approved by anyone and no changes requested.

@morningman morningman merged commit d333900 into apache:master Jun 5, 2025
27 of 29 checks passed
kaka11chen added a commit to kaka11chen/doris that referenced this pull request Jun 30, 2025
Problem Summary:

[opt] (multi-catalog) Optimize remote scan concurrency.
1. Use `ScannerScheduler::get_remote_scan_thread_num()` instead of
`config::doris_scanner_thread_pool_thread_num` when calculating the
maximum number of scanners for external tables.
2. Remove the `parallel_scan_max_scanners_count` calculation logic.
morningman pushed a commit to kaka11chen/doris that referenced this pull request Jul 3, 2025
dataroaring pushed a commit that referenced this pull request Jul 8, 2025
Cherry-pick #51415 

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->

Labels

approved Indicates a PR has been approved by one committer. dev/3.0.7-merged dev/3.1.0-merged p0_r reviewed


7 participants