@gnehil
Contributor

@gnehil gnehil commented Jul 14, 2025

cherry-picked from #45937

@gnehil gnehil requested a review from morrySnow as a code owner July 14, 2025 09:11
@Thearas
Contributor

Thearas commented Jul 14, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (ideally with the specific error message), and how it was fixed.
  2. Which behaviors were modified: what the previous behavior was, what it is now, why it was changed, and what the possible impacts are.
  3. What features were added, and why.
  4. Which code was refactored, and why.
  5. Which functions were optimized, and how they differ before and after the optimization.

@morningman
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 40099 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 3a321cde94837ed6943e354fdbf1d9d72c5df491, data reload: false

------ Round 1 ----------------------------------
q1	17597	6826	6711	6711
q2	2081	194	166	166
q3	10654	1183	1203	1183
q4	10524	732	733	732
q5	7764	3112	2884	2884
q6	219	138	143	138
q7	995	623	606	606
q8	9363	1989	2060	1989
q9	6690	6383	6444	6383
q10	7016	2277	2284	2277
q11	459	267	266	266
q12	393	211	210	210
q13	17789	2987	2983	2983
q14	233	208	219	208
q15	511	480	467	467
q16	477	385	375	375
q17	998	584	526	526
q18	7402	6644	6677	6644
q19	1358	989	1064	989
q20	494	212	210	210
q21	4058	3161	3151	3151
q22	1140	1003	1001	1001
Total cold run time: 108215 ms
Total hot run time: 40099 ms
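
The two totals above appear to follow directly from the per-query rows: summing the first numeric column reproduces the cold total, and summing the last column (which equals the smaller of the two middle columns in every row) reproduces the hot total. The column meanings are inferred from the numbers, not stated by the bot, and `round_totals` is a hypothetical helper name. A minimal Python sketch under those assumptions:

```python
# Round 1 rows copied from the table above (all times in ms).
# Assumed columns: query, cold run, hot run A, hot run B, best hot run.
ROUND_1 = """\
q1 17597 6826 6711 6711
q2 2081 194 166 166
q3 10654 1183 1203 1183
q4 10524 732 733 732
q5 7764 3112 2884 2884
q6 219 138 143 138
q7 995 623 606 606
q8 9363 1989 2060 1989
q9 6690 6383 6444 6383
q10 7016 2277 2284 2277
q11 459 267 266 266
q12 393 211 210 210
q13 17789 2987 2983 2983
q14 233 208 219 208
q15 511 480 467 467
q16 477 385 375 375
q17 998 584 526 526
q18 7402 6644 6677 6644
q19 1358 989 1064 989
q20 494 212 210 210
q21 4058 3161 3151 3151
q22 1140 1003 1001 1001
"""


def round_totals(table: str) -> tuple[int, int]:
    """Return (cold_total_ms, hot_total_ms) for one benchmark round."""
    cold_total = 0
    hot_total = 0
    for line in table.strip().splitlines():
        name, cold, hot_a, hot_b, best = line.split()
        # Sanity check: the last column should be the faster hot run.
        assert int(best) == min(int(hot_a), int(hot_b)), name
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


cold_total, hot_total = round_totals(ROUND_1)
print(cold_total, hot_total)  # 108215 40099
```

The same helper should apply to the Round 2 rows and to the TPC-DS tables, since they use the same five-column layout.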

----- Round 2, with runtime_filter_mode=off -----
q1	6651	6655	6625	6625
q2	326	245	230	230
q3	2894	2903	2961	2903
q4	2098	1886	1841	1841
q5	5715	5734	5712	5712
q6	212	131	133	131
q7	2281	1804	1826	1804
q8	3358	3623	3576	3576
q9	8855	8972	8897	8897
q10	3554	3520	3514	3514
q11	610	477	486	477
q12	774	622	601	601
q13	8507	3173	3219	3173
q14	290	273	282	273
q15	526	473	469	469
q16	499	447	451	447
q17	1871	1640	1601	1601
q18	8269	7751	7847	7751
q19	1711	1537	1657	1537
q20	2080	1835	1877	1835
q21	5250	4986	5128	4986
q22	1143	1041	994	994
Total cold run time: 67474 ms
Total hot run time: 59377 ms

@doris-robot

TPC-DS: Total hot run time: 197754 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3a321cde94837ed6943e354fdbf1d9d72c5df491, data reload: false

query1	1278	933	906	906
query2	7073	2028	1967	1967
query3	10833	4395	4561	4395
query4	33447	23780	23383	23383
query5	4085	457	447	447
query6	266	184	177	177
query7	3985	311	320	311
query8	300	241	230	230
query9	9639	2632	2593	2593
query10	484	268	262	262
query11	17915	15123	15608	15123
query12	159	106	106	106
query13	1545	426	419	419
query14	10247	7586	7448	7448
query15	252	183	197	183
query16	8011	503	498	498
query17	1669	607	622	607
query18	2138	321	310	310
query19	372	171	172	171
query20	128	121	113	113
query21	218	111	115	111
query22	4634	4432	4333	4333
query23	34498	34182	34198	34182
query24	11640	2929	2966	2929
query25	682	428	435	428
query26	1730	177	172	172
query27	2825	367	356	356
query28	7557	2154	2185	2154
query29	1023	447	463	447
query30	260	174	169	169
query31	1039	811	835	811
query32	101	57	59	57
query33	779	308	299	299
query34	921	499	536	499
query35	948	746	725	725
query36	1110	920	966	920
query37	176	66	71	66
query38	4113	4011	3997	3997
query39	1547	1474	1472	1472
query40	291	107	112	107
query41	50	47	48	47
query42	122	105	102	102
query43	541	507	492	492
query44	1308	816	819	816
query45	188	171	168	168
query46	1166	727	719	719
query47	2033	1894	1960	1894
query48	437	350	353	350
query49	1072	396	391	391
query50	833	432	425	425
query51	7672	7302	7328	7302
query52	105	96	97	96
query53	260	192	189	189
query54	1262	473	479	473
query55	81	79	81	79
query56	275	251	253	251
query57	1343	1242	1198	1198
query58	248	208	216	208
query59	3292	3223	3135	3135
query60	289	259	257	257
query61	109	109	107	107
query62	851	716	696	696
query63	228	206	194	194
query64	5167	676	629	629
query65	3366	3331	3283	3283
query66	1415	302	297	297
query67	16407	15784	15417	15417
query68	5244	595	565	565
query69	448	268	259	259
query70	1136	1085	1078	1078
query71	318	290	257	257
query72	6165	4025	3955	3955
query73	755	358	347	347
query74	10291	9238	9291	9238
query75	3378	2663	2672	2663
query76	3032	1151	1067	1067
query77	404	279	283	279
query78	10467	9542	9601	9542
query79	1709	607	592	592
query80	1089	421	420	420
query81	542	231	222	222
query82	922	92	90	90
query83	231	149	146	146
query84	229	77	86	77
query85	1369	323	296	296
query86	437	263	301	263
query87	4341	4305	4241	4241
query88	3795	2391	2398	2391
query89	405	303	291	291
query90	2022	185	187	185
query91	147	107	107	107
query92	64	52	51	51
query93	2458	553	558	553
query94	932	295	297	295
query95	363	253	256	253
query96	626	281	280	280
query97	3294	3171	3168	3168
query98	224	209	196	196
query99	1491	1289	1333	1289
Total cold run time: 308366 ms
Total hot run time: 197754 ms

@doris-robot

ClickBench: Total hot run time: 29.65 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 3a321cde94837ed6943e354fdbf1d9d72c5df491, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.03
query3	0.23	0.07	0.07
query4	1.62	0.11	0.10
query5	0.51	0.53	0.52
query6	1.13	0.73	0.74
query7	0.03	0.02	0.01
query8	0.04	0.02	0.03
query9	0.55	0.50	0.52
query10	0.54	0.55	0.55
query11	0.14	0.11	0.10
query12	0.14	0.12	0.10
query13	0.61	0.60	0.60
query14	0.79	0.78	0.80
query15	0.84	0.84	0.83
query16	0.40	0.38	0.40
query17	1.02	1.03	1.05
query18	0.23	0.22	0.22
query19	1.90	1.77	1.76
query20	0.01	0.01	0.01
query21	15.39	0.58	0.57
query22	2.36	3.00	1.83
query23	17.02	1.04	0.77
query24	3.12	0.58	1.30
query25	0.19	0.08	0.26
query26	0.39	0.14	0.13
query27	0.05	0.05	0.05
query28	10.55	0.52	0.48
query29	12.57	3.26	3.25
query30	0.24	0.06	0.06
query31	2.87	0.38	0.40
query32	3.23	0.45	0.45
query33	3.02	3.00	2.95
query34	17.00	4.55	4.45
query35	4.47	4.51	4.48
query36	0.67	0.48	0.49
query37	0.08	0.07	0.06
query38	0.04	0.03	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.12
query41	0.08	0.03	0.02
query42	0.03	0.03	0.02
query43	0.04	0.03	0.02
Total cold run time: 104.44 s
Total hot run time: 29.65 s
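
The ClickBench rows carry only three timings per query and no precomputed best column. The reported totals are consistent with reading them as one cold run followed by two hot runs, where the hot total sums the faster hot run of each query. That reading is inferred from the numbers, and `clickbench_totals` is a hypothetical helper name. A small Python sketch, using `Decimal` so the two-decimal values sum exactly:

```python
from decimal import Decimal

# Rows copied from the table above (seconds).
# Assumed columns: query, cold run, hot run A, hot run B.
CLICKBENCH = """\
query1 0.03 0.03 0.03
query2 0.07 0.03 0.03
query3 0.23 0.07 0.07
query4 1.62 0.11 0.10
query5 0.51 0.53 0.52
query6 1.13 0.73 0.74
query7 0.03 0.02 0.01
query8 0.04 0.02 0.03
query9 0.55 0.50 0.52
query10 0.54 0.55 0.55
query11 0.14 0.11 0.10
query12 0.14 0.12 0.10
query13 0.61 0.60 0.60
query14 0.79 0.78 0.80
query15 0.84 0.84 0.83
query16 0.40 0.38 0.40
query17 1.02 1.03 1.05
query18 0.23 0.22 0.22
query19 1.90 1.77 1.76
query20 0.01 0.01 0.01
query21 15.39 0.58 0.57
query22 2.36 3.00 1.83
query23 17.02 1.04 0.77
query24 3.12 0.58 1.30
query25 0.19 0.08 0.26
query26 0.39 0.14 0.13
query27 0.05 0.05 0.05
query28 10.55 0.52 0.48
query29 12.57 3.26 3.25
query30 0.24 0.06 0.06
query31 2.87 0.38 0.40
query32 3.23 0.45 0.45
query33 3.02 3.00 2.95
query34 17.00 4.55 4.45
query35 4.47 4.51 4.48
query36 0.67 0.48 0.49
query37 0.08 0.07 0.06
query38 0.04 0.03 0.03
query39 0.03 0.02 0.02
query40 0.17 0.13 0.12
query41 0.08 0.03 0.02
query42 0.03 0.03 0.02
query43 0.04 0.03 0.02
"""


def clickbench_totals(table: str) -> tuple[Decimal, Decimal]:
    """Return (cold_total_s, hot_total_s), taking the faster hot run per query."""
    cold_total = Decimal("0")
    hot_total = Decimal("0")
    for line in table.strip().splitlines():
        _name, cold, hot_a, hot_b = line.split()
        cold_total += Decimal(cold)
        hot_total += min(Decimal(hot_a), Decimal(hot_b))
    return cold_total, hot_total


cb_cold, cb_hot = clickbench_totals(CLICKBENCH)
print(cb_cold, cb_hot)  # 104.44 29.65
```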

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/36) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.23% (12511/27662)
Line Coverage 36.12% (111082/307572)
Region Coverage 35.25% (57490/163081)
Branch Coverage 32.35% (31211/96472)
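
Each percentage in the report is the covered count divided by the total count, shown to two decimal places. The sketch below recomputes them from the counts in parentheses; `coverage_pct` is a hypothetical helper, not part of the coverage tooling.

```python
def coverage_pct(covered: int, total: int) -> str:
    """Format covered/total as a percentage with two decimals."""
    if total == 0:
        return "n/a"  # avoid dividing by zero when nothing is measurable
    return f"{100 * covered / total:.2f}%"


# Counts copied from the report above.
report = {
    "Increment line": (0, 36),
    "Function": (12511, 27662),
    "Line": (111082, 307572),
    "Region": (57490, 163081),
    "Branch": (31211, 96472),
}

for category, (covered, total) in report.items():
    print(f"{category} Coverage {coverage_pct(covered, total)} ({covered}/{total})")
```

The first entry makes the headline figure concrete: none of the 36 changed lines were exercised by the BE unit tests.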

gnehil and others added 2 commits July 16, 2025 14:45
### What problem does this PR solve?

Problem Summary:

Ingestion Load is used to load pre-processed data into Doris.

Preprocessing means that an external system processes the data according
to the partitioning, bucketing, and aggregation methods defined by the
Doris table, then writes the result to an external storage system.

Once preprocessing is complete, the BE reads the result data, converts
it into segment files, and saves them.

The basic flow is as follows:

![ingestion_load](https://github.com/apache/doris/assets/30104232/aa468cd4-90bf-4d9d-b69b-0425b66b15f4)
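
To make the preprocessing step concrete, here is a toy Python sketch of what an external system might do before writing its output. It is not Doris code and does not reflect the actual implementation: the row layout `(day, user_id, clicks)`, the monthly partitioning, the bucket count, the CRC32 stand-in hash, and the SUM aggregation are all assumptions chosen for illustration.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count for the target table


def preprocess(rows):
    """Group rows by (partition, bucket) and pre-aggregate clicks per key.

    Returns {(partition, bucket): {(day, user_id): summed_clicks}}.
    """
    groups = defaultdict(lambda: defaultdict(int))
    for day, user_id, clicks in rows:
        partition = day[:7]  # assumed monthly partition, e.g. "2025-07"
        # Stand-in hash only; the real bucketing function may differ.
        bucket = zlib.crc32(user_id.encode()) % NUM_BUCKETS
        groups[(partition, bucket)][(day, user_id)] += clicks  # assumed SUM column
    return groups


SAMPLE_ROWS = [
    ("2025-07-01", "u1", 3),
    ("2025-07-01", "u1", 4),  # same key as the row above, so it is merged
    ("2025-07-02", "u2", 5),
    ("2025-08-01", "u1", 1),
]

groups = preprocess(SAMPLE_ROWS)
for (partition, bucket), keyed in sorted(groups.items()):
    # In the described flow, each group would be written to external storage
    # as one file that the BE later reads and converts into segment files.
    print(partition, bucket, dict(keyed))
```

The point of the sketch is the division of labor: grouping and aggregation happen outside the cluster, so the BE only has to read already-organized data and write segments.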

### Release note
[feature](load) new ingestion load

(cherry picked from commit 6580f6b)
@gnehil gnehil force-pushed the ingestion-load-3.1 branch from 3a321cd to 4123798 Compare July 16, 2025 06:49
@morrySnow morrySnow changed the title from "[feature](load)(cherry-pick) new insgestion load" to "branch-3.1: [feature](load) new insgestion load #45937" Jul 16, 2025
@morrySnow
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 39958 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 4123798e29eda56903f0bfdead3ede490f693910, data reload: false

------ Round 1 ----------------------------------
q1	17628	6909	6613	6613
q2	2091	191	168	168
q3	10644	1159	1186	1159
q4	10410	782	785	782
q5	7732	2896	2895	2895
q6	218	137	136	136
q7	1020	618	606	606
q8	9359	1996	2028	1996
q9	6670	6409	6447	6409
q10	7002	2293	2278	2278
q11	474	266	263	263
q12	414	221	211	211
q13	17789	2994	3022	2994
q14	238	206	207	206
q15	507	466	463	463
q16	479	383	370	370
q17	991	588	594	588
q18	7508	6678	6690	6678
q19	1325	1039	921	921
q20	485	203	194	194
q21	3863	3178	3045	3045
q22	1091	983	1013	983
Total cold run time: 107938 ms
Total hot run time: 39958 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6647	6583	6598	6583
q2	320	241	235	235
q3	2932	2924	2957	2924
q4	2072	1832	1775	1775
q5	5726	5779	5758	5758
q6	201	126	134	126
q7	2193	1818	1816	1816
q8	3396	3499	3558	3499
q9	8738	8919	8910	8910
q10	3558	3491	3519	3491
q11	590	488	487	487
q12	801	580	597	580
q13	10063	3191	3175	3175
q14	307	266	266	266
q15	503	472	459	459
q16	514	444	448	444
q17	1846	1638	1576	1576
q18	8171	7801	7720	7720
q19	1703	1577	1601	1577
q20	2051	1796	1877	1796
q21	5166	4948	5003	4948
q22	1099	1022	1034	1022
Total cold run time: 68597 ms
Total hot run time: 59167 ms

@doris-robot

TPC-DS: Total hot run time: 196581 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4123798e29eda56903f0bfdead3ede490f693910, data reload: false

query1	1321	900	911	900
query2	6277	1921	1863	1863
query3	10778	4342	4291	4291
query4	33462	23745	23447	23447
query5	4173	477	456	456
query6	290	187	190	187
query7	3998	319	316	316
query8	289	233	232	232
query9	9641	2591	2581	2581
query10	492	267	273	267
query11	18033	15125	15224	15125
query12	163	101	102	101
query13	1549	421	418	418
query14	8542	7301	6723	6723
query15	233	186	196	186
query16	7975	535	441	441
query17	1652	615	609	609
query18	2085	312	327	312
query19	222	170	171	170
query20	130	120	118	118
query21	208	111	108	108
query22	4681	4531	4390	4390
query23	35256	34252	34227	34227
query24	11952	2971	2986	2971
query25	702	443	434	434
query26	1774	177	177	177
query27	2845	354	348	348
query28	7888	2180	2152	2152
query29	1071	469	461	461
query30	266	161	162	161
query31	1103	845	834	834
query32	99	58	58	58
query33	780	349	327	327
query34	936	525	521	521
query35	858	741	762	741
query36	1113	947	944	944
query37	266	71	77	71
query38	4080	3985	3908	3908
query39	1521	1446	1463	1446
query40	266	104	101	101
query41	48	47	48	47
query42	119	105	101	101
query43	518	481	493	481
query44	1351	828	840	828
query45	192	167	170	167
query46	1154	727	740	727
query47	2006	1890	1901	1890
query48	456	335	356	335
query49	1038	410	385	385
query50	831	434	433	433
query51	7558	7206	7266	7206
query52	104	93	93	93
query53	268	188	188	188
query54	1270	490	478	478
query55	83	79	78	78
query56	262	247	260	247
query57	1324	1240	1234	1234
query58	255	216	212	212
query59	3117	2978	3052	2978
query60	289	276	288	276
query61	122	111	112	111
query62	867	686	704	686
query63	228	203	193	193
query64	5008	703	664	664
query65	3364	3291	3289	3289
query66	1292	291	327	291
query67	16033	15484	15416	15416
query68	4841	586	585	585
query69	442	269	263	263
query70	1209	1103	1126	1103
query71	338	283	255	255
query72	6172	4033	4061	4033
query73	746	347	356	347
query74	10017	8937	9126	8937
query75	3381	2650	2683	2650
query76	2605	1102	1057	1057
query77	385	272	266	266
query78	10560	9617	9575	9575
query79	1392	605	621	605
query80	1004	450	430	430
query81	550	224	213	213
query82	1044	93	98	93
query83	240	147	140	140
query84	236	76	78	76
query85	1324	309	302	302
query86	381	288	306	288
query87	4455	4167	4228	4167
query88	3655	2412	2397	2397
query89	412	285	309	285
query90	1967	190	192	190
query91	147	108	112	108
query92	65	52	53	52
query93	1374	547	555	547
query94	910	294	294	294
query95	370	260	263	260
query96	598	283	277	277
query97	3319	3132	3126	3126
query98	217	204	209	204
query99	1482	1315	1299	1299
Total cold run time: 303965 ms
Total hot run time: 196581 ms

@doris-robot

ClickBench: Total hot run time: 29.71 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4123798e29eda56903f0bfdead3ede490f693910, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.03
query3	0.23	0.07	0.06
query4	1.62	0.11	0.11
query5	0.50	0.52	0.50
query6	1.13	0.73	0.73
query7	0.02	0.02	0.02
query8	0.03	0.03	0.03
query9	0.57	0.49	0.50
query10	0.55	0.55	0.55
query11	0.14	0.10	0.11
query12	0.14	0.12	0.12
query13	0.61	0.60	0.59
query14	0.79	0.79	0.79
query15	0.83	0.82	0.83
query16	0.38	0.38	0.40
query17	1.08	1.04	1.06
query18	0.23	0.21	0.22
query19	1.96	1.84	1.86
query20	0.01	0.01	0.02
query21	15.39	0.60	0.59
query22	2.35	2.02	2.24
query23	16.91	0.92	0.77
query24	2.95	0.40	1.67
query25	0.17	0.11	0.05
query26	0.54	0.14	0.13
query27	0.05	0.06	0.05
query28	10.72	0.50	0.45
query29	12.61	3.18	3.19
query30	0.24	0.06	0.06
query31	2.87	0.40	0.39
query32	3.22	0.46	0.46
query33	2.97	3.00	3.02
query34	17.12	4.44	4.47
query35	4.51	4.57	4.48
query36	0.69	0.50	0.49
query37	0.09	0.06	0.06
query38	0.04	0.03	0.04
query39	0.03	0.03	0.02
query40	0.16	0.13	0.12
query41	0.08	0.02	0.03
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.71 s
Total hot run time: 29.71 s

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/36) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 45.24% (12539/27718)
Line Coverage 36.11% (111410/308530)
Region Coverage 35.23% (57612/163512)
Branch Coverage 32.37% (31310/96712)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 0.00% (0/34) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.56% (14668/27385)
Line Coverage 44.18% (136214/308292)
Region Coverage 41.74% (79093/189504)
Branch Coverage 36.70% (40031/109078)

@morningman
Contributor

use this: #53582

@morningman morningman closed this Jul 20, 2025

6 participants