@CalvinKirs
Member

CalvinKirs commented Jun 24, 2025

Backport of #50552

…ileformat (apache#50552)

Issue Number: apache#50238

Problem Summary:

An earlier change (apache#50225) refactored the code for the fileFormat
attribute, but it only added the new code and left the business code
unchanged. This pull request updates the file-format-related code in the
`RoutineLoad` feature accordingly.

(cherry picked from commit b3abfab)
CalvinKirs requested a review from morrySnow as a code owner June 24, 2025 03:17
@Thearas
Contributor

Thearas commented Jun 24, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@CalvinKirs
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39757 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4fe3a38b3f0b6d092e596beb1034869582b7ca28, data reload: false

------ Round 1 ----------------------------------
q1	17568	6977	6621	6621
q2	2072	173	166	166
q3	10573	1095	1166	1095
q4	10566	807	705	705
q5	7735	2868	2855	2855
q6	217	135	139	135
q7	990	622	598	598
q8	9340	1957	2067	1957
q9	6687	6380	6411	6380
q10	7043	2292	2291	2291
q11	474	257	266	257
q12	403	209	207	207
q13	17793	2977	2989	2977
q14	232	207	213	207
q15	519	461	469	461
q16	483	380	375	375
q17	977	549	546	546
q18	7449	6671	6784	6671
q19	1332	976	947	947
q20	475	213	201	201
q21	3908	3150	3247	3150
q22	1090	955	955	955
Total cold run time: 107926 ms
Total hot run time: 39757 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6553	6570	6510	6510
q2	326	233	228	228
q3	2918	2692	2835	2692
q4	2038	1800	1799	1799
q5	5738	5764	5718	5718
q6	217	129	133	129
q7	2193	1799	1772	1772
q8	3377	3612	3526	3526
q9	8886	8789	8953	8789
q10	3582	3560	3503	3503
q11	595	504	487	487
q12	827	645	623	623
q13	8592	3196	3126	3126
q14	314	266	262	262
q15	511	463	465	463
q16	487	431	445	431
q17	1878	1626	1605	1605
q18	8364	7691	7828	7691
q19	1731	1704	1632	1632
q20	2014	1834	1803	1803
q21	5163	4999	5064	4999
q22	1132	1044	1034	1034
Total cold run time: 67436 ms
Total hot run time: 58822 ms
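
The report does not label its columns. A minimal sketch, assuming each row holds the query name, one cold run, two hot runs, and the faster of the two hot runs (all in ms), reproduces both Round 1 totals from the rows above. The `summarize` helper and the `ROUND1` name are illustrative only and are not part of the Doris benchmark scripts.

```python
# Round 1 rows copied from the TPC-H report above.
# Assumed column order: query, cold run, hot run 1, hot run 2, best hot run (ms).
ROUND1 = """
q1 17568 6977 6621 6621
q2 2072 173 166 166
q3 10573 1095 1166 1095
q4 10566 807 705 705
q5 7735 2868 2855 2855
q6 217 135 139 135
q7 990 622 598 598
q8 9340 1957 2067 1957
q9 6687 6380 6411 6380
q10 7043 2292 2291 2291
q11 474 257 266 257
q12 403 209 207 207
q13 17793 2977 2989 2977
q14 232 207 213 207
q15 519 461 469 461
q16 483 380 375 375
q17 977 549 546 546
q18 7449 6671 6784 6671
q19 1332 976 947 947
q20 475 213 201 201
q21 3908 3150 3247 3150
q22 1090 955 955 955
"""


def summarize(table):
    """Return (cold_total, hot_total) in ms for a whitespace-separated table."""
    cold_total = 0
    hot_total = 0
    for line in table.strip().splitlines():
        name, *timings = line.split()
        cold, hot1, hot2, best = (int(value) for value in timings)
        # Check the assumed meaning of the last column: the faster hot run.
        assert best == min(hot1, hot2), name
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


cold_total, hot_total = summarize(ROUND1)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under that assumption the sums match the reported Round 1 figures (107926 ms cold, 39757 ms hot), which suggests the headline "Total hot run time" is the sum of each query's best hot run.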

@doris-robot

TPC-DS: Total hot run time: 196152 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4fe3a38b3f0b6d092e596beb1034869582b7ca28, data reload: false

query1	1284	907	887	887
query2	6232	1853	1851	1851
query3	10858	4371	4209	4209
query4	61664	29662	23332	23332
query5	5179	456	461	456
query6	399	174	174	174
query7	5478	307	315	307
query8	325	235	227	227
query9	8726	2586	2583	2583
query10	485	270	261	261
query11	17613	15176	15669	15176
query12	163	105	104	104
query13	1444	452	437	437
query14	10089	7865	6904	6904
query15	204	182	179	179
query16	7100	502	491	491
query17	1137	624	600	600
query18	1918	326	334	326
query19	211	171	163	163
query20	119	109	112	109
query21	205	103	105	103
query22	4716	4555	4508	4508
query23	34629	34233	33942	33942
query24	6188	3047	2937	2937
query25	534	435	441	435
query26	656	172	178	172
query27	1803	373	360	360
query28	4135	2190	2151	2151
query29	716	498	425	425
query30	237	158	152	152
query31	945	832	828	828
query32	66	53	64	53
query33	396	297	299	297
query34	936	532	528	528
query35	837	741	753	741
query36	1069	941	939	939
query37	108	68	70	68
query38	4086	3999	3995	3995
query39	1511	1466	1464	1464
query40	206	97	101	97
query41	54	47	47	47
query42	115	106	106	106
query43	512	475	483	475
query44	1235	852	829	829
query45	194	174	178	174
query46	1173	724	732	724
query47	2006	1915	1942	1915
query48	438	337	355	337
query49	736	398	396	396
query50	839	429	432	429
query51	7364	7238	7225	7225
query52	107	89	94	89
query53	250	184	192	184
query54	582	468	469	468
query55	79	77	85	77
query56	266	270	243	243
query57	1325	1192	1217	1192
query58	234	213	215	213
query59	3114	3080	2910	2910
query60	287	263	260	260
query61	110	105	107	105
query62	807	714	708	708
query63	232	199	189	189
query64	1389	661	647	647
query65	3249	3198	3198	3198
query66	723	292	294	292
query67	15865	15456	15347	15347
query68	4166	581	606	581
query69	428	262	259	259
query70	1188	1086	1078	1078
query71	350	265	256	256
query72	6326	4080	4092	4080
query73	737	359	355	355
query74	10251	9050	8932	8932
query75	3355	2682	2660	2660
query76	1950	1100	1117	1100
query77	515	263	273	263
query78	10682	9523	9499	9499
query79	2158	591	615	591
query80	1374	429	428	428
query81	524	218	218	218
query82	1258	89	92	89
query83	270	145	143	143
query84	283	76	83	76
query85	1040	310	295	295
query86	397	288	271	271
query87	4421	4266	4333	4266
query88	3941	2378	2339	2339
query89	429	290	296	290
query90	1966	186	188	186
query91	134	108	106	106
query92	63	51	50	50
query93	2875	568	556	556
query94	776	285	284	284
query95	359	258	256	256
query96	628	290	282	282
query97	3349	3134	3143	3134
query98	208	205	200	200
query99	1611	1316	1279	1279
Total cold run time: 315904 ms
Total hot run time: 196152 ms

@doris-robot

ClickBench: Total hot run time: 29.64 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4fe3a38b3f0b6d092e596beb1034869582b7ca28, data reload: false

query1	0.04	0.03	0.02
query2	0.07	0.03	0.03
query3	0.24	0.07	0.07
query4	1.63	0.10	0.10
query5	0.53	0.51	0.51
query6	1.13	0.72	0.74
query7	0.02	0.02	0.04
query8	0.04	0.03	0.03
query9	0.57	0.50	0.50
query10	0.55	0.56	0.55
query11	0.15	0.10	0.11
query12	0.14	0.11	0.11
query13	0.60	0.60	0.59
query14	0.78	0.79	0.81
query15	0.85	0.82	0.82
query16	0.39	0.39	0.40
query17	0.99	1.07	1.05
query18	0.22	0.22	0.21
query19	1.96	1.78	1.91
query20	0.01	0.01	0.01
query21	15.41	0.57	0.57
query22	2.75	1.79	1.76
query23	17.07	0.95	0.74
query24	3.40	1.00	0.67
query25	0.32	0.15	0.08
query26	0.26	0.13	0.14
query27	0.06	0.03	0.04
query28	10.45	0.54	0.49
query29	12.64	3.18	3.19
query30	0.25	0.06	0.06
query31	2.86	0.38	0.38
query32	3.25	0.46	0.46
query33	2.97	2.99	3.04
query34	17.04	4.46	4.44
query35	4.51	4.51	4.49
query36	0.71	0.50	0.47
query37	0.08	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.03	0.02
query40	0.16	0.12	0.12
query41	0.08	0.03	0.03
query42	0.04	0.02	0.02
query43	0.03	0.02	0.02
Total cold run time: 105.32 s
Total hot run time: 29.64 s

@CalvinKirs
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39622 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 91014c8da6b4b7541f5a5e8fc4558c3184a7b0bb, data reload: false

------ Round 1 ----------------------------------
q1	17595	6879	6771	6771
q2	2081	170	159	159
q3	10974	1053	1201	1053
q4	10538	735	656	656
q5	7776	2853	2818	2818
q6	215	139	141	139
q7	978	628	606	606
q8	9356	1959	2013	1959
q9	6607	6391	6374	6374
q10	7011	2287	2286	2286
q11	457	264	252	252
q12	391	214	205	205
q13	17779	2959	2958	2958
q14	237	208	203	203
q15	496	459	463	459
q16	441	372	379	372
q17	981	562	661	562
q18	7457	6556	6677	6556
q19	1321	955	1042	955
q20	475	205	200	200
q21	4161	3108	3097	3097
q22	1086	983	982	982
Total cold run time: 108413 ms
Total hot run time: 39622 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6589	6573	6577	6573
q2	331	233	237	233
q3	2863	2759	2881	2759
q4	2044	1771	1840	1771
q5	5714	5755	5717	5717
q6	213	136	130	130
q7	2154	1779	1825	1779
q8	3296	3601	3515	3515
q9	8944	8804	8936	8804
q10	3556	3505	3549	3505
q11	588	490	513	490
q12	805	566	616	566
q13	7768	3115	3184	3115
q14	297	262	264	262
q15	511	455	471	455
q16	488	449	428	428
q17	1833	1619	1589	1589
q18	8242	7839	7714	7714
q19	1694	1433	1526	1433
q20	2146	1888	1798	1798
q21	5052	4988	5027	4988
q22	1149	1053	1065	1053
Total cold run time: 66277 ms
Total hot run time: 58677 ms

@doris-robot

TPC-DS: Total hot run time: 196644 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 91014c8da6b4b7541f5a5e8fc4558c3184a7b0bb, data reload: false

query1	1306	935	873	873
query2	6366	1899	1845	1845
query3	10817	4337	4293	4293
query4	61379	29194	23820	23820
query5	5205	456	435	435
query6	408	174	184	174
query7	5469	313	307	307
query8	319	228	235	228
query9	8715	2571	2571	2571
query10	464	282	248	248
query11	18016	15246	15798	15246
query12	168	106	109	106
query13	1468	448	448	448
query14	10643	7130	7567	7130
query15	198	182	182	182
query16	7189	495	532	495
query17	1159	586	592	586
query18	1850	331	315	315
query19	207	166	155	155
query20	118	118	111	111
query21	207	109	104	104
query22	4621	4476	4631	4476
query23	34450	34063	34178	34063
query24	6147	2910	2926	2910
query25	581	456	430	430
query26	714	175	170	170
query27	1932	371	376	371
query28	3821	2222	2168	2168
query29	710	458	457	457
query30	237	167	156	156
query31	979	796	834	796
query32	71	59	55	55
query33	407	310	298	298
query34	890	507	518	507
query35	828	711	723	711
query36	1087	969	973	969
query37	107	65	69	65
query38	4041	3975	4084	3975
query39	1518	1535	1479	1479
query40	201	102	113	102
query41	53	48	46	46
query42	119	104	102	102
query43	516	480	458	458
query44	1201	826	841	826
query45	190	173	165	165
query46	1159	745	734	734
query47	2018	1880	1948	1880
query48	438	335	337	335
query49	745	398	392	392
query50	845	437	424	424
query51	7434	7178	7165	7165
query52	106	90	97	90
query53	258	194	186	186
query54	582	478	463	463
query55	79	77	79	77
query56	273	261	256	256
query57	1320	1209	1214	1209
query58	230	212	234	212
query59	3174	2961	3011	2961
query60	322	267	267	267
query61	154	113	115	113
query62	755	656	683	656
query63	207	188	187	187
query64	2212	650	626	626
query65	3252	3203	3238	3203
query66	712	301	300	300
query67	15911	15598	15397	15397
query68	4151	592	614	592
query69	433	261	270	261
query70	1135	1082	1061	1061
query71	339	262	249	249
query72	6367	4009	3714	3714
query73	744	348	357	348
query74	10331	8980	9164	8980
query75	3360	2667	2701	2667
query76	1885	1191	1030	1030
query77	489	277	276	276
query78	10573	9551	9597	9551
query79	1727	591	595	591
query80	1425	419	424	419
query81	515	223	222	222
query82	1257	93	87	87
query83	280	141	144	141
query84	279	84	80	80
query85	1016	319	302	302
query86	401	300	297	297
query87	4395	4288	4233	4233
query88	3792	2380	2352	2352
query89	418	286	294	286
query90	1968	185	185	185
query91	138	109	112	109
query92	62	52	51	51
query93	2235	564	562	562
query94	772	294	296	294
query95	356	254	254	254
query96	612	280	288	280
query97	3313	3126	3151	3126
query98	209	201	205	201
query99	1605	1283	1294	1283
Total cold run time: 315860 ms
Total hot run time: 196644 ms

@doris-robot

ClickBench: Total hot run time: 30.02 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 91014c8da6b4b7541f5a5e8fc4558c3184a7b0bb, data reload: false

query1	0.03	0.03	0.03
query2	0.06	0.03	0.03
query3	0.23	0.07	0.06
query4	1.63	0.11	0.11
query5	0.52	0.50	0.49
query6	1.14	0.74	0.73
query7	0.02	0.04	0.01
query8	0.04	0.03	0.04
query9	0.55	0.50	0.49
query10	0.55	0.56	0.55
query11	0.13	0.10	0.10
query12	0.14	0.10	0.10
query13	0.61	0.60	0.59
query14	0.76	0.80	0.79
query15	0.84	0.82	0.82
query16	0.40	0.38	0.39
query17	1.07	1.02	1.00
query18	0.23	0.21	0.21
query19	1.91	1.89	1.74
query20	0.01	0.01	0.01
query21	15.41	0.57	0.59
query22	2.59	2.46	1.89
query23	17.15	0.93	0.94
query24	3.02	0.97	0.86
query25	0.32	0.09	0.06
query26	0.41	0.14	0.13
query27	0.04	0.04	0.04
query28	10.72	0.47	0.45
query29	12.60	3.20	3.17
query30	0.25	0.06	0.06
query31	2.86	0.38	0.38
query32	3.26	0.45	0.46
query33	2.97	3.02	3.04
query34	17.07	4.47	4.46
query35	4.54	4.50	4.50
query36	0.66	0.48	0.47
query37	0.08	0.06	0.06
query38	0.04	0.03	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.13
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.19 s
Total hot run time: 30.02 s

CalvinKirs force-pushed the branch-3.1-routineload-50552 branch from 91014c8 to 89e1244 June 24, 2025 07:11
@CalvinKirs
Member Author

run buildall

@CalvinKirs
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39596 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit aa807514d6b71cf08d1e8100c997f7ee28bc31ab, data reload: false

------ Round 1 ----------------------------------
q1	17594	6712	6628	6628
q2	2064	166	157	157
q3	10571	1146	1176	1146
q4	10209	719	741	719
q5	7726	2805	2713	2713
q6	217	131	135	131
q7	986	632	618	618
q8	9354	1943	1997	1943
q9	6644	6416	6367	6367
q10	7013	2298	2247	2247
q11	460	264	264	264
q12	404	208	207	207
q13	17769	2987	2974	2974
q14	228	204	208	204
q15	522	463	469	463
q16	470	383	367	367
q17	978	568	533	533
q18	7443	6672	6675	6672
q19	1350	979	956	956
q20	471	220	199	199
q21	3971	3262	3118	3118
q22	1076	970	1011	970
Total cold run time: 107520 ms
Total hot run time: 39596 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6642	6532	7376	6532
q2	324	232	232	232
q3	2947	2726	2872	2726
q4	2068	1808	1743	1743
q5	5692	5737	5767	5737
q6	201	129	124	124
q7	2149	1784	1819	1784
q8	3341	3500	3495	3495
q9	8902	8814	8847	8814
q10	3566	3505	3613	3505
q11	592	490	497	490
q12	815	622	589	589
q13	9648	3142	3129	3129
q14	326	269	264	264
q15	525	461	469	461
q16	487	436	431	431
q17	1831	1626	1598	1598
q18	8173	7611	7664	7611
q19	1686	1616	1532	1532
q20	2081	1885	1878	1878
q21	5179	4980	4998	4980
q22	1083	1011	984	984
Total cold run time: 68258 ms
Total hot run time: 58639 ms

@doris-robot

TPC-DS: Total hot run time: 189370 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit aa807514d6b71cf08d1e8100c997f7ee28bc31ab, data reload: false

query1	975	369	396	369
query2	6517	1945	1902	1902
query3	6699	211	223	211
query4	33910	23381	23477	23381
query5	4386	460	450	450
query6	291	176	193	176
query7	4634	311	305	305
query8	286	244	230	230
query9	9462	2592	2575	2575
query10	486	282	259	259
query11	18144	15226	15108	15108
query12	146	103	102	102
query13	1649	426	409	409
query14	8942	7361	6589	6589
query15	235	170	173	170
query16	8087	476	460	460
query17	1616	566	567	566
query18	2129	300	308	300
query19	261	165	149	149
query20	115	106	107	106
query21	206	105	106	105
query22	4319	4082	4093	4082
query23	34458	34017	33444	33444
query24	11652	2902	2826	2826
query25	717	417	421	417
query26	1814	168	170	168
query27	3034	346	342	342
query28	8043	2113	2098	2098
query29	1042	438	438	438
query30	326	167	163	163
query31	1036	819	814	814
query32	101	60	63	60
query33	779	302	314	302
query34	950	485	504	485
query35	834	714	714	714
query36	1080	883	923	883
query37	149	72	81	72
query38	3969	3752	3843	3752
query39	1489	1403	1414	1403
query40	297	102	102	102
query41	53	51	53	51
query42	118	103	104	103
query43	528	474	484	474
query44	1328	804	813	804
query45	185	172	175	172
query46	1140	702	718	702
query47	1886	1831	1808	1808
query48	420	336	345	336
query49	1284	405	412	405
query50	812	427	408	408
query51	7353	7038	7195	7038
query52	107	90	93	90
query53	269	189	184	184
query54	1267	462	494	462
query55	76	79	76	76
query56	263	258	246	246
query57	1270	1147	1192	1147
query58	246	210	205	205
query59	3081	2897	2907	2897
query60	277	268	254	254
query61	124	126	117	117
query62	874	675	674	674
query63	218	181	189	181
query64	5255	647	663	647
query65	3281	3166	3226	3166
query66	1433	308	319	308
query67	15943	15526	15518	15518
query68	4711	570	585	570
query69	447	270	257	257
query70	1157	1137	1100	1100
query71	338	264	251	251
query72	6375	4035	3940	3940
query73	743	343	361	343
query74	10588	9228	9149	9149
query75	3355	2640	2605	2605
query76	2929	1076	1018	1018
query77	367	271	273	271
query78	10428	9641	9625	9625
query79	1595	612	603	603
query80	1183	429	414	414
query81	539	226	225	225
query82	932	87	89	87
query83	225	154	160	154
query84	235	85	86	85
query85	1341	311	301	301
query86	410	297	283	283
query87	4339	4180	4195	4180
query88	3588	2409	2343	2343
query89	412	290	290	290
query90	2029	186	186	186
query91	140	107	109	107
query92	66	51	51	51
query93	1310	564	556	556
query94	948	302	287	287
query95	357	269	261	261
query96	616	281	286	281
query97	3299	3154	3106	3106
query98	217	202	189	189
query99	1565	1322	1292	1292
Total cold run time: 301140 ms
Total hot run time: 189370 ms

@doris-robot

ClickBench: Total hot run time: 29.49 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit aa807514d6b71cf08d1e8100c997f7ee28bc31ab, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.02	0.03
query3	0.23	0.07	0.07
query4	1.63	0.11	0.10
query5	0.53	0.51	0.53
query6	1.14	0.73	0.72
query7	0.03	0.02	0.02
query8	0.06	0.03	0.03
query9	0.56	0.49	0.49
query10	0.56	0.55	0.55
query11	0.13	0.10	0.10
query12	0.13	0.11	0.12
query13	0.62	0.59	0.60
query14	0.77	0.79	0.80
query15	0.86	0.81	0.82
query16	0.37	0.41	0.38
query17	1.04	0.99	1.06
query18	0.24	0.21	0.22
query19	1.99	1.79	1.91
query20	0.01	0.01	0.02
query21	15.41	0.59	0.60
query22	2.42	1.72	1.98
query23	17.03	0.91	0.88
query24	3.32	1.38	0.39
query25	0.13	0.12	0.12
query26	0.45	0.13	0.14
query27	0.06	0.05	0.06
query28	10.75	0.52	0.44
query29	12.58	3.23	3.22
query30	0.26	0.06	0.05
query31	2.85	0.39	0.39
query32	3.23	0.46	0.46
query33	2.90	2.99	3.01
query34	16.85	4.46	4.56
query35	4.53	4.59	4.50
query36	0.66	0.49	0.48
query37	0.09	0.06	0.06
query38	0.04	0.03	0.03
query39	0.04	0.02	0.02
query40	0.15	0.12	0.12
query41	0.07	0.02	0.03
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.91 s
Total hot run time: 29.49 s

morningman changed the title from "[branch-3.1:[feat](refactor-param) refactor routineLoad's code about fileformat" to "branch-3.1: [feat](refactor-param) refactor routineLoad's code about fileformat #50552" Jun 25, 2025
morrySnow merged commit c219df6 into apache:branch-3.1 Jun 25, 2025
23 checks passed