@924060929

Cherry-pick of #40202 and #51925.

… statement (apache#40202)

Speed up `insert into values` statements that carry a large number of values by bypassing NereidsPlanner.

The main logic is:
1. `InsertUtils.normalizePlan` uses `FoldConstantRuleOnFE` to reduce
expressions, e.g. `values(date(now()))`
2. `FastInsertIntoValuesPlanner` skips most of the rules, analyzing and
rewriting `LogicalInlineTable` to `LogicalUnion` or
`LogicalOneRowRelation`
3. date time strings are parsed quickly, without a date format
4. `getHintMap` and the normal lexer share the same tokens
5. `set enable_fast_analyze_into_values=false` forces all optimizer
rules to run, as a fallback when `FastInsertIntoValuesPlanner` hits a bug
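
For item 3, here is a minimal sketch of what parsing a date time string without a date format can look like: the digits are read by position instead of going through a `DateTimeFormatter`. The class name `FastDateTimeSketch` and the two accepted layouts (`yyyy-MM-dd` and `yyyy-MM-dd HH:mm:ss`) are assumptions for illustration, not the code in this PR.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;

/** Hypothetical sketch: positional parsing of fixed-layout date time strings. */
final class FastDateTimeSketch {
    private FastDateTimeSketch() {}

    /** Returns the parsed value, or null when the text does not match a supported layout. */
    static LocalDateTime parse(String s) {
        // Only "yyyy-MM-dd" (10 chars) and "yyyy-MM-dd HH:mm:ss" (19 chars) are handled here.
        if (s == null || (s.length() != 10 && s.length() != 19)) {
            return null;
        }
        if (s.charAt(4) != '-' || s.charAt(7) != '-') {
            return null;
        }
        int year = digits(s, 0, 4);
        int month = digits(s, 5, 2);
        int day = digits(s, 8, 2);
        int hour = 0;
        int minute = 0;
        int second = 0;
        if (s.length() == 19) {
            if (s.charAt(10) != ' ' || s.charAt(13) != ':' || s.charAt(16) != ':') {
                return null;
            }
            hour = digits(s, 11, 2);
            minute = digits(s, 14, 2);
            second = digits(s, 17, 2);
        }
        // digits() signals a non-digit character with -1.
        if (year < 0 || month < 0 || day < 0 || hour < 0 || minute < 0 || second < 0) {
            return null;
        }
        try {
            return LocalDateTime.of(year, month, day, hour, minute, second);
        } catch (DateTimeException e) {
            return null; // out-of-range field, e.g. month 13 or February 30
        }
    }

    /** Reads len ASCII digits starting at start; returns -1 if any character is not a digit. */
    private static int digits(String s, int start, int len) {
        int value = 0;
        for (int i = start; i < start + len; i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return -1;
            }
            value = value * 10 + (c - '0');
        }
        return value;
    }
}
```

Returning `null` for anything outside the fixed layouts lets a caller fall back to a general, format-driven parser, so the fast path only has to be correct for the common case.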

Test: insert 1000 rows with 1000 columns; the column types include int,
bigint, decimal(26,7), date, datetime, and varchar (10 Chinese characters).

+---------------------------------+------------------------------------------------------+--------------------------+--------------------------+
|FastInsertIntoValuesPlanner      |NereidsPlanner(enable_fast_analyze_into_values=false) |Legacy optimizer in 2.1.6 | Nereids planner in 2.1.6 |
+---------------------------------+------------------------------------------------------+--------------------------+--------------------------+
|16s(bottleneck is antlr's lexer) |32s                                                   |16s                       |80s                       |
+---------------------------------+------------------------------------------------------+--------------------------+--------------------------+

If you use FastInsertIntoValuesPlanner with group commit in a
transaction, the time drops to 12s.
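
To reproduce a statement of the shape used in the test, a generator along these lines could be used. This is only a sketch: the class name `InsertValuesGeneratorSketch`, the table name passed in, the literal values, and the way the six column types are cycled across columns are all assumptions, since the actual test schema is not shown here.

```java
/** Hypothetical sketch: builds a wide, multi-row INSERT INTO ... VALUES statement. */
final class InsertValuesGeneratorSketch {
    private InsertValuesGeneratorSketch() {}

    // One sample literal per column type named in the test description, in this order:
    // int, bigint, decimal(26,7), date, datetime, varchar (10 Chinese characters).
    private static final String[] LITERALS = {
        "1",
        "10000000000",
        "1234567890.1234567",
        "'2024-01-02'",
        "'2024-01-02 03:04:05'",
        "'\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b\u4e5d\u5341'"
    };

    /** Builds the statement; column i uses LITERALS[i % 6] (an assumed layout). */
    static String build(String table, int rows, int columns) {
        StringBuilder sb = new StringBuilder("INSERT INTO ").append(table).append(" VALUES ");
        for (int r = 0; r < rows; r++) {
            if (r > 0) {
                sb.append(", ");
            }
            sb.append('(');
            for (int c = 0; c < columns; c++) {
                if (c > 0) {
                    sb.append(", ");
                }
                sb.append(LITERALS[c % LITERALS.length]);
            }
            sb.append(')');
        }
        return sb.toString();
    }
}
```

Calling `InsertValuesGeneratorSketch.build("t", 1000, 1000)` yields one statement with a million literals, which is the kind of input where lexing and analysis time dominate.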

TODO: build a custom lexer. In my test with a hand-written lexer,
FastInsertIntoValuesPlanner without group commit drops from 16s to 12s,
but it takes more effort: regular expression -> NFA -> DFA -> minimal
DFA -> lexer codegen
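
To give a rough idea of what a hand-written lexer for the values list involves, here is a minimal single-pass sketch. It is not the lexer from my test and does not follow the Doris grammar: the class `ValuesLexerSketch`, its token kinds, and the simplified number and string rules are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a hand-written, single-pass tokenizer for a VALUES list. */
final class ValuesLexerSketch {
    private ValuesLexerSketch() {}

    enum Kind { LPAREN, RPAREN, COMMA, NUMBER, STRING, WORD }

    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    static List<Token> tokenize(String sql) {
        List<Token> tokens = new ArrayList<>();
        int n = sql.length();
        int i = 0;
        while (i < n) {
            char c = sql.charAt(i);
            int start = i;
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '(') {
                tokens.add(new Token(Kind.LPAREN, "("));
                i++;
            } else if (c == ')') {
                tokens.add(new Token(Kind.RPAREN, ")"));
                i++;
            } else if (c == ',') {
                tokens.add(new Token(Kind.COMMA, ","));
                i++;
            } else if (c == '\'') {
                i = readString(sql, i, tokens);
            } else if (isDigit(c) || ((c == '-' || c == '+') && i + 1 < n && isDigit(sql.charAt(i + 1)))) {
                // Simplified number rule: optional sign, then digits and dots.
                i++;
                while (i < n && (isDigit(sql.charAt(i)) || sql.charAt(i) == '.')) {
                    i++;
                }
                tokens.add(new Token(Kind.NUMBER, sql.substring(start, i)));
            } else if (Character.isLetter(c) || c == '_') {
                i++;
                while (i < n && (Character.isLetterOrDigit(sql.charAt(i)) || sql.charAt(i) == '_')) {
                    i++;
                }
                tokens.add(new Token(Kind.WORD, sql.substring(start, i)));
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at " + i);
            }
        }
        return tokens;
    }

    /** Reads a single-quoted string starting at open; a doubled quote ('') is an escaped quote. */
    private static int readString(String sql, int open, List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        int i = open + 1;
        while (i < sql.length()) {
            char c = sql.charAt(i++);
            if (c != '\'') {
                sb.append(c);
            } else if (i < sql.length() && sql.charAt(i) == '\'') {
                sb.append('\'');
                i++;
            } else {
                tokens.add(new Token(Kind.STRING, sb.toString()));
                return i; // index just past the closing quote
            }
        }
        throw new IllegalArgumentException("unterminated string starting at " + open);
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```

A branch-per-character loop like this avoids the table-driven machinery of a generated lexer, but it covers only a small fragment of SQL; a production version would still need the regular expression -> DFA pipeline or an equivalent amount of hand-written cases.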

(cherry picked from commit 81f3c48)
@924060929 requested a review from morrySnow as a code owner on June 19, 2025 09:28
@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@924060929

run buildall

@924060929

run buildall

@924060929

run buildall

@924060929 changed the title from "[enhancement](nereids) improve lots of values in insert into values statement (#40202)" to "[enhancement](nereids) improve lots of values in insert into values statement (#40202) (#51925)" on Jun 19, 2025
@doris-robot

TPC-H: Total hot run time: 39974 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dabdc1c252b5770cc3f770e05b11c69c8c678f47, data reload: false

------ Round 1 ----------------------------------
q1	17589	7006	6620	6620
q2	2056	160	161	160
q3	10569	1119	1152	1119
q4	10226	759	734	734
q5	7739	3264	2889	2889
q6	211	138	137	137
q7	974	632	608	608
q8	9372	1946	2082	1946
q9	6728	6447	6418	6418
q10	6956	2229	2243	2229
q11	456	272	255	255
q12	393	213	216	213
q13	17765	2979	2983	2979
q14	248	202	202	202
q15	506	483	467	467
q16	473	403	370	370
q17	987	569	567	567
q18	7409	6752	6735	6735
q19	1383	1069	1046	1046
q20	480	207	205	205
q21	4061	3082	3150	3082
q22	1124	993	1004	993
Total cold run time: 107705 ms
Total hot run time: 39974 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6573	6549	6537	6537
q2	328	236	236	236
q3	2944	2812	2945	2812
q4	2118	1779	1915	1779
q5	5664	5787	5722	5722
q6	208	126	127	126
q7	2194	1810	1835	1810
q8	3412	3536	3529	3529
q9	8952	8831	8883	8831
q10	3547	3506	3544	3506
q11	594	499	476	476
q12	790	600	627	600
q13	9143	3202	3184	3184
q14	297	288	278	278
q15	509	472	469	469
q16	484	448	444	444
q17	1830	1630	1612	1612
q18	8191	7740	7783	7740
q19	1720	1558	1622	1558
q20	2101	1852	1857	1852
q21	5232	4845	4849	4845
q22	1092	1011	1000	1000
Total cold run time: 67923 ms
Total hot run time: 58946 ms

@doris-robot

TPC-DS: Total hot run time: 190000 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dabdc1c252b5770cc3f770e05b11c69c8c678f47, data reload: false

query1	967	377	378	377
query2	6518	1854	1851	1851
query3	6707	221	225	221
query4	33716	23441	23272	23272
query5	4418	446	450	446
query6	291	191	191	191
query7	4620	311	307	307
query8	294	224	221	221
query9	9723	2573	2568	2568
query10	481	266	258	258
query11	18682	15297	15434	15297
query12	151	102	101	101
query13	1632	442	410	410
query14	10154	6780	6542	6542
query15	214	172	182	172
query16	7944	425	435	425
query17	1650	582	571	571
query18	2058	328	302	302
query19	210	178	160	160
query20	119	108	108	108
query21	205	103	102	102
query22	4384	4156	4287	4156
query23	34548	33384	33769	33384
query24	11601	2888	2800	2800
query25	693	404	403	403
query26	1592	173	171	171
query27	3028	344	346	344
query28	7752	2108	2131	2108
query29	999	423	425	423
query30	321	156	153	153
query31	1012	769	784	769
query32	95	59	58	58
query33	795	294	290	290
query34	951	481	510	481
query35	923	719	707	707
query36	1087	942	940	940
query37	135	69	65	65
query38	3965	3821	3812	3812
query39	1491	1434	1438	1434
query40	286	100	109	100
query41	52	49	49	49
query42	114	104	102	102
query43	508	468	458	458
query44	1254	794	780	780
query45	180	165	171	165
query46	1157	719	696	696
query47	1928	1857	1823	1823
query48	462	366	369	366
query49	1253	382	384	382
query50	825	415	416	415
query51	7277	7198	7170	7170
query52	107	90	90	90
query53	252	185	184	184
query54	1263	468	459	459
query55	80	77	80	77
query56	264	231	247	231
query57	1286	1199	1172	1172
query58	230	210	212	210
query59	3099	2849	2831	2831
query60	284	266	269	266
query61	115	111	106	106
query62	904	673	671	671
query63	219	186	183	183
query64	5293	679	631	631
query65	3258	3211	3191	3191
query66	1439	305	298	298
query67	15866	15641	15557	15557
query68	4516	579	567	567
query69	439	263	272	263
query70	1101	1114	1100	1100
query71	345	249	257	249
query72	6437	4245	4191	4191
query73	751	350	364	350
query74	10328	9291	9300	9291
query75	3408	2677	2668	2668
query76	2988	1003	1135	1003
query77	419	285	278	278
query78	10493	9522	9617	9522
query79	1652	596	600	596
query80	1207	441	423	423
query81	518	220	220	220
query82	945	92	89	89
query83	236	154	150	150
query84	243	83	87	83
query85	1311	303	289	289
query86	360	291	291	291
query87	4361	4216	4252	4216
query88	3693	2423	2402	2402
query89	414	303	296	296
query90	1980	188	184	184
query91	175	152	154	152
query92	63	51	55	51
query93	1166	555	541	541
query94	911	298	295	295
query95	356	260	257	257
query96	608	286	286	286
query97	3348	3135	3193	3135
query98	211	202	198	198
query99	1515	1327	1301	1301
Total cold run time: 301879 ms
Total hot run time: 190000 ms

@doris-robot

ClickBench: Total hot run time: 30.33 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit dabdc1c252b5770cc3f770e05b11c69c8c678f47, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.04
query3	0.24	0.06	0.06
query4	1.64	0.10	0.10
query5	0.54	0.53	0.51
query6	1.14	0.73	0.74
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.56	0.49	0.49
query10	0.55	0.54	0.56
query11	0.14	0.10	0.11
query12	0.13	0.13	0.11
query13	0.61	0.59	0.59
query14	0.77	0.79	0.80
query15	0.86	0.83	0.82
query16	0.37	0.38	0.38
query17	1.02	1.05	1.07
query18	0.23	0.21	0.22
query19	1.98	1.86	1.85
query20	0.01	0.01	0.01
query21	15.40	0.60	0.57
query22	2.53	1.67	1.95
query23	17.01	1.01	0.77
query24	2.93	1.30	1.28
query25	0.13	0.11	0.08
query26	0.60	0.13	0.13
query27	0.04	0.04	0.05
query28	10.03	0.47	0.52
query29	12.56	3.28	3.26
query30	0.25	0.06	0.05
query31	2.86	0.40	0.38
query32	3.23	0.46	0.47
query33	2.98	3.02	2.96
query34	17.02	4.47	4.45
query35	4.51	4.58	4.55
query36	0.67	0.48	0.48
query37	0.09	0.05	0.06
query38	0.05	0.04	0.03
query39	0.03	0.03	0.02
query40	0.16	0.12	0.12
query41	0.07	0.03	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.18 s
Total hot run time: 30.33 s

@morrySnow merged commit 0ce9b58 into apache:branch-3.1 on Jun 20, 2025
22 of 24 checks passed
@924060929 deleted the branch-3.1-opt-insert-into-values branch on June 20, 2025 02:29