
@yujun777
Contributor

@yujun777 yujun777 commented Nov 27, 2025

What problem does this PR solve?

Extract OR expressions from CASE WHEN expressions in join conditions.

  1. Extract a condition that references only one side of the join, so that it can later be pushed down to that side:
    t1 join t2 on not ((case when t1.a = 1 then t2.a else t2.b end) + t2.b + t2.c > 10)
    =>
    t1 join t2 on not ((case when t1.a = 1 then t2.a else t2.b end) + t2.b + t2.c > 10)
                  AND (not (t2.a + t2.b + t2.c > 10) or not (t2.b + t2.b + t2.c > 10))
  2. Extract a condition that references both sides, for use by the OR EXPANSION rule:

    The OR EXPANSION rule applies when the condition is an OR expression whose disjuncts are all hash conditions.

    t1 join t2 on (case when t1.a = 1 then t2.a else t2.b end) = t1.a + t1.b
    =>
    t1 join t2 on (case when t1.a = 1 then t2.a else t2.b end) = t1.a + t1.b
                AND (t2.a = t1.a + t1.b or t2.b = t1.a + t1.b)
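Both rewrites above keep the original condition and only add a weaker one, so the extracted condition must hold whenever the original does. The following is a minimal Python sketch that brute-forces this over small integers. It is not Doris code: the function names are hypothetical, columns are modeled as plain integers, and SQL NULL semantics are not modeled.

```python
from itertools import product


def case_when(cond, then_value, else_value):
    """Model `CASE WHEN cond THEN then_value ELSE else_value END`."""
    return then_value if cond else else_value


def one_side_original(t1_a, t2_a, t2_b, t2_c):
    return not (case_when(t1_a == 1, t2_a, t2_b) + t2_b + t2_c > 10)


def one_side_extracted(t2_a, t2_b, t2_c):
    # References only t2 columns, so it can be evaluated on the t2 side alone.
    return (not (t2_a + t2_b + t2_c > 10)) or (not (t2_b + t2_b + t2_c > 10))


def both_sides_original(t1_a, t1_b, t2_a, t2_b):
    return case_when(t1_a == 1, t2_a, t2_b) == t1_a + t1_b


def both_sides_extracted(t1_a, t1_b, t2_a, t2_b):
    # Every disjunct is an equality between a t2 expression and a t1 expression.
    return t2_a == t1_a + t1_b or t2_b == t1_a + t1_b


def extraction_is_sound(values=range(0, 8)):
    """Return True if original => extracted on every sampled row pair."""
    for t1_a, t1_b, t2_a, t2_b, t2_c in product(values, repeat=5):
        if one_side_original(t1_a, t2_a, t2_b, t2_c) and not one_side_extracted(t2_a, t2_b, t2_c):
            return False
        if both_sides_original(t1_a, t1_b, t2_a, t2_b) and not both_sides_extracted(t1_a, t1_b, t2_a, t2_b):
            return False
    return True
```

The implication only goes one way. With `t1.a = 1`, `t2.a = 7`, `t2.b = 2`, `t2.c = 2`, the extracted condition is true while the original is false, which is why the original condition stays in the join and the extracted one is ANDed onto it.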

Note that we do not extract from more than one CASE WHEN-like expression per condition, because doing so may generate a combinatorial explosion of expressions.

For example:

 ((case when c1 then p1 else p2 end) + (case when d1 then q1 else q2 end)) + a > 10
 =>
 (p1 + q1 + a > 10)
     or (p1 + q2 + a > 10)
     or (p2 + q1 + a > 10)
     or (p2 + q2 + a > 10)
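The growth in this example can be reproduced with a short Python sketch. It assumes each CASE WHEN is represented only by the list of its result branches; `expand_case_branches` is a hypothetical helper for illustration, not part of Doris.

```python
from itertools import product


def expand_case_branches(case_branches, tail="a", threshold=10):
    """List the disjuncts obtained by replacing every CASE WHEN with each of its branches."""
    disjuncts = []
    for choice in product(*case_branches):
        terms = " + ".join(list(choice) + [tail])
        disjuncts.append(f"({terms} > {threshold})")
    return disjuncts


# Two CASE WHEN expressions with two branches each give 2 * 2 = 4 disjuncts.
two_cases = expand_case_branches([["p1", "p2"], ["q1", "q2"]])

# Ten two-branch CASE WHEN expressions would already give 2 ** 10 = 1024 disjuncts.
ten_cases = expand_case_branches([["x", "y"]] * 10)
```

The number of disjuncts is the product of the branch counts, so it grows exponentially with the number of CASE WHEN expressions in one condition.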

So we extract at most one CASE WHEN-like expression per condition.
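One way to picture this limit is a guard that counts CASE WHEN nodes before rewriting. The sketch below is in Python on a toy expression tree. The `Case` and `Op` classes and the `extract_or` function are assumptions made for illustration and do not reflect the actual rule implementation in the Doris FE.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Case:
    """CASE WHEN cond THEN then_branch ELSE else_branch END (cond kept as an opaque string)."""
    cond: str
    then_branch: "Expr"
    else_branch: "Expr"


@dataclass(frozen=True)
class Op:
    """A generic operator node such as +, >, = or not."""
    symbol: str
    args: Tuple["Expr", ...]


Expr = Union[str, Case, Op]  # a plain string stands for a column or a literal


def count_cases(expr):
    """Count CASE WHEN nodes, including nested ones."""
    if isinstance(expr, Case):
        return 1 + count_cases(expr.then_branch) + count_cases(expr.else_branch)
    if isinstance(expr, Op):
        return sum(count_cases(arg) for arg in expr.args)
    return 0


def substitute(expr, pick_then):
    """Replace the CASE WHEN node with one of its result branches."""
    if isinstance(expr, Case):
        return expr.then_branch if pick_then else expr.else_branch
    if isinstance(expr, Op):
        return Op(expr.symbol, tuple(substitute(arg, pick_then) for arg in expr.args))
    return expr


def extract_or(condition):
    """Return the OR of both substitutions, or None if the guard rejects the condition."""
    if count_cases(condition) != 1:
        return None
    return Op("or", (substitute(condition, True), substitute(condition, False)))
```

With exactly one CASE WHEN the result has one disjunct per branch. With two or more, the guard returns `None` and the condition is left unchanged.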

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Nov 27, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777
Contributor Author

run buildall

@yujun777 yujun777 force-pushed the join-extract-case-when branch from 620375c to ec4225c Compare November 27, 2025 01:08
@yujun777
Contributor Author

run buildall

@yujun777 yujun777 force-pushed the join-extract-case-when branch from ec4225c to cdcdf48 Compare November 27, 2025 01:15
@yujun777
Contributor Author

run buildall

@yujun777 yujun777 force-pushed the join-extract-case-when branch from cdcdf48 to e0360b8 Compare November 27, 2025 02:30
@yujun777
Contributor Author

run buildall

@yujun777
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34179 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 99eb24ab2b70eb4b04e3b55e326ce41a9f254662, data reload: false

------ Round 1 ----------------------------------
q1	17688	5082	4936	4936
q2	2224	314	211	211
q3	10296	1284	712	712
q4	10228	834	375	375
q5	7503	2323	2324	2323
q6	180	170	144	144
q7	916	768	631	631
q8	9341	1315	1068	1068
q9	7044	5248	5284	5248
q10	7181	2243	1832	1832
q11	526	306	284	284
q12	365	359	225	225
q13	17777	3583	3015	3015
q14	222	251	209	209
q15	594	511	509	509
q16	1044	1019	970	970
q17	574	852	375	375
q18	7532	7085	7109	7085
q19	1168	940	534	534
q20	363	344	230	230
q21	3733	3161	2306	2306
q22	1020	1016	957	957
Total cold run time: 107519 ms
Total hot run time: 34179 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5000	4993	4940	4940
q2	330	426	328	328
q3	2182	2677	2307	2307
q4	1351	1753	1333	1333
q5	4174	4522	4505	4505
q6	221	178	132	132
q7	2013	2006	1848	1848
q8	2669	2544	2586	2544
q9	7474	7603	7423	7423
q10	2994	3310	2864	2864
q11	582	519	504	504
q12	675	761	614	614
q13	3531	3852	3436	3436
q14	284	294	272	272
q15	551	522	529	522
q16	1077	1066	1076	1066
q17	1167	1554	1381	1381
q18	7833	7757	7498	7498
q19	817	840	1058	840
q20	2039	2036	1818	1818
q21	4684	4373	4261	4261
q22	1076	1025	988	988
Total cold run time: 52724 ms
Total hot run time: 51424 ms

@doris-robot

TPC-DS: Total hot run time: 184153 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 99eb24ab2b70eb4b04e3b55e326ce41a9f254662, data reload: false

query1	1160	414	390	390
query2	6649	1530	1555	1530
query3	6843	225	225	225
query4	26022	22760	22342	22342
query5	4424	651	477	477
query6	356	237	223	223
query7	4674	523	303	303
query8	311	262	246	246
query9	8701	2546	2577	2546
query10	489	346	317	317
query11	15234	14815	14722	14722
query12	175	126	112	112
query13	1692	579	453	453
query14	10421	9193	8892	8892
query15	216	205	183	183
query16	7657	732	511	511
query17	1229	740	619	619
query18	2020	443	322	322
query19	205	196	180	180
query20	127	122	126	122
query21	267	147	124	124
query22	3866	4051	3942	3942
query23	32927	32011	31948	31948
query24	8427	2411	2405	2405
query25	618	508	452	452
query26	1299	278	162	162
query27	2682	501	352	352
query28	4357	2134	2123	2123
query29	785	621	493	493
query30	332	234	212	212
query31	877	719	645	645
query32	84	75	73	73
query33	600	382	330	330
query34	796	871	536	536
query35	832	837	738	738
query36	906	965	859	859
query37	149	109	82	82
query38	3350	3436	3329	3329
query39	1481	1585	1412	1412
query40	225	130	123	123
query41	65	67	64	64
query42	126	113	112	112
query43	482	461	454	454
query44	1256	758	765	758
query45	199	188	186	186
query46	898	1008	644	644
query47	1654	1694	1628	1628
query48	421	448	338	338
query49	775	515	393	393
query50	653	675	401	401
query51	4421	3920	3955	3920
query52	109	112	110	110
query53	242	277	199	199
query54	312	294	273	273
query55	99	98	97	97
query56	349	342	312	312
query57	1148	1180	1098	1098
query58	286	280	276	276
query59	2469	2414	2434	2414
query60	356	346	341	341
query61	197	184	186	184
query62	790	720	658	658
query63	231	203	199	199
query64	4586	1218	892	892
query65	4067	3954	3973	3954
query66	1121	440	345	345
query67	15049	15346	14953	14953
query68	7041	908	633	633
query69	546	346	303	303
query70	1285	1267	1232	1232
query71	426	333	318	318
query72	5938	4949	5133	4949
query73	653	605	356	356
query74	8550	8894	8308	8308
query75	3318	3332	2823	2823
query76	3258	1139	739	739
query77	524	456	332	332
query78	9371	9678	8919	8919
query79	2041	813	579	579
query80	710	618	528	528
query81	506	271	241	241
query82	523	169	133	133
query83	267	283	264	264
query84	263	118	104	104
query85	1047	502	444	444
query86	389	310	283	283
query87	3483	3465	3377	3377
query88	3617	2257	2264	2257
query89	396	337	300	300
query90	1907	229	227	227
query91	188	170	143	143
query92	80	65	66	65
query93	1771	980	663	663
query94	756	439	346	346
query95	501	401	395	395
query96	498	582	290	290
query97	2899	2964	2863	2863
query98	247	214	215	214
query99	1314	1384	1267	1267
Total cold run time: 270139 ms
Total hot run time: 184153 ms

@doris-robot

ClickBench: Total hot run time: 27.36 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 99eb24ab2b70eb4b04e3b55e326ce41a9f254662, data reload: false

query1	0.05	0.05	0.05
query2	0.12	0.05	0.05
query3	0.26	0.08	0.08
query4	1.62	0.11	0.10
query5	0.28	0.24	0.25
query6	1.20	0.64	0.65
query7	0.04	0.03	0.03
query8	0.06	0.04	0.04
query9	0.61	0.52	0.51
query10	0.58	0.58	0.58
query11	0.17	0.11	0.12
query12	0.15	0.12	0.13
query13	0.62	0.61	0.60
query14	1.00	1.02	1.00
query15	0.85	0.84	0.83
query16	0.40	0.39	0.38
query17	1.02	1.00	1.06
query18	0.22	0.21	0.20
query19	1.89	1.79	1.81
query20	0.02	0.01	0.01
query21	15.44	0.17	0.13
query22	5.15	0.07	0.05
query23	15.71	0.26	0.11
query24	3.53	0.72	0.41
query25	0.08	0.07	0.07
query26	0.13	0.14	0.14
query27	0.06	0.05	0.05
query28	4.54	1.17	0.93
query29	12.58	3.83	3.18
query30	0.28	0.14	0.11
query31	2.82	0.61	0.40
query32	3.23	0.55	0.47
query33	3.01	3.02	3.10
query34	15.74	5.22	4.53
query35	4.56	4.57	4.56
query36	0.69	0.51	0.49
query37	0.10	0.07	0.07
query38	0.06	0.04	0.03
query39	0.03	0.02	0.02
query40	0.18	0.15	0.14
query41	0.09	0.03	0.03
query42	0.03	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 99.24 s
Total hot run time: 27.36 s

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 87.36% (152/174) 🎉
Increment coverage report
Complete coverage report

@yujun777
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34437 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dec88c48ed92790afe85cf70b8e10720a2e7f633, data reload: false

------ Round 1 ----------------------------------
q1	17858	5141	4938	4938
q2	2059	364	208	208
q3	10194	1300	708	708
q4	10237	959	381	381
q5	7509	2382	2327	2327
q6	187	174	144	144
q7	928	768	639	639
q8	9353	1391	1137	1137
q9	6991	5342	5336	5336
q10	6871	2231	1827	1827
q11	497	298	289	289
q12	335	359	230	230
q13	17783	3637	3032	3032
q14	227	244	215	215
q15	581	514	512	512
q16	1036	1017	966	966
q17	587	862	357	357
q18	7414	7162	7217	7162
q19	1357	961	567	567
q20	359	342	236	236
q21	3713	2540	2266	2266
q22	1021	999	960	960
Total cold run time: 107097 ms
Total hot run time: 34437 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4960	4997	4943	4943
q2	323	394	320	320
q3	2157	2634	2265	2265
q4	1343	1803	1318	1318
q5	4208	4370	4557	4370
q6	205	168	124	124
q7	2073	1948	1892	1892
q8	2698	2586	2539	2539
q9	7545	7499	7497	7497
q10	3058	3299	2816	2816
q11	583	546	505	505
q12	700	796	636	636
q13	3701	3791	3338	3338
q14	302	339	295	295
q15	556	498	503	498
q16	1075	1134	1086	1086
q17	1168	1585	1372	1372
q18	7968	7650	7645	7645
q19	789	801	844	801
q20	2014	2050	1796	1796
q21	4674	4302	4149	4149
q22	1098	1061	1019	1019
Total cold run time: 53198 ms
Total hot run time: 51224 ms

@doris-robot

TPC-DS: Total hot run time: 184194 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dec88c48ed92790afe85cf70b8e10720a2e7f633, data reload: false

query1	1070	406	391	391
query2	6572	1587	1574	1574
query3	6752	224	217	217
query4	25472	22828	22679	22679
query5	4450	634	490	490
query6	343	236	215	215
query7	4647	497	296	296
query8	302	258	248	248
query9	8758	2579	2581	2579
query10	530	361	304	304
query11	15039	14771	14635	14635
query12	172	132	119	119
query13	1676	569	458	458
query14	10321	9029	8950	8950
query15	219	199	188	188
query16	7436	675	526	526
query17	1273	736	632	632
query18	2006	445	325	325
query19	210	203	173	173
query20	132	128	131	128
query21	213	131	111	111
query22	3900	4014	3780	3780
query23	33020	32035	31867	31867
query24	8443	2412	2394	2394
query25	596	516	452	452
query26	1243	274	168	168
query27	2749	493	370	370
query28	4402	2145	2124	2124
query29	816	612	488	488
query30	306	241	210	210
query31	838	730	620	620
query32	87	71	72	71
query33	589	384	322	322
query34	807	867	542	542
query35	811	839	761	761
query36	919	948	882	882
query37	127	112	86	86
query38	3322	3347	3313	3313
query39	1490	1453	1425	1425
query40	224	131	121	121
query41	67	62	61	61
query42	128	113	114	113
query43	455	470	444	444
query44	1262	746	758	746
query45	195	193	186	186
query46	879	995	644	644
query47	1659	1718	1664	1664
query48	406	437	335	335
query49	792	490	401	401
query50	660	680	407	407
query51	3935	3957	3944	3944
query52	113	114	111	111
query53	231	273	201	201
query54	314	292	281	281
query55	98	103	90	90
query56	352	320	315	315
query57	1143	1173	1091	1091
query58	287	276	268	268
query59	2381	2576	2335	2335
query60	346	362	359	359
query61	184	158	166	158
query62	791	691	649	649
query63	228	198	194	194
query64	4531	1185	873	873
query65	4069	3955	3993	3955
query66	1184	422	333	333
query67	15029	14998	14734	14734
query68	4840	914	631	631
query69	523	334	302	302
query70	1293	1232	1188	1188
query71	415	331	303	303
query72	6080	5113	5095	5095
query73	609	580	354	354
query74	8749	8811	8358	8358
query75	3334	3326	2896	2896
query76	3297	1131	725	725
query77	530	420	335	335
query78	9510	9894	8917	8917
query79	1744	852	586	586
query80	1721	621	550	550
query81	554	275	235	235
query82	427	163	137	137
query83	359	267	244	244
query84	258	111	104	104
query85	917	496	446	446
query86	385	299	300	299
query87	3462	3541	3386	3386
query88	2935	2256	2256	2256
query89	390	326	295	295
query90	1750	227	224	224
query91	166	170	136	136
query92	69	67	65	65
query93	1171	981	640	640
query94	718	442	334	334
query95	500	414	406	406
query96	505	568	286	286
query97	2899	2953	2867	2867
query98	241	220	211	211
query99	1297	1359	1257	1257
Total cold run time: 265649 ms
Total hot run time: 184194 ms

@doris-robot

ClickBench: Total hot run time: 27.46 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit dec88c48ed92790afe85cf70b8e10720a2e7f633, data reload: false

query1	0.06	0.05	0.05
query2	0.09	0.04	0.04
query3	0.25	0.08	0.08
query4	1.60	0.11	0.11
query5	0.27	0.26	0.25
query6	1.17	0.66	0.63
query7	0.03	0.03	0.02
query8	0.05	0.04	0.05
query9	0.58	0.52	0.51
query10	0.55	0.57	0.58
query11	0.16	0.11	0.11
query12	0.14	0.12	0.12
query13	0.64	0.60	0.62
query14	1.00	0.99	1.00
query15	0.85	0.83	0.85
query16	0.38	0.38	0.38
query17	1.00	1.04	1.01
query18	0.21	0.20	0.20
query19	1.87	1.82	1.79
query20	0.02	0.01	0.01
query21	15.43	0.21	0.13
query22	5.01	0.06	0.04
query23	15.67	0.27	0.10
query24	2.59	0.58	1.18
query25	0.09	0.06	0.07
query26	0.14	0.13	0.13
query27	0.07	0.06	0.05
query28	4.46	1.16	0.94
query29	12.57	3.85	3.24
query30	0.29	0.13	0.12
query31	2.81	0.60	0.38
query32	3.23	0.56	0.47
query33	3.04	3.00	3.03
query34	15.81	5.15	4.48
query35	4.56	4.59	4.58
query36	0.68	0.51	0.49
query37	0.10	0.07	0.06
query38	0.07	0.04	0.04
query39	0.03	0.03	0.04
query40	0.17	0.14	0.13
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 97.9 s
Total hot run time: 27.46 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 21.26% (37/174) 🎉
Increment coverage report
Complete coverage report

@yujun777 yujun777 force-pushed the join-extract-case-when branch from 61bc61e to 7bcf8ec Compare November 27, 2025 10:23
@yujun777
Contributor Author

run buildall

@yujun777 yujun777 force-pushed the join-extract-case-when branch from 9e80c6a to d98053f Compare November 28, 2025 01:48
@yujun777
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34397 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d98053f3d9ac0e4cf97090dbaf9e0b2fe140f4b4, data reload: false

------ Round 1 ----------------------------------
q1	17672	5080	4970	4970
q2	2276	314	211	211
q3	10193	1292	744	744
q4	10753	916	374	374
q5	7552	2546	2278	2278
q6	190	169	139	139
q7	962	806	675	675
q8	9379	1344	1106	1106
q9	6907	5298	5262	5262
q10	7012	2249	1834	1834
q11	542	303	297	297
q12	345	369	220	220
q13	17800	3686	3049	3049
q14	234	238	216	216
q15	581	525	520	520
q16	1047	1007	949	949
q17	578	744	534	534
q18	7402	7006	6966	6966
q19	1097	951	570	570
q20	354	351	230	230
q21	3921	2620	2322	2322
q22	1024	999	931	931
Total cold run time: 107821 ms
Total hot run time: 34397 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5038	4976	4911	4911
q2	344	391	306	306
q3	2121	2698	2309	2309
q4	1349	1779	1318	1318
q5	4199	4541	4566	4541
q6	225	173	142	142
q7	2039	2035	1888	1888
q8	2618	2601	2476	2476
q9	7587	7426	7508	7426
q10	3068	3252	2876	2876
q11	619	525	499	499
q12	740	782	658	658
q13	3657	3932	3351	3351
q14	297	319	293	293
q15	550	511	500	500
q16	1065	1167	1074	1074
q17	1184	1410	1450	1410
q18	8112	7672	7390	7390
q19	817	745	775	745
q20	1892	1964	1798	1798
q21	4752	4227	4168	4168
q22	1090	1011	998	998
Total cold run time: 53363 ms
Total hot run time: 51077 ms

@doris-robot

TPC-DS: Total hot run time: 185628 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d98053f3d9ac0e4cf97090dbaf9e0b2fe140f4b4, data reload: false

query1	1315	426	385	385
query2	6619	1579	1582	1579
query3	6919	237	225	225
query4	25209	23284	23000	23000
query5	5380	644	492	492
query6	346	233	224	224
query7	4684	502	306	306
query8	299	273	256	256
query9	8725	2558	2608	2558
query10	569	364	316	316
query11	15448	15262	14993	14993
query12	255	123	111	111
query13	1704	582	459	459
query14	11412	9115	8958	8958
query15	241	207	188	188
query16	7760	691	494	494
query17	1408	767	626	626
query18	2134	439	352	352
query19	227	212	192	192
query20	140	126	123	123
query21	505	138	118	118
query22	3973	3909	3883	3883
query23	33017	32181	32038	32038
query24	8410	2474	2439	2439
query25	649	552	502	502
query26	1253	288	172	172
query27	2712	486	362	362
query28	4315	2143	2119	2119
query29	820	649	521	521
query30	314	243	210	210
query31	832	716	635	635
query32	86	80	76	76
query33	624	398	360	360
query34	840	877	582	582
query35	824	848	768	768
query36	877	914	843	843
query37	127	106	86	86
query38	3375	3387	3345	3345
query39	1641	1396	1415	1396
query40	260	127	119	119
query41	65	64	62	62
query42	123	107	122	107
query43	444	463	437	437
query44	1327	761	762	761
query45	202	191	192	191
query46	882	1044	651	651
query47	1663	1708	1632	1632
query48	393	450	330	330
query49	778	499	401	401
query50	660	688	414	414
query51	3944	3837	3818	3818
query52	118	116	111	111
query53	242	268	193	193
query54	312	295	281	281
query55	98	96	95	95
query56	354	327	317	317
query57	1145	1165	1088	1088
query58	290	280	279	279
query59	2322	2485	2335	2335
query60	354	356	345	345
query61	160	158	154	154
query62	799	744	667	667
query63	244	192	203	192
query64	4530	1227	889	889
query65	4068	3998	3972	3972
query66	1059	450	337	337
query67	15232	15215	14895	14895
query68	8425	979	628	628
query69	536	350	316	316
query70	1245	1176	1171	1171
query71	491	348	316	316
query72	5926	4886	4891	4886
query73	716	582	356	356
query74	8767	8938	8652	8652
query75	4041	3332	2828	2828
query76	3730	1170	739	739
query77	812	403	328	328
query78	9534	9476	8843	8843
query79	2280	860	603	603
query80	652	588	519	519
query81	516	275	241	241
query82	499	160	134	134
query83	292	274	258	258
query84	269	115	102	102
query85	895	494	456	456
query86	397	302	294	294
query87	3439	3466	3402	3402
query88	4205	2323	2295	2295
query89	401	356	301	301
query90	2123	240	236	236
query91	174	174	144	144
query92	87	69	69	69
query93	1882	1024	653	653
query94	730	447	355	355
query95	497	419	402	402
query96	516	559	300	300
query97	2926	3009	2940	2940
query98	237	220	213	213
query99	1344	1411	1283	1283
Total cold run time: 276640 ms
Total hot run time: 185628 ms

@doris-robot

ClickBench: Total hot run time: 27.42 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d98053f3d9ac0e4cf97090dbaf9e0b2fe140f4b4, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.05	0.05
query3	0.25	0.08	0.08
query4	1.60	0.12	0.11
query5	0.27	0.26	0.26
query6	1.17	0.64	0.62
query7	0.03	0.03	0.03
query8	0.05	0.04	0.04
query9	0.57	0.51	0.50
query10	0.57	0.57	0.55
query11	0.16	0.12	0.11
query12	0.15	0.12	0.12
query13	0.63	0.60	0.61
query14	1.00	0.99	0.98
query15	0.82	0.80	0.80
query16	0.42	0.40	0.41
query17	1.04	1.06	0.99
query18	0.24	0.21	0.22
query19	1.95	1.75	1.84
query20	0.02	0.01	0.01
query21	15.44	0.25	0.14
query22	4.80	0.06	0.04
query23	15.89	0.28	0.10
query24	1.85	0.33	0.62
query25	0.12	0.09	0.06
query26	0.14	0.14	0.13
query27	0.08	0.05	0.05
query28	5.71	1.21	1.03
query29	12.63	3.92	3.24
query30	0.28	0.15	0.11
query31	2.81	0.60	0.43
query32	3.23	0.55	0.47
query33	3.05	3.06	3.12
query34	16.92	5.18	4.57
query35	4.57	4.59	4.54
query36	0.64	0.50	0.49
query37	0.10	0.06	0.07
query38	0.07	0.05	0.04
query39	0.05	0.03	0.03
query40	0.18	0.15	0.14
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.05	0.04	0.04
Total cold run time: 99.81 s
Total hot run time: 27.42 s

starocean999
starocean999 previously approved these changes Nov 28, 2025
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 28, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@yujun777 yujun777 force-pushed the join-extract-case-when branch from 4ee3dc5 to 88c442b Compare December 1, 2025 08:09
@yujun777
Contributor Author

yujun777 commented Dec 1, 2025

run buildall

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 1, 2025
@github-actions
Contributor

github-actions bot commented Dec 1, 2025

PR approved by at least one committer and no changes requested.

@doris-robot

TPC-H: Total hot run time: 34015 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 88c442b80571f19408097d418c6c0fad81030bc5, data reload: false

------ Round 1 ----------------------------------
q1	17640	5119	4902	4902
q2	2037	317	207	207
q3	10274	1340	739	739
q4	10211	839	324	324
q5	7464	2359	2129	2129
q6	192	178	142	142
q7	966	828	643	643
q8	9350	1363	1032	1032
q9	6981	5329	5374	5329
q10	6824	2187	1758	1758
q11	512	303	287	287
q12	337	372	232	232
q13	17760	3693	2998	2998
q14	238	233	208	208
q15	588	520	505	505
q16	839	861	807	807
q17	601	751	529	529
q18	7391	7158	7058	7058
q19	1084	942	570	570
q20	366	344	230	230
q21	3555	3178	2464	2464
q22	1022	973	922	922
Total cold run time: 106232 ms
Total hot run time: 34015 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4946	4896	4906	4896
q2	325	402	317	317
q3	2157	2654	2272	2272
q4	1299	1728	1303	1303
q5	4226	4254	4529	4254
q6	228	185	136	136
q7	2054	1947	1827	1827
q8	2721	2517	2570	2517
q9	7488	7465	7518	7465
q10	3210	3343	2892	2892
q11	588	519	500	500
q12	680	807	613	613
q13	3539	3821	3334	3334
q14	290	308	273	273
q15	544	512	513	512
q16	906	935	891	891
q17	1290	1441	1377	1377
q18	7812	7743	7472	7472
q19	813	756	802	756
q20	1998	2116	1966	1966
q21	4822	4667	4255	4255
q22	1053	1006	976	976
Total cold run time: 52989 ms
Total hot run time: 50804 ms

@doris-robot

TPC-DS: Total hot run time: 181988 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 88c442b80571f19408097d418c6c0fad81030bc5, data reload: false

query1	1094	405	384	384
query2	6589	1201	1168	1168
query3	6745	225	227	225
query4	25334	23246	22994	22994
query5	5179	625	492	492
query6	355	241	225	225
query7	4654	491	323	323
query8	322	248	251	248
query9	8762	2649	2612	2612
query10	561	346	308	308
query11	15541	14985	14573	14573
query12	199	124	116	116
query13	1690	571	451	451
query14	9089	6002	6002	6002
query15	228	207	184	184
query16	7684	687	554	554
query17	1209	778	642	642
query18	2023	421	327	327
query19	200	204	174	174
query20	130	123	118	118
query21	223	133	115	115
query22	3867	3920	3797	3797
query23	32824	31902	32018	31902
query24	8484	2397	2403	2397
query25	608	503	469	469
query26	1229	272	167	167
query27	2750	485	342	342
query28	4307	2168	2151	2151
query29	805	616	517	517
query30	308	240	212	212
query31	787	695	664	664
query32	84	77	76	76
query33	613	382	331	331
query34	802	864	554	554
query35	775	816	742	742
query36	888	896	831	831
query37	121	119	87	87
query38	3876	3904	3823	3823
query39	1470	1394	1453	1394
query40	222	130	127	127
query41	66	63	62	62
query42	127	116	118	116
query43	433	447	414	414
query44	1271	752	734	734
query45	196	189	187	187
query46	862	998	637	637
query47	1680	1743	1693	1693
query48	409	422	337	337
query49	758	498	418	418
query50	648	685	416	416
query51	3856	3887	3900	3887
query52	120	112	108	108
query53	233	259	195	195
query54	324	291	292	291
query55	99	93	98	93
query56	364	331	319	319
query57	1142	1168	1102	1102
query58	287	281	274	274
query59	2339	2384	2311	2311
query60	355	348	334	334
query61	189	162	161	161
query62	760	716	664	664
query63	234	195	201	195
query64	4536	1200	913	913
query65	4047	3966	3990	3966
query66	1107	430	331	331
query67	15227	14948	14706	14706
query68	4656	964	624	624
query69	530	337	310	310
query70	1102	1026	1017	1017
query71	418	333	325	325
query72	5917	5131	5112	5112
query73	676	588	362	362
query74	8469	8745	8663	8663
query75	3051	3039	2560	2560
query76	3264	1133	741	741
query77	501	428	338	338
query78	9388	9549	8919	8919
query79	2708	844	587	587
query80	718	618	563	563
query81	501	270	232	232
query82	486	138	114	114
query83	271	257	252	252
query84	270	117	98	98
query85	915	503	453	453
query86	378	283	299	283
query87	4105	3994	3973	3973
query88	4203	2300	2339	2300
query89	395	331	298	298
query90	2005	221	224	221
query91	173	173	148	148
query92	82	72	65	65
query93	2377	999	652	652
query94	782	454	330	330
query95	494	419	409	409
query96	497	556	287	287
query97	2598	2623	2559	2559
query98	250	231	217	217
query99	1342	1386	1294	1294
Total cold run time: 267822 ms
Total hot run time: 181988 ms

@doris-robot

ClickBench: Total hot run time: 27.48 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 88c442b80571f19408097d418c6c0fad81030bc5, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.04	0.05
query3	0.26	0.09	0.08
query4	1.60	0.11	0.11
query5	0.31	0.27	0.25
query6	1.17	0.63	0.65
query7	0.03	0.02	0.02
query8	0.06	0.04	0.04
query9	0.57	0.52	0.51
query10	0.55	0.56	0.56
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.63	0.60	0.60
query14	0.99	0.99	0.98
query15	0.83	0.80	0.80
query16	0.42	0.42	0.40
query17	1.03	1.04	0.97
query18	0.24	0.22	0.21
query19	1.89	1.87	1.88
query20	0.02	0.01	0.01
query21	15.44	0.26	0.13
query22	4.92	0.05	0.05
query23	16.25	0.27	0.10
query24	0.93	0.24	0.31
query25	0.08	0.08	0.07
query26	0.14	0.13	0.12
query27	0.07	0.05	0.06
query28	3.36	1.20	1.02
query29	12.59	4.05	3.40
query30	0.29	0.14	0.11
query31	2.80	0.61	0.38
query32	3.23	0.56	0.45
query33	3.01	3.11	3.07
query34	16.90	5.17	4.58
query35	4.54	4.53	4.62
query36	0.65	0.50	0.48
query37	0.10	0.06	0.06
query38	0.07	0.04	0.04
query39	0.04	0.02	0.02
query40	0.18	0.14	0.15
query41	0.09	0.02	0.02
query42	0.05	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 96.84 s
Total hot run time: 27.48 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 14.52% (27/186) 🎉
Increment coverage report
Complete coverage report

@yujun777
Contributor Author

yujun777 commented Dec 1, 2025

run external

@starocean999 starocean999 merged commit 3e785ee into apache:master Dec 2, 2025
27 checks passed
yujun777 added a commit to yujun777/doris that referenced this pull request Dec 2, 2025
…llif expression (apache#58430)

yujun777 added a commit to yujun777/doris that referenced this pull request Dec 2, 2025
…llif expression (apache#58430)

yujun777 added a commit to yujun777/doris that referenced this pull request Dec 8, 2025
…llif expression (apache#58430)

nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…llif expression (apache#58430)

yujun777 added a commit to yujun777/doris that referenced this pull request Jan 9, 2026
…llif expression (apache#58430)

yujun777 added a commit to yujun777/doris that referenced this pull request Jan 9, 2026
…llif expression (apache#58430)

yujun777 added a commit to yujun777/doris that referenced this pull request Jan 12, 2026
…llif expression (apache#58430)


Labels

approved Indicates a PR has been approved by one committer. reviewed


6 participants