[feat] Virtual Slot Ref #52701

zhiqiang-hhhh · 2025-07-03T03:14:52Z

What problem does this PR solve?

TL;DR: Introduce virtual slot ref to eliminate redundant computation of common sub-expressions

Problem to solve

Consider the following queries:

select funcC(funcA(colA)), funcB(funcA(colA)) from table;

select funcA(colA) as sub from table where funcB(funcA(colA)) > 0;

select l2_distance(colA, [10]) as distance from table where l2_distance(colA, [10]) > 0

What these SQL statements have in common is that certain expressions appear multiple times in different places: in the projection, in predicates, or during index computation (e.g., for the ANN index in the third query). These identical expressions are currently computed once per occurrence, although a single computation would suffice.
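The cost of the repetition can be illustrated with a small Python model of the first query. This is only a sketch: `func_a`, `func_b`, and `func_c` are hypothetical stand-ins for funcA, funcB, and funcC, and the call counter exists purely to make the redundant work visible.

```python
# Hypothetical stand-ins for funcA / funcB / funcC; not Doris functions.
calls = {"funcA": 0}

def func_a(x):
    calls["funcA"] += 1  # count every evaluation of the shared sub-expression
    return x * 2

def func_b(x):
    return x + 1

def func_c(x):
    return x - 1

rows = [1, 2, 3]  # values of colA

# Naive evaluation of: select funcC(funcA(colA)), funcB(funcA(colA))
naive = [(func_c(func_a(v)), func_b(func_a(v))) for v in rows]
naive_calls = calls["funcA"]  # funcA ran twice per row

# Shared evaluation: compute funcA(colA) once per row and reuse the result.
calls["funcA"] = 0
shared = []
for v in rows:
    sub = func_a(v)
    shared.append((func_c(sub), func_b(sub)))
shared_calls = calls["funcA"]  # funcA ran once per row
```

Both strategies produce the same rows; only the number of `func_a` evaluations differs.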

We introduce virtual slot ref to address this issue.

In the storage layer, we implement a VirtualColumnIterator. The behavior of VirtualColumnIterator is identical to other ColumnIterators, except it is not used to read any physical column. Instead, it is dedicated to reading the result of expressions computed from the index (for example, the distance returned by an ANN index, or in the future, the relevance score from a full-text index). Once an expression result is computed via the index, we use VirtualColumnIterator::prepare_materialization to store the data source. If a segment does not have the corresponding index, the data source of the VirtualColumnIterator will be a special ColumnNothing type (this is an important design trick that allows virtual slot ref to elegantly handle the case where a segment does not yet have the index built).
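A minimal Python model of this behaviour is sketched below. It is not the Doris implementation: the class name `VirtualColumnIteratorSketch`, the `NOTHING` sentinel (standing in for ColumnNothing), and the simplified `next_batch` signature are all assumptions made for illustration; only the name `prepare_materialization` comes from the description above.

```python
class Nothing:
    """Placeholder meaning 'no index result available' (models ColumnNothing)."""

    def __repr__(self):
        return "Nothing"

NOTHING = Nothing()

class VirtualColumnIteratorSketch:
    """Reads an index-computed expression result instead of a physical column."""

    def __init__(self):
        # Until an index supplies a result, the data source is the placeholder.
        self._source = NOTHING
        self._pos = 0

    def prepare_materialization(self, values):
        # Store the per-row results produced by the index (e.g. ANN distances).
        self._source = list(values)
        self._pos = 0

    def next_batch(self, n):
        # Without a materialized source, return the placeholder so the caller
        # knows the expression still has to be evaluated.
        if self._source is NOTHING:
            return NOTHING
        batch = self._source[self._pos:self._pos + n]
        self._pos += len(batch)
        return batch

# A segment whose index produced distances, and one without the index.
with_index = VirtualColumnIteratorSketch()
with_index.prepare_materialization([0.5, 1.25, 3.0])
no_index = VirtualColumnIteratorSketch()
```

In this model the segment without an index needs no special code path: its iterator simply keeps returning the placeholder.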

We also modify SegmentIterator. Before processing each block, we first initialize the positions of the virtual slot refs in the block as ColumnNothing. Before actually returning a block, we check whether each virtual slot ref has been materialized; if not, we execute the corresponding expression (e.g., l2_distance or a score function) to generate the actual column, ensuring that every virtual column in the block returned to the computation layer has been materialized.
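The per-block flow can be sketched in Python as follows. This is a simplified model under stated assumptions, not the SegmentIterator code: the block is a plain dict, `NOTHING` stands in for ColumnNothing, and `next_block`, `virtual_exprs`, and `index_results` are hypothetical names.

```python
NOTHING = object()  # models the ColumnNothing placeholder

def l2_distance(a, b):
    # Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def next_block(rows, virtual_exprs, index_results=None):
    """Build one block; virtual_exprs maps a virtual column name to a per-row function."""
    index_results = index_results or {}
    block = {"colA": list(rows)}
    # Step 1: every virtual column position starts out as the placeholder.
    for name in virtual_exprs:
        block[name] = NOTHING
    # Step 2: an index, if the segment has one, may materialize some columns.
    for name, values in index_results.items():
        block[name] = list(values)
    # Step 3: before returning, evaluate whatever is still unmaterialized.
    for name, expr in virtual_exprs.items():
        if block[name] is NOTHING:
            block[name] = [expr(v) for v in block["colA"]]
    return block

exprs = {"distance": lambda v: l2_distance(v, [10])}
rows = [[7], [10], [14]]
fallback = next_block(rows, exprs)                                  # no index available
from_index = next_block(rows, exprs, {"distance": [3.0, 0.0, 4.0]})  # index supplied values
```

Either way, the block handed to the caller contains a fully materialized `distance` column.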

For expression evaluation, we introduce VirtualSlotRef, which is essentially SlotRef + FunctionCall. When the expression tree executes a node of this type, it automatically checks whether the corresponding expression has been materialized: if it has, VirtualSlotRef behaves like a SlotRef; if it hasn’t, it behaves like a FunctionCall.
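The dual behaviour can be modelled in a few lines of Python. The sketch assumes a dict-based block and a `NOTHING` sentinel in place of ColumnNothing; `VirtualSlotRefSketch` and its fields are hypothetical and do not mirror the real class layout.

```python
NOTHING = object()  # models an unmaterialized (ColumnNothing) column

class VirtualSlotRefSketch:
    """Acts as a SlotRef when the column is materialized, else as a FunctionCall."""

    def __init__(self, column_name, func, arg_column):
        self.column_name = column_name
        self.func = func
        self.arg_column = arg_column
        self.function_calls = 0  # how often the FunctionCall path ran

    def execute(self, block):
        column = block.get(self.column_name, NOTHING)
        if column is not NOTHING:
            # SlotRef path: the result already exists, just reference it.
            return column
        # FunctionCall path: compute the expression from its input column.
        self.function_calls += 1
        return [self.func(v) for v in block[self.arg_column]]

ref = VirtualSlotRefSketch("virtual_column_1", abs, "colA")
materialized = {"colA": [-2, 3], "virtual_column_1": [2, 3]}
pending = {"colA": [-2, 3], "virtual_column_1": NOTHING}
as_slot = ref.execute(materialized)  # returns the stored column
as_call = ref.execute(pending)       # evaluates abs(colA)
```

The caller cannot tell which path was taken; both return the same values.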

Modification on planner

Here’s an example to better illustrate the execution of virtual slot ref:

select func(colA) from table where func(colA) > 0;

For this SQL, our current ScanNode is:

ScanNode {
    predicates: func(colA[#0]) > 0
    final projection: func(colA[#0])
    final projection tuple id: 1
    tuple_id: 0
}

TupleDesc[id=0]{
    SlotDesc{id=0, col=colA}
}

TupleDesc[id=1] {
    SlotDesc{id=1, col=null, ..., type=float64}
}

After this PR, the plan becomes:

ScanNode {
    predicates: func(colA)[#1] > 0
    final projection: func(colA)[#1]
    final projection tuple id: 1
    tuple_id: 0
}

TupleDesc[id=0]{
    SlotDesc{id=0, col="colA"},
    SlotDesc{id=1, col=virtual_column_1, expr=func(colA[#0])}
}

TupleDesc[id=1] {
    SlotDesc{id=2, name=virtual_column_1[#1]}
}

Note that we added a VirtualSlot in Tuple 0, and other places that originally required computing the expression are transformed to reference this VirtualSlot. In this way, redundant computation of common expressions is eliminated.
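The rewrite can be sketched on a toy expression tree. The Python below is an illustration only: expressions are nested tuples, `push_down_virtual_columns` is a hypothetical helper rather than the FE rule, and the sketch does not handle common sub-expressions nested inside one another.

```python
from collections import Counter

def sub_exprs(expr):
    """Yield every function-call sub-expression of a nested-tuple expression."""
    if isinstance(expr, tuple):
        if expr[0] == "call":
            yield expr
        for child in expr[1:]:
            yield from sub_exprs(child)

def replace(expr, target, slot):
    """Return expr with every occurrence of target replaced by slot."""
    if expr == target:
        return slot
    if isinstance(expr, tuple):
        return tuple(replace(c, target, slot) if isinstance(c, tuple) else c
                     for c in expr)
    return expr

def push_down_virtual_columns(predicate, projection, next_slot_id):
    # Count how often each call expression occurs across predicate and projection.
    counts = Counter(list(sub_exprs(predicate)) + list(sub_exprs(projection)))
    virtual_slots = {}
    for expr, n in counts.items():
        if n > 1:
            # Repeated expression: give it a virtual slot and reference that slot.
            slot = ("slot", next_slot_id)
            virtual_slots[slot] = expr
            predicate = replace(predicate, expr, slot)
            projection = replace(projection, expr, slot)
            next_slot_id += 1
    return predicate, projection, virtual_slots

# select func(colA) from table where func(colA) > 0;
func_col_a = ("call", "func", ("slot", 0))
predicate = ("gt", func_col_a, ("lit", 0))
projection = func_col_a
new_predicate, new_projection, virtual_slots = push_down_virtual_columns(
    predicate, projection, next_slot_id=1)
```

After the rewrite, both the predicate and the projection refer to slot 1, and the expression itself appears only once, as the definition of that virtual slot.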

Benchmark

With the plan rule disabled:

mysql> set disable_nereids_rules='PUSH_DOWN_VIRTUAL_COLUMNS_INTO_OLAP_SCAN';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT counterid,        Count(*)               AS hit_count,        Count(DISTINCT userid) AS unique_users FROM   hits WHERE  ( Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) = 'GOOGLE.COM'           OR Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) =              'GOOGLE.RU'           OR Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) LIKE              '%GOOGLE%' )        AND ( Length(Regexp_extract(referer, '^https?://([^/]+)', 1)) > 3               OR Regexp_extract(referer, '^https?://([^/]+)', 1) != ''
              OR Regexp_extract(referer, '^https?://([^/]+)', 1) IS NOT NULL )        AND eventdate = '2013-07-15' GROUP  BY counterid HAVING hit_count > 100 ORDER  BY hit_count DESC LIMIT  20;
+-----------+-----------+--------------+
| counterid | hit_count | unique_users |
+-----------+-----------+--------------+
|    105857 |   1919075 |      1412926 |
|    117917 |    200018 |        50285 |
|     99062 |    114384 |        71408 |
|      1634 |     43839 |        14975 |
|        59 |     31328 |         6668 |
|    114157 |     28852 |        19729 |
|        62 |     22549 |        14130 |
|      1483 |      8425 |         5677 |
|        38 |      5436 |         1805 |
|      1060 |      4043 |         2948 |
|     76221 |      2060 |         1325 |
|    128858 |      1690 |          825 |
|    102847 |      1500 |          350 |
|     89761 |      1419 |          274 |
|     92040 |      1180 |          978 |
|      1089 |      1067 |          961 |
|      2004 |       880 |          698 |
|      1213 |       597 |          219 |
|     77729 |       448 |          108 |
|     71099 |       289 |           70 |
+-----------+-----------+--------------+
20 rows in set (1.50 sec)

With the rule re-enabled:

mysql> unset variable disable_nereids_rules;
--------------
unset variable disable_nereids_rules
--------------

Query OK, 0 rows affected (0.00 sec)

mysql> -- Query 1: find the 20 sites that received the most clicks from Google
mysql> SELECT counterid,
    ->        Count(*)               AS hit_count,
    ->        Count(DISTINCT userid) AS unique_users
    -> FROM   hits
    -> WHERE  ( Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) = 'GOOGLE.COM'
    ->           OR Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) =
    ->              'GOOGLE.RU'
    ->           OR Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) LIKE
    ->              '%GOOGLE%' )
    ->        AND ( Length(Regexp_extract(referer, '^https?://([^/]+)', 1)) > 3
    ->               OR Regexp_extract(referer, '^https?://([^/]+)', 1) != ''
    ->               OR Regexp_extract(referer, '^https?://([^/]+)', 1) IS NOT NULL )
    ->        AND eventdate = '2013-07-15'
    -> GROUP  BY counterid
    -> HAVING hit_count > 100
    -> ORDER  BY hit_count DESC
    -> LIMIT  20;
+-----------+-----------+--------------+
| counterid | hit_count | unique_users |
+-----------+-----------+--------------+
|    105857 |   1919075 |      1412926 |
|    117917 |    200018 |        50285 |
|     99062 |    114384 |        71408 |
|      1634 |     43839 |        14975 |
|        59 |     31328 |         6668 |
|    114157 |     28852 |        19729 |
|        62 |     22549 |        14130 |
|      1483 |      8425 |         5677 |
|        38 |      5436 |         1805 |
|      1060 |      4043 |         2948 |
|     76221 |      2060 |         1325 |
|    128858 |      1690 |          825 |
|    102847 |      1500 |          350 |
|     89761 |      1419 |          274 |
|     92040 |      1180 |          978 |
|      1089 |      1067 |          961 |
|      2004 |       880 |          698 |
|      1213 |       597 |          219 |
|     77729 |       448 |          108 |
|     71099 |       289 |           70 |
+-----------+-----------+--------------+
20 rows in set (0.57 sec)

About a 2.6x speedup (1.50 s down to 0.57 s, i.e. roughly 62% less time).

TODO

In the future, we can leverage virtual slot ref to implement more functionality, including:

ANN Index
Relevance scoring based on full-text indexes
Generated Columns (of NOT ALWAYS type)
Index-Only Scan (which will require modifying SlotRef computation in SegmentIterator to be pull-based)
The CSE replacement rule on the FE is very basic, but sufficient for now. fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownVirtualColumnsIntoOlapScanTest.java needs further modification.

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

Thearas · 2025-07-03T03:14:57Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

zhiqiang-hhhh · 2025-07-03T03:16:14Z

run buildall

hello-stephen · 2025-07-03T04:32:08Z

FE UT Coverage Report

Increment line coverage 🎉
Increment coverage report
Complete coverage report

zhiqiang-hhhh · 2025-07-03T06:25:40Z

run buildall

doris-robot · 2025-07-03T06:43:29Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	82.56% (1188/1439)
Line Coverage	67.35% (20731/30779)
Region Coverage	66.98% (10310/15393)
Branch Coverage	56.30% (5388/9570)

zhiqiang-hhhh · 2025-07-03T07:15:41Z

run buildall

doris-robot · 2025-07-03T07:28:04Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	82.56% (1188/1439)
Line Coverage	67.33% (20725/30779)
Region Coverage	67.02% (10317/15393)
Branch Coverage	56.29% (5387/9570)

hello-stephen · 2025-07-03T08:25:23Z

FE UT Coverage Report

Increment line coverage 🎉
Increment coverage report
Complete coverage report

zhiqiang-hhhh · 2025-07-03T08:31:06Z

run buildall

hello-stephen · 2025-07-03T08:45:28Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	82.56% (1188/1439)
Line Coverage	67.33% (20724/30779)
Region Coverage	66.95% (10306/15393)
Branch Coverage	56.29% (5387/9570)

zhiqiang-hhhh · 2025-07-03T08:47:26Z

run buildall

doris-robot · 2025-07-03T08:54:41Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	82.56% (1188/1439)
Line Coverage	67.41% (20749/30779)
Region Coverage	67.02% (10316/15393)
Branch Coverage	56.30% (5388/9570)

hello-stephen · 2025-07-03T09:52:09Z

FE UT Coverage Report

Increment line coverage 🎉
Increment coverage report
Complete coverage report

zhiqiang-hhhh · 2025-07-03T13:26:10Z

run buildall

zhiqiang-hhhh · 2025-07-03T13:38:10Z

run buildall

zhiqiang-hhhh · 2025-07-03T13:40:01Z

run buildall

doris-robot · 2025-07-03T13:47:54Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	82.92% (1214/1464)
Line Coverage	67.42% (20971/31107)
Region Coverage	67.15% (10439/15546)
Branch Coverage	56.51% (5463/9668)

doris-robot · 2025-07-03T15:31:01Z

TPC-H: Total hot run time: 33935 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5218a6fe4f3382e54341f8ce171bf8ab4e93b24e, data reload: false

------ Round 1 ----------------------------------
q1	17595	5242	5084	5084
q2	1969	274	177	177
q3	10433	1337	735	735
q4	10252	1031	554	554
q5	7830	2400	2355	2355
q6	178	159	129	129
q7	889	745	598	598
q8	9325	1367	1126	1126
q9	7456	5124	5180	5124
q10	6871	2389	1948	1948
q11	498	293	282	282
q12	337	349	212	212
q13	17771	3688	3060	3060
q14	234	235	218	218
q15	551	484	471	471
q16	432	420	377	377
q17	596	852	351	351
q18	7595	7212	7147	7147
q19	1225	954	584	584
q20	322	336	208	208
q21	3796	3154	2277	2277
q22	1014	996	918	918
Total cold run time: 107169 ms
Total hot run time: 33935 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5171	5101	5159	5101
q2	241	321	218	218
q3	2165	2676	2319	2319
q4	1336	1756	1311	1311
q5	4226	4645	4526	4526
q6	206	170	130	130
q7	2071	1942	1872	1872
q8	2621	2546	2561	2546
q9	7446	7320	7149	7149
q10	3176	3349	2929	2929
q11	565	501	501	501
q12	697	813	639	639
q13	3599	3939	3319	3319
q14	292	324	274	274
q15	521	472	481	472
q16	472	507	456	456
q17	1181	1575	1377	1377
q18	8292	7851	7859	7851
q19	830	875	869	869
q20	1999	1981	1964	1964
q21	4846	4380	4337	4337
q22	1033	997	1000	997
Total cold run time: 52986 ms
Total hot run time: 51157 ms

doris-robot · 2025-07-03T15:42:25Z

TPC-DS: Total hot run time: 184427 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5218a6fe4f3382e54341f8ce171bf8ab4e93b24e, data reload: false

query1	1013	387	397	387
query2	6507	1664	1671	1664
query3	6740	207	205	205
query4	26335	23771	23066	23066
query5	4364	582	439	439
query6	298	220	201	201
query7	4638	506	295	295
query8	256	215	217	215
query9	8607	2681	2685	2681
query10	472	347	265	265
query11	15448	15001	14850	14850
query12	146	104	103	103
query13	1647	531	406	406
query14	8839	5655	5756	5655
query15	201	182	193	182
query16	7191	617	489	489
query17	1179	688	560	560
query18	1983	399	290	290
query19	190	202	154	154
query20	122	119	110	110
query21	209	123	108	108
query22	4034	4233	4186	4186
query23	33820	32878	32846	32846
query24	8437	2376	2400	2376
query25	553	503	421	421
query26	891	273	154	154
query27	2746	524	351	351
query28	4335	2154	2133	2133
query29	713	569	432	432
query30	280	216	193	193
query31	906	840	774	774
query32	68	67	60	60
query33	567	369	308	308
query34	800	839	531	531
query35	800	840	720	720
query36	967	958	875	875
query37	115	100	76	76
query38	4217	4051	4085	4051
query39	1478	1384	1404	1384
query40	215	120	104	104
query41	55	54	51	51
query42	139	111	113	111
query43	478	497	461	461
query44	1352	827	822	822
query45	178	166	160	160
query46	862	1036	639	639
query47	1724	1777	1690	1690
query48	395	427	301	301
query49	699	482	412	412
query50	651	704	441	441
query51	4165	4093	4126	4093
query52	109	109	102	102
query53	236	252	189	189
query54	568	568	497	497
query55	82	80	83	80
query56	305	303	284	284
query57	1164	1162	1121	1121
query58	265	256	283	256
query59	2506	2654	2526	2526
query60	341	313	306	306
query61	125	155	117	117
query62	816	689	653	653
query63	221	190	193	190
query64	3496	1022	638	638
query65	4302	4173	4180	4173
query66	1008	404	316	316
query67	15759	15431	15357	15357
query68	5650	902	534	534
query69	505	303	273	273
query70	1196	1197	1094	1094
query71	415	323	291	291
query72	5600	4803	4551	4551
query73	629	581	357	357
query74	8902	9105	8844	8844
query75	3179	3353	2673	2673
query76	3153	1143	714	714
query77	464	437	291	291
query78	9914	10143	9341	9341
query79	1516	836	591	591
query80	597	533	450	450
query81	492	253	221	221
query82	176	127	98	98
query83	250	260	236	236
query84	253	107	89	89
query85	745	358	310	310
query86	367	302	297	297
query87	4350	4436	4273	4273
query88	2975	2316	2313	2313
query89	372	315	275	275
query90	1816	215	214	214
query91	151	156	128	128
query92	76	64	58	58
query93	1799	964	592	592
query94	702	424	321	321
query95	386	305	298	298
query96	500	588	281	281
query97	2635	2778	2639	2639
query98	238	203	199	199
query99	1324	1373	1295	1295
Total cold run time: 265655 ms
Total hot run time: 184427 ms

doris-robot · 2025-07-03T15:47:47Z

ClickBench: Total hot run time: 30.34 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5218a6fe4f3382e54341f8ce171bf8ab4e93b24e, data reload: false

query1	0.04	0.03	0.04
query2	0.12	0.06	0.06
query3	0.29	0.06	0.06
query4	1.62	0.08	0.08
query5	0.43	0.41	0.40
query6	1.15	0.66	0.66
query7	0.02	0.01	0.02
query8	0.06	0.05	0.05
query9	0.64	0.52	0.52
query10	0.59	0.57	0.57
query11	0.25	0.13	0.13
query12	0.26	0.14	0.14
query13	0.65	0.62	0.62
query14	0.81	0.84	0.83
query15	0.96	0.89	0.89
query16	0.38	0.39	0.38
query17	1.09	1.10	1.10
query18	0.24	0.23	0.24
query19	2.06	1.95	1.91
query20	0.01	0.01	0.02
query21	15.38	0.95	0.65
query22	0.94	1.05	0.83
query23	14.70	1.56	0.75
query24	5.01	0.60	0.30
query25	0.17	0.09	0.09
query26	0.56	0.23	0.19
query27	0.09	0.09	0.09
query28	11.12	1.20	0.58
query29	12.57	4.05	3.38
query30	0.28	0.08	0.06
query31	2.85	0.66	0.44
query32	3.23	0.60	0.50
query33	3.19	3.09	3.09
query34	16.90	5.41	4.71
query35	4.75	4.84	4.83
query36	0.65	0.52	0.49
query37	0.19	0.18	0.17
query38	0.18	0.16	0.15
query39	0.05	0.04	0.05
query40	0.18	0.16	0.19
query41	0.10	0.06	0.06
query42	0.07	0.05	0.06
query43	0.06	0.05	0.05
Total cold run time: 104.89 s
Total hot run time: 30.34 s

doris-robot · 2025-07-03T16:48:11Z

BE UT Coverage Report

Increment line coverage 27.09% (146/539) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	57.11% (15408/26981)
Line Coverage	46.14% (139759/302912)
Region Coverage	45.44% (70829/155870)
Branch Coverage	40.22% (37367/92910)

be/src/olap/rowset/segment_v2/virtual_column_iterator.cpp

gensrc/thrift/Exprs.thrift

be/src/common/consts.h

zhiqiang-hhhh · 2025-07-07T14:38:02Z

run buildall

doris-robot · 2025-07-07T14:54:01Z

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	83.02% (1222/1472)
Line Coverage	67.52% (21170/31355)
Region Coverage	67.26% (10542/15673)
Branch Coverage	56.61% (5547/9798)

doris-robot · 2025-07-07T17:59:00Z

BE UT Coverage Report

Increment line coverage 29.32% (163/556) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	57.28% (15509/27074)
Line Coverage	46.26% (140988/304749)
Region Coverage	45.51% (71277/156631)
Branch Coverage	40.24% (37557/93332)

doris-robot · 2025-07-25T19:57:25Z

ClickBench: Total hot run time: 32.96 s

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit af5fc37ffc558c01d31ea8935f747c53d1ed8220, data reload: false

query1	0.05	0.04	0.04
query2	0.08	0.04	0.05
query3	0.25	0.07	0.08
query4	1.61	0.11	0.11
query5	0.43	0.45	0.43
query6	1.17	0.70	0.69
query7	0.02	0.02	0.01
query8	0.05	0.03	0.03
query9	0.55	0.48	0.47
query10	0.54	0.53	0.53
query11	0.15	0.10	0.11
query12	0.15	0.11	0.12
query13	0.64	0.65	0.64
query14	0.93	1.26	1.01
query15	0.92	0.91	0.92
query16	0.39	0.40	0.39
query17	1.10	1.11	1.08
query18	0.22	0.20	0.21
query19	2.00	1.93	1.85
query20	0.02	0.01	0.01
query21	15.37	0.86	0.55
query22	0.78	1.08	0.82
query23	14.83	1.14	0.61
query24	6.46	2.04	0.53
query25	0.51	0.13	0.18
query26	0.68	0.16	0.13
query27	0.06	0.06	0.06
query28	9.37	0.84	0.44
query29	12.60	3.83	3.34
query30	3.02	2.96	2.93
query31	2.81	0.56	0.39
query32	3.24	0.56	0.49
query33	3.05	3.22	3.27
query34	15.99	5.33	4.90
query35	4.87	5.04	4.87
query36	0.69	0.51	0.50
query37	0.10	0.08	0.07
query38	0.05	0.06	0.04
query39	0.04	0.02	0.03
query40	0.17	0.14	0.13
query41	0.07	0.02	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.04
Total cold run time: 106.11 s
Total hot run time: 32.96 s

doris-robot · 2025-07-25T21:07:39Z

BE UT Coverage Report

Increment line coverage 40.13% (256/638) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	57.58% (15964/27724)
Line Coverage	46.34% (143555/309804)
Region Coverage	35.72% (108162/302775)
Branch Coverage	38.30% (47768/124727)

hello-stephen · 2025-07-25T21:51:00Z

BE Regression && UT Coverage Report

Increment line coverage 81.70% (518/634) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	81.09% (22069/27214)
Line Coverage	73.70% (228016/309376)
Region Coverage	61.36% (190548/310523)
Branch Coverage	65.10% (82110/126135)

HappenLee

LGTM

github-actions · 2025-07-28T02:00:08Z

PR approved by at least one committer and no changes requested.

airborne12

LGTM

Co-authored-by: morrySnow <zhangwenxin@selectdb.com>

**TL;DR:** Introduce virtual slot ref to eliminate redundant computation of common sub-expressions Consider the following queries: ```sql select funcC(funcA(colA)), funcB(funcA(colA)) from table; ``` ```sql select funcA(colA) as sub from table where funcB(funcA(colA)) > 0; ``` ```sql select l2_distance(colA, [10]) as distance from table where l2_distance(colA, [10]) > 0 ``` The common characteristic of these SQL statements is that certain expressions appear multiple times in different places—whether in the projection, in predicates, or during index computation (e.g., for ANN index in Q3). These identical repeated expressions are currently computed multiple times, but they could actually be computed just once. We introduce **virtual slot ref** to address this issue. In the storage layer, we implement a `VirtualColumnIterator`. The behavior of `VirtualColumnIterator` is identical to other `ColumnIterator`s, except it is not used to read any physical column. Instead, it is dedicated to reading the result of expressions computed from the index (for example, the distance returned by an ANN index, or in the future, the relevance score from a full-text index). Once an expression result is computed via the index, we use `VirtualColumnIterator::prepare_materialization` to store the data source. If a segment does not have the corresponding index, the data source of the `VirtualColumnIterator` will be a special `ColumnNothing` type (this is an important design trick that allows virtual slot ref to elegantly handle the case where a segment does not yet have the index built). We also modify `SegmentIterator`. Before processing each block, we first initialize the positions of the virtual slot ref in the block as `ColumnNothing`. 
Before actually returning a block, we check whether the virtual slot ref have been materialized; if not, we execute the expressions corresponding to the virtual slot ref (e.g., `l2_distance` or a `score` function) to generate the actual virtual slot ref, ensuring that every virtual column in the block returned to the computation layer has been materialized. For expression evaluation, we introduce `VirtualSlotRef`, which is essentially `SlotRef` + `FunctionCall`. When the expression tree executes a node of this type, it automatically checks whether the corresponding expression has been materialized: if it has, `VirtualSlotRef` behaves like a `SlotRef`; if it hasn’t, it behaves like a `FunctionCall`. Here’s an example to better illustrate the execution of virtual slot ref: ```sql select func(colA) from table where func(colA) > 0; ``` For this SQL, our current `ScanNode` is: ``` ScanNode { predicates: func(colA[#0]) > 0 final projection: func(colA[#0]) final projection tuple id: 1 tuple_id: 0 } TupleDesc[id=0]{ SlotDesc{id=0, col=colA)} } TupleDesc[id=1] { SlotDesc{id=1, col=null, ..., type=float64) } ``` After this pr, our plan will become: ``` ScanNode { predicate: function(colA)[#1] > 0 final projection: function(colA)[#1] final projection tuple id: 1 tuple_id: 0 } TupleDesc[id=0]{ SlotDesc{id=0, col="colA")}, SlotDesc{id=1, col=virtual_column_1, expr=function1(colA[#0])} } TupleDesc[id=1] { SlotDesc{id=2, name=virtual_column_1[#1]) } ``` Note that we added a `VirtualSlot` in Tuple 0, and other places that originally required computing the expression are transformed to reference this `VirtualSlot`. In this way, redundant computation of common expressions is eliminated. 
Disable the plan rule:

```sql
mysql> set disable_nereids_rules='PUSH_DOWN_VIRTUAL_COLUMNS_INTO_OLAP_SCAN';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT counterid,
    ->        Count(*) AS hit_count,
    ->        Count(DISTINCT userid) AS unique_users
    -> FROM hits
    -> WHERE ( Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) = 'GOOGLE.COM'
    ->         OR Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) = 'GOOGLE.RU'
    ->         OR Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) LIKE '%GOOGLE%' )
    ->   AND ( Length(Regexp_extract(referer, '^https?://([^/]+)', 1)) > 3
    ->         OR Regexp_extract(referer, '^https?://([^/]+)', 1) != ''
    ->         OR Regexp_extract(referer, '^https?://([^/]+)', 1) IS NOT NULL )
    ->   AND eventdate = '2013-07-15'
    -> GROUP BY counterid
    -> HAVING hit_count > 100
    -> ORDER BY hit_count DESC
    -> LIMIT 20;
+-----------+-----------+--------------+
| counterid | hit_count | unique_users |
+-----------+-----------+--------------+
|    105857 |   1919075 |      1412926 |
|    117917 |    200018 |        50285 |
|     99062 |    114384 |        71408 |
|      1634 |     43839 |        14975 |
|        59 |     31328 |         6668 |
|    114157 |     28852 |        19729 |
|        62 |     22549 |        14130 |
|      1483 |      8425 |         5677 |
|        38 |      5436 |         1805 |
|      1060 |      4043 |         2948 |
|     76221 |      2060 |         1325 |
|    128858 |      1690 |          825 |
|    102847 |      1500 |          350 |
|     89761 |      1419 |          274 |
|     92040 |      1180 |          978 |
|      1089 |      1067 |          961 |
|      2004 |       880 |          698 |
|      1213 |       597 |          219 |
|     77729 |       448 |          108 |
|     71099 |       289 |           70 |
+-----------+-----------+--------------+
20 rows in set (1.50 sec)
```

Re-enable the rule:

```text
mysql> unset variable disable_nereids_rules;
--------------
unset variable disable_nereids_rules
--------------

Query OK, 0 rows affected (0.00 sec)

mysql> -- Query 1: analyze the 20 websites that received the most clicks from Google
mysql> SELECT counterid,
    ->        Count(*) AS hit_count,
    ->        Count(DISTINCT userid) AS unique_users
    -> FROM hits
    -> WHERE ( Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) = 'GOOGLE.COM'
    ->         OR Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) =
    ->            'GOOGLE.RU'
    ->         OR Upper(Regexp_extract(referer, '^https?://([^/]+)', 1)) LIKE
    ->            '%GOOGLE%' )
    ->   AND ( Length(Regexp_extract(referer, '^https?://([^/]+)', 1)) > 3
    ->         OR Regexp_extract(referer, '^https?://([^/]+)', 1) != ''
    ->         OR Regexp_extract(referer, '^https?://([^/]+)', 1) IS NOT NULL )
    ->   AND eventdate = '2013-07-15'
    -> GROUP BY counterid
    -> HAVING hit_count > 100
    -> ORDER BY hit_count DESC
    -> LIMIT 20;
+-----------+-----------+--------------+
| counterid | hit_count | unique_users |
+-----------+-----------+--------------+
|    105857 |   1919075 |      1412926 |
|    117917 |    200018 |        50285 |
|     99062 |    114384 |        71408 |
|      1634 |     43839 |        14975 |
|        59 |     31328 |         6668 |
|    114157 |     28852 |        19729 |
|        62 |     22549 |        14130 |
|      1483 |      8425 |         5677 |
|        38 |      5436 |         1805 |
|      1060 |      4043 |         2948 |
|     76221 |      2060 |         1325 |
|    128858 |      1690 |          825 |
|    102847 |      1500 |          350 |
|     89761 |      1419 |          274 |
|     92040 |      1180 |          978 |
|      1089 |      1067 |          961 |
|      2004 |       880 |          698 |
|      1213 |       597 |          219 |
|     77729 |       448 |          108 |
|     71099 |       289 |           70 |
+-----------+-----------+--------------+
20 rows in set (0.57 sec)
```

Both runs return the same rows; with the rule enabled the query takes 0.57 s instead of 1.50 s, roughly a 2.6x speedup.

In the future, we can leverage virtual slot ref to implement more functionality, including:

1. ANN Index
2. Relevance scoring based on full-text indexes
3. Generated Columns (of NOT ALWAYS type)
4. Index-Only Scan (which will require modifying SlotRef computation in `SegmentIterator` to be pull-based)
5. The CSE replace rule on FE is very basic, but sufficient for now. Further modification is needed in fe/fe-core/src/test/java/org/apache/doris/nereids/rules/rewrite/PushDownVirtualColumnsIntoOlapScanTest.java

---

Co-authored-by: morrySnow <zhangwenxin@selectdb.com>
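The six predicates in the benchmark query all share one `Regexp_extract(...)` call, which is the kind of repetition the rewrite rule targets. Below is a minimal sketch of how such repeats could be detected, assuming expressions are modeled as nested tuples. The helper names are hypothetical and this is not the Nereids implementation.

```python
from collections import Counter

# Expressions are nested tuples: (function_name, arg1, arg2, ...).
# Plain strings stand for column names or literals.


def count_subexpressions(expr, counts):
    """Count every function-call subexpression inside `expr`, recursively."""
    if isinstance(expr, tuple):
        counts[expr] += 1
        for arg in expr[1:]:
            count_subexpressions(arg, counts)


def find_common_subexpressions(exprs):
    """Return the subexpressions that occur at least twice across `exprs`."""
    counts = Counter()
    for expr in exprs:
        count_subexpressions(expr, counts)
    return {sub: n for sub, n in counts.items() if n >= 2}


# The WHERE clause of the benchmark query, written as tuples.
extract = ("regexp_extract", "referer", "'^https?://([^/]+)'", "1")
predicates = [
    ("=", ("upper", extract), "'GOOGLE.COM'"),
    ("=", ("upper", extract), "'GOOGLE.RU'"),
    ("like", ("upper", extract), "'%GOOGLE%'"),
    (">", ("length", extract), "3"),
    ("!=", extract, "''"),
    ("is_not_null", extract),
]

common = find_common_subexpressions(predicates)
for sub, n in common.items():
    print(sub[0], "occurs", n, "times")
```

In this model, each entry of `common` is a candidate to be computed once and exposed to the remaining expressions as a virtual slot.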

#54223)

### What problem does this PR solve?

Related PR: #52701

TimeV2 is a runtime type, so it cannot be used as a VirtualSlotRef.

### Release note

None

A tablet schema that contains a virtual column should not be added to SchemaCache. Related PR: #52701

### What problem does this PR solve?

Introduces the ANN index to Doris.

This pull request introduces foundational support for ANN (Approximate Nearest Neighbor) vector index functionality in the storage engine, including new runtime structures, configuration options, and initial integration with the build system. The changes lay the groundwork for ANN-based search and statistics collection, and begin integrating ANN index support into various storage and query execution paths.

The ANN index implementation is based on [faiss](https://github.com/facebookresearch/faiss). Faiss can return distances directly, so this PR uses [virtual slot ref](#52701) to return results from the index.

Each data segment in Doris will have a faiss index if the user creates a table with an ANN index, and new segments generated by compaction will get a faiss index automatically.

Currently, create index and build index are not supported; the index definition must be added to the DDL if you want one.

**ANN Index Feature Integration:**

* Added new runtime structures and parameters for ANN index operations, including `AnnIndexStats`, `AnnIndexParam`, `RangeSearchParams`, `RangeSearchResult`, and others in `ann_search_params.h`, as well as `RangeSearchRuntimeInfo` for managing ANN range search context.
* Extended `StorageReadOptions` and `RowsetReaderContext` to include `ann_topn_runtime` for passing ANN runtime information through the storage read path.
* Added new ANN-related statistics fields (timing and row counts) to `OlapReaderStatistics` for monitoring ANN index operations.

**Build System and Dependency Updates:**

* Added `doris-faiss` and `doris-openblas` as submodules for ANN/vector index support, and integrated the new `Vector` library into the build process and as a dependency for relevant targets.

**Index Handling and Schema Integration:**

* Updated index file writer accessors and naming from "inverted_index" to the more generic "index" to accommodate ANN and other index types.
* Changed index creation logic in `SegmentFlusher` to use `has_extra_index()` (supporting both inverted and ANN indexes) instead of `has_inverted_index()`.

**Configuration:**

* Introduced a new configuration option `opm_threads_limit` to control the maximum number of OpenMP threads used per Doris thread, which is relevant for vectorized/ANN computation.

These changes set up the infrastructure required for future development of ANN vector index features, including search, filtering, and statistics collection.

Co-authored-by: chenlinzhong <490103404@qq.com>
Co-authored-by: morrySnow <zhangwenxin@selectdb.com>

…55323) Some expressions, such as the function `element_at`, cannot handle an empty block, so the virtual column is materialized in advance to avoid errors. Related PR: #52701

…54998)

### What problem does this PR solve?

Related PR: #52701

Problem Summary:

1. Do not push down a WhenClause in CASE WHEN.
2. Do not generate a virtual column that is used only once. This fixes patterns like `select func_a(x), func_b(func_a(x)), func_c(func_b(func_a(x)))`.

### What problem does this PR solve?

The nullability of the expression behind a virtual slot needs to be adjusted. This bug was introduced by #52701.

Example SQL:

```sql
SELECT t1.*, t2.*
FROM tbl_adjust_virtual_slot_nullable_1 AS t1
LEFT JOIN tbl_adjust_virtual_slot_nullable_2 AS t2
  ON t1.c_int = t2.c_int
WHERE NOT (
    day(t2.c_date) IN (1, 3)
    AND day(t2.c_date) IN (2, 3, 3)
);
```

It throws the exception:

```
java.sql.SQLException: errCode = 2, detailMessage = (127.0.0.1)[INTERNAL_ERROR]Could not find function dayofmonth, arg c_date return Nullable(TINYINT)
```
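One way to read this failure: the expression pushed down to the scan apparently kept the nullability it had above the LEFT JOIN, where `t2.c_date` is nullable, while the scan itself reads `c_date` with a different nullability. Below is a minimal sketch of recomputing nullability bottom-up from the scan's own slots, assuming a function result is nullable exactly when one of its arguments is. All class and function names are hypothetical, not Doris APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    name: str
    nullable: bool


@dataclass(frozen=True)
class Call:
    func: str
    args: tuple
    nullable: bool  # nullability recorded on the plan node


def adjust_nullable(expr, scan_slots):
    """Rebuild `expr` so that nullability is derived from the scan's slots."""
    if isinstance(expr, Slot):
        # Use the slot exactly as the scan defines it.
        return scan_slots[expr.name]
    new_args = tuple(adjust_nullable(arg, scan_slots) for arg in expr.args)
    # Assumption: the result is nullable iff any argument is nullable.
    return Call(expr.func, new_args, any(arg.nullable for arg in new_args))


# Above the LEFT JOIN, t2.c_date is nullable, so day(c_date) is too.
above_join = Call("day", (Slot("c_date", True),), True)
# Hypothetical scan schema in which c_date is declared NOT NULL.
scan_slots = {"c_date": Slot("c_date", False)}

adjusted = adjust_nullable(above_join, scan_slots)
print(adjusted.nullable)
```

After the adjustment, the function signature requested at the scan matches the column type that the scan actually produces.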

### What problem does this PR solve?

Related: #52701

1. Functions that process complex types are too complicated and can cause many unexpected problems, e.g. returning `array<null_type>`, so they are not processed through virtual slots.
2. Lambda functions cannot be processed by virtual columns.

Common sub-expression removal therefore stops when either of these cases is encountered.
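The two exclusions above amount to an eligibility check that runs before an expression is turned into a virtual column. Below is a minimal sketch, assuming expressions are nested tuples, lambdas are marked with a `"lambda"` head, and a `types` dictionary maps columns or subexpressions to type names. All of these conventions are hypothetical and only illustrate the idea.

```python
COMPLEX_TYPE_PREFIXES = ("array", "map", "struct")


def is_complex_type(type_name):
    """True for type names such as 'array<int>' or 'map<string,int>'."""
    return type_name.lower().startswith(COMPLEX_TYPE_PREFIXES)


def touches_complex_type(expr, types):
    """True if `expr` or anything nested inside it has a complex type."""
    if is_complex_type(types.get(expr, "")):
        return True
    if isinstance(expr, tuple):
        return any(touches_complex_type(arg, types) for arg in expr[1:])
    return False


def contains_lambda(expr):
    """True if a lambda appears anywhere inside `expr`."""
    if not isinstance(expr, tuple):
        return False
    if expr[0] == "lambda":
        return True
    return any(contains_lambda(arg) for arg in expr[1:])


def can_become_virtual_column(expr, types):
    """Reject expressions that involve complex types or lambdas."""
    return not touches_complex_type(expr, types) and not contains_lambda(expr)


types = {"name": "varchar", "tags": "array<varchar>"}
print(can_become_virtual_column(("upper", "name"), types))
print(can_become_virtual_column(("element_at", "tags", "1"), types))
```

Checking the whole subtree rather than only the return type is a conservative choice: it also rejects a scalar-returning function whose input is an array.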

### What problem does this PR solve? Related PR: #52701 1. Do not optimize grouping scalar function. 2. Fix rule type of ann topn push down.

zhiqiang-hhhh marked this pull request as ready for review July 3, 2025 13:28

zhiqiang-hhhh changed the title ~~[feat] Virtual Column~~ [feat] Virtual Slot Ref Jul 3, 2025

yiguolei reviewed Jul 4, 2025

be/src/olap/rowset/segment_v2/virtual_column_iterator.cpp

yiguolei reviewed Jul 4, 2025

gensrc/thrift/Exprs.thrift (outdated)

yiguolei reviewed Jul 4, 2025

be/src/common/consts.h

zhiqiang-hhhh force-pushed the feat-virtual-column branch from e61396d to 25ebe62 Compare July 7, 2025 14:36

HappenLee approved these changes Jul 28, 2025

github-actions bot added the approved Indicates a PR has been approved by one committer. label Jul 28, 2025

airborne12 approved these changes Jul 28, 2025

airborne12 merged commit 08a9dbb into apache:master Jul 28, 2025
27 of 28 checks passed

zhiqiang-hhhh deleted the feat-virtual-column branch July 28, 2025 02:40

zhiqiang-hhhh mentioned this pull request Aug 1, 2025

[fix](virtual slot) TimeV2 type should not be used as virtual slot ref #54223

Merged

16 tasks

This was referenced Aug 5, 2025

[fix](virtual slot) Fix tablet schema cache of virtual slot #54326

Merged

[feat](index) Ann Index #54276

Merged

airborne12 pushed a commit that referenced this pull request Aug 18, 2025

[fix](virtual slot) Fix tablet schema cache of virtual slot (#54326)

a20d8a6

A tablet schema that contains a virtual column should not be added to SchemaCache. Related PR: #52701

zhiqiang-hhhh mentioned this pull request Aug 26, 2025

[fix](virtual slot) Fix some expression can not process empty block #55323

Merged

16 tasks

morrySnow mentioned this pull request Aug 27, 2025

[fix](virtual column) should not push down WhenClause in case when #54998

Merged

16 tasks

zhiqiang-hhhh mentioned this pull request Aug 28, 2025

[fix](virtual slot) Shrink char column before projection #55435

Merged

16 tasks

yujun777 mentioned this pull request Sep 5, 2025

[fix](virtual slot) adjust virtual column expression nullable #55694

Merged

16 tasks

zhiqiang-hhhh mentioned this pull request Sep 10, 2025

[fix](virtual slot) Fix complex type and lambda function #55869

Merged

16 tasks

zhiqiang-hhhh mentioned this pull request Sep 17, 2025

[fix](virtual slot) Fix grouping scalar function #56127

Merged

16 tasks

englefly pushed a commit that referenced this pull request Sep 17, 2025

[fix](virtual slot) Fix grouping scalar function (#56127)

f05cf95

### What problem does this PR solve? Related PR: #52701 1. Do not optimize grouping scalar function. 2. Fix rule type of ann topn push down.
