@englefly
Contributor

What problem does this PR solve?

Add `planWithUnknownColumnStats` to `QueryState` so that queries issued by analysis tasks do not pollute the column stats cache.
Issue Number: close #xxx

Related PR: #xxx

Problem Summary:
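
A minimal sketch of the idea, not the actual Doris code. Only the flag name `planWithUnknownColumnStats` comes from this PR; the classes `QueryStateSketch` and `ColumnStatsCacheSketch`, their methods, and the placeholder value are hypothetical. The sketch assumes the planner checks the flag before touching the cache, so a query issued by an analysis task sees unknown stats and leaves the cache unchanged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the per-query state; only the flag name is from the PR.
final class QueryStateSketch {
    private boolean planWithUnknownColumnStats = false;

    boolean isPlanWithUnknownColumnStats() {
        return planWithUnknownColumnStats;
    }

    void setPlanWithUnknownColumnStats(boolean value) {
        this.planWithUnknownColumnStats = value;
    }
}

// Hypothetical cache: column name -> row-count estimate, loaded on first access.
final class ColumnStatsCacheSketch {
    private final Map<String, Long> cache = new HashMap<>();

    Optional<Long> lookup(QueryStateSketch state, String column) {
        if (state.isPlanWithUnknownColumnStats()) {
            // Analysis-task query: report "unknown" and do not load into the cache.
            return Optional.empty();
        }
        return Optional.of(cache.computeIfAbsent(column, this::loadFromStorage));
    }

    int size() {
        return cache.size();
    }

    private long loadFromStorage(String column) {
        return 1000L; // placeholder value, for illustration only
    }
}

final class StatsCacheDemo {
    public static void main(String[] args) {
        ColumnStatsCacheSketch cache = new ColumnStatsCacheSketch();

        QueryStateSketch analysisQuery = new QueryStateSketch();
        analysisQuery.setPlanWithUnknownColumnStats(true);
        System.out.println(cache.lookup(analysisQuery, "c1").isPresent()); // false
        System.out.println(cache.size()); // 0

        QueryStateSketch userQuery = new QueryStateSketch();
        System.out.println(cache.lookup(userQuery, "c1").isPresent()); // true
        System.out.println(cache.size()); // 1
    }
}
```

In this sketch the flag lives on the query state rather than on the cache, so each query decides independently whether it may populate shared stats.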

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34365 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c3e38cfa020d4689d7f50565a922b01df106a226, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17646	5331	5089	5089
q2	2025	324	223	223
q3	10295	1279	727	727
q4	10278	866	371	371
q5	8539	2457	2336	2336
q6	208	168	135	135
q7	943	768	646	646
q8	9339	1312	1066	1066
q9	7475	5127	5131	5127
q10	6891	2213	1816	1816
q11	509	314	300	300
q12	373	374	236	236
q13	17796	3726	3046	3046
q14	236	241	211	211
q15	579	509	507	507
q16	1033	997	955	955
q17	600	890	363	363
q18	7753	7229	7129	7129
q19	1456	958	573	573
q20	345	338	232	232
q21	3805	3289	2315	2315
q22	1077	1034	962	962
Total cold run time: 109201 ms
Total hot run time: 34365 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5299	5137	5194	5137
q2	266	326	233	233
q3	2177	2669	2341	2341
q4	1379	1742	1372	1372
q5	4371	4501	4476	4476
q6	213	173	132	132
q7	2076	2026	1815	1815
q8	2677	2531	2606	2531
q9	7484	7315	7294	7294
q10	3098	3219	2826	2826
q11	592	545	503	503
q12	680	990	643	643
q13	3469	3943	3511	3511
q14	285	304	282	282
q15	556	512	483	483
q16	1031	1123	1091	1091
q17	1229	1650	1363	1363
q18	7825	7759	7487	7487
q19	815	812	837	812
q20	2015	2118	1906	1906
q21	4831	4438	4373	4373
q22	1092	1039	1004	1004
Total cold run time: 53460 ms
Total hot run time: 51615 ms
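
The bot does not label the columns, so the following is an inference from the numbers: each row appears to hold one cold run, two hot runs, and the faster of the two hot runs, with the totals summing the first and last columns. A small sketch under that assumption, using the first three Round 1 rows; the class and method names are hypothetical.

```java
final class BenchTotalsSketch {
    // Each row is {cold, hot1, hot2} in milliseconds (assumed column meaning).
    static long totalCold(long[][] rows) {
        long sum = 0;
        for (long[] row : rows) {
            sum += row[0];
        }
        return sum;
    }

    // The hot total sums the faster of the two hot runs for every query.
    static long totalHot(long[][] rows) {
        long sum = 0;
        for (long[] row : rows) {
            sum += Math.min(row[1], row[2]);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[][] rows = {
            {17646, 5331, 5089}, // q1
            {2025, 324, 223},    // q2
            {10295, 1279, 727},  // q3
        };
        System.out.println(totalCold(rows)); // 29966
        System.out.println(totalHot(rows));  // 6039
    }
}
```

Applied to all 22 Round 1 rows, the same sums give the reported 109201 ms cold and 34365 ms hot totals.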

@doris-robot

TPC-DS: Total hot run time: 187715 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c3e38cfa020d4689d7f50565a922b01df106a226, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1054	405	405	405
query2	6607	1706	1714	1706
query3	6750	225	228	225
query4	26569	23281	23554	23281
query5	4402	630	447	447
query6	329	239	216	216
query7	4646	489	302	302
query8	306	267	238	238
query9	8707	2580	2569	2569
query10	498	336	297	297
query11	15820	15421	14776	14776
query12	173	120	112	112
query13	1682	563	453	453
query14	10818	9303	9241	9241
query15	201	186	171	171
query16	7271	671	506	506
query17	1228	749	613	613
query18	1977	408	311	311
query19	208	199	189	189
query20	130	125	125	125
query21	215	129	112	112
query22	4028	4156	4024	4024
query23	33801	33341	33153	33153
query24	8457	2416	2396	2396
query25	598	553	484	484
query26	1235	273	165	165
query27	2756	499	355	355
query28	4439	2225	2169	2169
query29	871	647	517	517
query30	308	226	202	202
query31	939	824	748	748
query32	96	81	78	78
query33	607	386	336	336
query34	792	840	524	524
query35	812	839	742	742
query36	961	990	922	922
query37	118	114	89	89
query38	3568	3493	3454	3454
query39	1484	1447	1417	1417
query40	239	138	125	125
query41	71	64	65	64
query42	127	119	117	117
query43	511	497	483	483
query44	1279	742	754	742
query45	192	191	186	186
query46	876	988	635	635
query47	1788	1805	1700	1700
query48	409	422	324	324
query49	829	520	418	418
query50	638	668	407	407
query51	3940	3896	3978	3896
query52	110	115	108	108
query53	241	270	199	199
query54	314	293	279	279
query55	88	91	85	85
query56	329	313	308	308
query57	1159	1163	1113	1113
query58	284	269	273	269
query59	2601	2580	2502	2502
query60	328	351	348	348
query61	152	155	151	151
query62	812	745	664	664
query63	236	197	201	197
query64	4489	1203	852	852
query65	4026	3946	3956	3946
query66	1166	442	336	336
query67	15253	15104	14741	14741
query68	8180	919	597	597
query69	488	321	294	294
query70	1463	1250	1231	1231
query71	449	348	317	317
query72	5806	4906	4931	4906
query73	668	599	364	364
query74	8862	9040	8939	8939
query75	3568	3335	2815	2815
query76	3476	1182	743	743
query77	762	398	312	312
query78	9623	9849	8921	8921
query79	1426	799	587	587
query80	654	577	492	492
query81	531	266	228	228
query82	236	162	132	132
query83	271	265	266	265
query84	249	120	98	98
query85	885	482	435	435
query86	374	320	282	282
query87	3698	3719	3594	3594
query88	2819	2287	2262	2262
query89	391	327	290	290
query90	2028	214	212	212
query91	178	163	137	137
query92	79	70	65	65
query93	1326	985	640	640
query94	708	461	347	347
query95	409	319	304	304
query96	485	562	283	283
query97	2900	2952	2849	2849
query98	230	213	218	213
query99	1305	1403	1296	1296
Total cold run time: 272475 ms
Total hot run time: 187715 ms

@doris-robot

ClickBench: Total hot run time: 27.52 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c3e38cfa020d4689d7f50565a922b01df106a226, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.05	0.05	0.05
query2	0.10	0.05	0.05
query3	0.26	0.08	0.08
query4	1.61	0.11	0.12
query5	0.26	0.26	0.26
query6	1.19	0.64	0.66
query7	0.03	0.02	0.03
query8	0.05	0.04	0.04
query9	0.60	0.53	0.52
query10	0.58	0.58	0.57
query11	0.16	0.12	0.12
query12	0.16	0.11	0.12
query13	0.62	0.59	0.61
query14	1.00	0.98	1.00
query15	0.85	0.84	0.82
query16	0.38	0.39	0.41
query17	1.00	1.01	1.03
query18	0.21	0.20	0.20
query19	1.95	1.84	1.79
query20	0.02	0.01	0.01
query21	15.44	0.19	0.13
query22	5.06	0.07	0.04
query23	15.67	0.26	0.09
query24	2.91	0.56	0.45
query25	0.08	0.06	0.06
query26	0.14	0.15	0.13
query27	0.05	0.06	0.05
query28	4.07	1.13	0.93
query29	12.67	3.94	3.38
query30	0.29	0.14	0.13
query31	2.81	0.59	0.38
query32	3.23	0.55	0.46
query33	3.00	3.01	2.99
query34	15.85	5.11	4.56
query35	4.57	4.53	4.61
query36	0.68	0.50	0.49
query37	0.10	0.07	0.07
query38	0.06	0.04	0.04
query39	0.03	0.03	0.03
query40	0.18	0.14	0.15
query41	0.09	0.03	0.04
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 98.14 s
Total hot run time: 27.52 s

@englefly
Contributor Author

run cloud_p0

@englefly
Contributor Author

run nonConcurrent

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (7/7) 🎉
Increment coverage report
Complete coverage report

@englefly
Contributor Author

run cloud_p0

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (7/7) 🎉
Increment coverage report
Complete coverage report

@github-actions bot added the `approved` label (Indicates a PR has been approved by one committer.) on Nov 13, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@englefly
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34385 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c3e38cfa020d4689d7f50565a922b01df106a226, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17606	5228	5007	5007
q2	2044	312	204	204
q3	10277	1310	756	756
q4	10245	967	380	380
q5	7556	2324	2441	2324
q6	216	169	135	135
q7	909	798	625	625
q8	9354	1355	1112	1112
q9	7122	5188	5221	5188
q10	6911	2240	1808	1808
q11	492	313	283	283
q12	366	370	239	239
q13	17825	3622	3033	3033
q14	236	228	215	215
q15	581	507	510	507
q16	1054	1023	942	942
q17	608	878	385	385
q18	7628	7073	7098	7073
q19	1357	943	562	562
q20	364	356	239	239
q21	3868	3239	2396	2396
q22	1061	1044	972	972
Total cold run time: 107680 ms
Total hot run time: 34385 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5230	5126	5133	5126
q2	255	329	229	229
q3	2194	2706	2331	2331
q4	1373	1754	1335	1335
q5	4180	4502	4528	4502
q6	214	176	134	134
q7	2061	1941	1900	1900
q8	2712	2615	2559	2559
q9	7325	7322	7374	7322
q10	3023	3284	2877	2877
q11	594	530	521	521
q12	686	796	634	634
q13	3494	3964	3230	3230
q14	301	325	288	288
q15	547	503	504	503
q16	1057	1150	1062	1062
q17	1274	1520	1418	1418
q18	7861	7753	7434	7434
q19	876	803	916	803
q20	1994	2058	1817	1817
q21	4724	4351	4263	4263
q22	1080	1045	994	994
Total cold run time: 53055 ms
Total hot run time: 51282 ms

@doris-robot

TPC-DS: Total hot run time: 188224 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c3e38cfa020d4689d7f50565a922b01df106a226, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1038	400	385	385
query2	6565	1692	1700	1692
query3	6794	233	226	226
query4	26300	23680	23491	23491
query5	4875	663	500	500
query6	338	249	232	232
query7	4653	500	302	302
query8	312	269	263	263
query9	8711	2628	2645	2628
query10	494	350	285	285
query11	15686	14990	14862	14862
query12	192	121	110	110
query13	1678	564	438	438
query14	11338	9182	9194	9182
query15	196	184	174	174
query16	7670	683	523	523
query17	1185	730	600	600
query18	2034	422	352	352
query19	209	205	180	180
query20	129	127	120	120
query21	217	131	113	113
query22	3990	4109	4018	4018
query23	33799	33091	33178	33091
query24	8451	2400	2422	2400
query25	585	524	442	442
query26	1234	269	159	159
query27	2753	504	372	372
query28	4388	2221	2212	2212
query29	796	611	488	488
query30	308	223	201	201
query31	935	813	738	738
query32	89	72	81	72
query33	592	382	341	341
query34	801	847	516	516
query35	802	833	761	761
query36	978	987	920	920
query37	127	112	83	83
query38	3499	3557	3491	3491
query39	1446	1599	1400	1400
query40	220	130	120	120
query41	64	60	63	60
query42	129	112	113	112
query43	501	497	478	478
query44	1280	769	749	749
query45	190	184	175	175
query46	897	1011	658	658
query47	1783	1810	1702	1702
query48	392	430	342	342
query49	770	527	417	417
query50	645	677	400	400
query51	3889	3875	4111	3875
query52	107	112	101	101
query53	244	271	196	196
query54	312	301	280	280
query55	90	87	90	87
query56	317	310	313	310
query57	1161	1192	1138	1138
query58	310	294	289	289
query59	2520	2642	2595	2595
query60	359	358	369	358
query61	197	196	193	193
query62	836	726	665	665
query63	233	201	207	201
query64	4557	1301	990	990
query65	4005	3944	3989	3944
query66	1100	451	359	359
query67	15246	15222	14916	14916
query68	8174	962	607	607
query69	507	336	309	309
query70	1393	1213	1286	1213
query71	455	351	322	322
query72	6090	4887	4888	4887
query73	658	620	370	370
query74	9186	8976	8644	8644
query75	3660	3322	2781	2781
query76	3461	1152	769	769
query77	780	440	336	336
query78	9483	9709	8829	8829
query79	2506	843	612	612
query80	754	558	511	511
query81	501	266	236	236
query82	454	163	142	142
query83	266	260	252	252
query84	259	116	92	92
query85	926	475	435	435
query86	386	299	312	299
query87	3711	3673	3599	3599
query88	3719	2250	2251	2250
query89	386	327	290	290
query90	1924	222	235	222
query91	169	161	135	135
query92	87	69	66	66
query93	2130	975	650	650
query94	744	445	352	352
query95	409	323	300	300
query96	491	576	284	284
query97	2945	3036	2859	2859
query98	252	220	207	207
query99	1698	1417	1291	1291
Total cold run time: 277105 ms
Total hot run time: 188224 ms

@doris-robot

ClickBench: Total hot run time: 27.31 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c3e38cfa020d4689d7f50565a922b01df106a226, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.05	0.05	0.04
query2	0.09	0.06	0.05
query3	0.25	0.08	0.08
query4	1.61	0.12	0.12
query5	0.27	0.25	0.25
query6	1.17	0.65	0.64
query7	0.03	0.03	0.03
query8	0.06	0.05	0.04
query9	0.58	0.53	0.52
query10	0.58	0.58	0.56
query11	0.16	0.11	0.11
query12	0.15	0.12	0.12
query13	0.63	0.60	0.59
query14	1.02	1.00	1.01
query15	0.84	0.84	0.82
query16	0.38	0.39	0.39
query17	1.01	1.00	1.00
query18	0.21	0.20	0.20
query19	1.87	1.84	1.82
query20	0.02	0.01	0.01
query21	15.46	0.20	0.13
query22	4.96	0.07	0.04
query23	15.69	0.28	0.10
query24	2.46	1.05	0.39
query25	0.07	0.06	0.06
query26	0.16	0.13	0.12
query27	0.06	0.06	0.05
query28	4.16	1.14	0.92
query29	12.54	3.93	3.16
query30	0.29	0.14	0.12
query31	2.81	0.58	0.38
query32	3.23	0.56	0.47
query33	3.06	3.02	3.08
query34	15.84	5.16	4.58
query35	4.60	4.54	4.55
query36	0.69	0.50	0.50
query37	0.10	0.06	0.06
query38	0.06	0.04	0.03
query39	0.03	0.03	0.03
query40	0.18	0.16	0.13
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.05	0.03	0.04
Total cold run time: 97.61 s
Total hot run time: 27.31 s

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (7/7) 🎉
Increment coverage report
Complete coverage report


@englefly merged commit e10e29d into apache:master on Nov 14, 2025
28 of 29 checks passed
@englefly deleted the mv-use-stats branch on November 14, 2025 02:00
github-actions bot pushed a commit that referenced this pull request Dec 5, 2025
…t queries from analysis tasks from polluting the column stats cache. (#57850)

### What problem does this PR solve?
Add planWithUnknownColumnStats to QueryState to prevent queries from
analysis tasks from polluting the column stats cache.
github-actions bot pushed a commit that referenced this pull request Dec 5, 2025
…t queries from analysis tasks from polluting the column stats cache. (#57850)

morrySnow pushed a commit that referenced this pull request Dec 5, 2025
…te to prevent queries from analysis tasks from polluting the column stats cache. #57850 (#58742)

Cherry-picked from #57850

Co-authored-by: minghong <zhouminghong@selectdb.com>
morrySnow pushed a commit that referenced this pull request Dec 8, 2025
### What problem does this PR solve?

Related PR: 
#36760
#57850

Problem Summary:

Fix column statistics being treated as unknown when computing plan
statistics for sync materialized views.

For SQL issued by statistics tasks, we should not collect or compute
statistics. Previously this was decided by the `isInternal` flag, but
`isInternal` is too broad: it covers not only statistics-related SQL
but also the SQL used to generate materialized view plans.
Materialized view plan generation requires statistics, so we introduce
a new flag, `isPlanWithUnKnownColumnStats`, to mark connections used
for statistics-only operations, for which column statistics are
treated as unknown.
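
To make the distinction concrete, here is a hedged sketch of the two rules as described in this commit message. The class `ConnectFlagsSketch` and its methods are hypothetical and do not mirror the real Doris classes; only the two flag names are taken from the text above.

```java
// Hypothetical stand-in for a connection's flags; not the real Doris classes.
final class ConnectFlagsSketch {
    final boolean isInternal;
    final boolean isPlanWithUnKnownColumnStats;

    ConnectFlagsSketch(boolean isInternal, boolean isPlanWithUnKnownColumnStats) {
        this.isInternal = isInternal;
        this.isPlanWithUnKnownColumnStats = isPlanWithUnKnownColumnStats;
    }

    // Old rule (as described): every internal SQL plans without column statistics.
    boolean usesColumnStatsBefore() {
        return !isInternal;
    }

    // New rule (as described): only statistics-only connections plan without them.
    boolean usesColumnStatsAfter() {
        return !isPlanWithUnKnownColumnStats;
    }
}

final class FlagDemo {
    public static void main(String[] args) {
        ConnectFlagsSketch userQuery = new ConnectFlagsSketch(false, false);
        ConnectFlagsSketch mvPlan = new ConnectFlagsSketch(true, false);
        ConnectFlagsSketch statsSql = new ConnectFlagsSketch(true, true);

        // Materialized view planning is internal, so the old rule hid its statistics.
        System.out.println(mvPlan.usesColumnStatsBefore()); // false
        System.out.println(mvPlan.usesColumnStatsAfter());  // true

        // Statistics SQL and user queries behave the same under both rules.
        System.out.println(statsSql.usesColumnStatsAfter()); // false
        System.out.println(userQuery.usesColumnStatsAfter()); // true
    }
}
```

Only the materialized view case changes between the two rules, which matches the fix this commit describes.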
github-actions bot pushed a commit that referenced this pull request Dec 8, 2025
### What problem does this PR solve?

Related PR: 
#36760
#57850

Problem Summary:

Fix stats unknown when calc sync mv plan statistics

yiguolei pushed a commit that referenced this pull request Dec 11, 2025
### What problem does this PR solve?

Related PR: 
#36760
#57850

Problem Summary:

Fix stats unknown when calc sync mv plan statistics

nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…t queries from analysis tasks from polluting the column stats cache. (apache#57850)

nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…#58426)

### What problem does this PR solve?

Related PR: 
apache#36760
apache#57850

Problem Summary:

Fix stats unknown when calc sync mv plan statistics

seawinde pushed a commit to seawinde/doris that referenced this pull request Dec 16, 2025
…t queries from analysis tasks from polluting the column stats cache. (apache#57850)

seawinde pushed a commit to seawinde/doris that referenced this pull request Dec 17, 2025
…t queries from analysis tasks from polluting the column stats cache. (apache#57850)

yiguolei pushed a commit that referenced this pull request Dec 18, 2025
…te to prevent queries from analysis tasks from polluting the column stats cache. #57850 (#58741)

Cherry-picked from #57850

Co-authored-by: minghong <zhouminghong@selectdb.com>
yiguolei pushed a commit that referenced this pull request Dec 18, 2025
### What problem does this PR solve?

Related PR: 
#36760
#57850

Problem Summary:

Fix stats unknown when calc sync mv plan statistics

seawinde added a commit to seawinde/doris that referenced this pull request Dec 22, 2025
…#58426)

### What problem does this PR solve?

Related PR: 
apache#36760
apache#57850

Problem Summary:

Fix stats unknown when calc sync mv plan statistics

Labels

approved (Indicates a PR has been approved by one committer.), dev/3.1.4-merged, dev/4.0.3-merged, reviewed

7 participants