@github-actions

Cherry-picked from #49569

### What problem does this PR solve?

Problem Summary:

```
2025-03-27 11:35:33,694 ERROR (stateListener|95) [EditLog.loadJournal():1231] Operation Type 142
org.apache.doris.common.DdlException: errCode = 2, detailMessage = Failed to find enough backend for ssd storage medium. When setting dynamic_partition.hot_partition_num>0, the hot partitions will store in ssd. Please check the replication num, replication tag and storage medium.
	at org.apache.doris.common.util.DynamicPartitionUtil.checkReplicaAllocation(DynamicPartitionUtil.java:254) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.common.util.DynamicPartitionUtil.checkDynamicPartition(DynamicPartitionUtil.java:190) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.catalog.ColocateTableIndex.replayModifyReplicaAlloc(ColocateTableIndex.java:914) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.catalog.ColocateTableIndex.replayModifyReplicaAlloc(ColocateTableIndex.java:655) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.catalog.Env.replayJournal(Env.java:2759) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.catalog.Env$JournalObserver.runOneCycle(Env.java:116) ~[doris-fe.jar:1.2-SNAPSHOT]
	at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [doris-fe.jar:1.2-SNAPSHOT]
```

We should not throw any exception when checking properties in the replay
logic, because the operation was already validated when it was first executed.
This PR skips the check during replay.
I am not sure how to reproduce this situation. I can only guess that it
may happen when a user modifies the colocation property of a table and
some properties of the backends change afterwards.
A user has tested this PR and confirmed that it solves the problem.
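
The idea of skipping validation during replay can be sketched as follows. This is only a minimal illustration under stated assumptions, not the actual Doris change: the class `ReplayCheckSketch`, its methods, the `ValidationException` type, and the `isReplay` flag are all hypothetical names, and the metadata is reduced to a map from table name to replica count.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Doris code base.
class ReplayCheckSketch {

    /** Stand-in for a validation failure (unchecked here to keep the sketch short). */
    static class ValidationException extends RuntimeException {
        ValidationException(String message) {
            super(message);
        }
    }

    /** Validates against the current cluster state, which may have changed since the edit was logged. */
    static void checkEnoughBackends(int requiredReplicas, int availableSsdBackends) {
        if (availableSsdBackends < requiredReplicas) {
            throw new ValidationException("not enough backends for ssd storage medium: need "
                    + requiredReplicas + ", have " + availableSsdBackends);
        }
    }

    /**
     * Applies a replica-count change to the in-memory metadata.
     * When isReplay is true, the change was already validated when it was
     * first executed, so it is applied without re-checking.
     */
    static void modifyReplicaCount(Map<String, Integer> tableReplicas, String table,
            int newReplicas, int availableSsdBackends, boolean isReplay) {
        if (!isReplay) {
            checkEnoughBackends(newReplicas, availableSsdBackends);
        }
        tableReplicas.put(table, newReplicas);
    }

    public static void main(String[] args) {
        Map<String, Integer> meta = new HashMap<>();
        // Original execution: three SSD backends are available, so the check passes.
        modifyReplicaCount(meta, "t1", 3, 3, false);
        // Later replay: only one SSD backend remains, but the logged edit is still applied.
        modifyReplicaCount(meta, "t1", 3, 1, true);
        System.out.println("replicas after replay: " + meta.get("t1"));
    }
}
```

With the flag set, replay always applies the logged change, while a fresh request under the same cluster state is still rejected by the check.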
@github-actions github-actions bot requested a review from dataroaring as a code owner March 28, 2025 01:58
@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Mar 28, 2025
@hello-stephen

run buildall

@doris-robot

TPC-H: Total hot run time: 40219 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e6873588156e3c1628779012733d5e1b41c140e6, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17564	6747	7107	6747
q2	2065	173	180	173
q3	10548	1066	1187	1066
q4	10513	706	739	706
q5	7747	2862	2758	2758
q6	223	134	132	132
q7	990	612	613	612
q8	9350	1905	2052	1905
q9	6575	6395	6434	6395
q10	7053	2280	2266	2266
q11	472	269	257	257
q12	400	216	214	214
q13	17783	2967	3008	2967
q14	232	200	203	200
q15	502	467	469	467
q16	686	586	596	586
q17	959	624	574	574
q18	7214	6613	6757	6613
q19	1402	1119	1118	1118
q20	447	199	198	198
q21	4196	3256	3327	3256
q22	1127	1010	1009	1009
Total cold run time: 108048 ms
Total hot run time: 40219 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	6606	6549	6578	6549
q2	341	237	234	234
q3	2898	2719	2963	2719
q4	2024	1822	1808	1808
q5	5716	5721	5721	5721
q6	218	135	133	133
q7	2209	1837	1831	1831
q8	3397	3539	3524	3524
q9	8789	8890	8885	8885
q10	3544	3548	3544	3544
q11	586	477	507	477
q12	800	613	601	601
q13	8716	3157	3160	3157
q14	303	267	268	267
q15	513	473	463	463
q16	705	661	633	633
q17	1830	1630	1605	1605
q18	8116	7772	7746	7746
q19	1654	1539	1587	1539
q20	2113	1879	1885	1879
q21	5637	5455	5511	5455
q22	1150	1032	1041	1032
Total cold run time: 67865 ms
Total hot run time: 59802 ms

@doris-robot

TPC-DS: Total hot run time: 197029 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e6873588156e3c1628779012733d5e1b41c140e6, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1295	931	945	931
query2	6228	2048	2023	2023
query3	10929	4498	4591	4498
query4	65765	27795	23142	23142
query5	5093	483	485	483
query6	434	202	179	179
query7	5521	314	310	310
query8	314	228	222	222
query9	8547	2589	2574	2574
query10	452	290	276	276
query11	17505	15229	15674	15229
query12	153	99	100	99
query13	1428	441	433	433
query14	10789	7927	7059	7059
query15	195	186	185	185
query16	7176	510	518	510
query17	1079	593	579	579
query18	1917	306	307	306
query19	206	163	160	160
query20	121	116	110	110
query21	200	101	100	100
query22	4867	4680	4394	4394
query23	34548	34070	34122	34070
query24	6156	3063	3053	3053
query25	540	448	428	428
query26	723	174	179	174
query27	1992	361	368	361
query28	4318	2443	2435	2435
query29	736	481	467	467
query30	247	180	168	168
query31	1043	830	841	830
query32	69	61	62	61
query33	487	326	331	326
query34	926	522	519	519
query35	859	745	743	743
query36	1090	1000	961	961
query37	124	70	69	69
query38	4192	4064	4005	4005
query39	1699	1480	1482	1480
query40	208	103	106	103
query41	54	51	51	51
query42	124	106	106	106
query43	545	508	494	494
query44	1195	835	831	831
query45	192	176	166	166
query46	1148	766	762	762
query47	2017	1923	1925	1923
query48	482	415	395	395
query49	768	412	420	412
query50	880	445	431	431
query51	7242	7081	6986	6986
query52	97	90	92	90
query53	274	183	183	183
query54	555	466	454	454
query55	77	79	80	79
query56	265	245	243	243
query57	1208	1126	1081	1081
query58	224	208	206	206
query59	3085	2919	2774	2774
query60	286	257	253	253
query61	151	131	110	110
query62	778	684	666	666
query63	223	195	207	195
query64	1403	703	677	677
query65	3301	3219	3207	3207
query66	697	310	299	299
query67	15771	15658	15465	15465
query68	4141	590	580	580
query69	419	268	273	268
query70	1139	1096	1101	1096
query71	345	260	276	260
query72	6351	3971	4031	3971
query73	777	342	372	342
query74	10226	9338	8984	8984
query75	3341	2644	2677	2644
query76	1927	1087	1120	1087
query77	511	291	288	288
query78	10532	9629	9482	9482
query79	1247	606	616	606
query80	803	439	452	439
query81	524	248	238	238
query82	431	94	91	91
query83	168	145	146	145
query84	284	83	76	76
query85	831	323	307	307
query86	336	284	294	284
query87	4459	4191	4289	4191
query88	3820	2388	2355	2355
query89	433	303	301	301
query90	2039	187	185	185
query91	176	149	149	149
query92	64	53	51	51
query93	1558	557	581	557
query94	778	295	298	295
query95	352	260	261	260
query96	607	282	277	277
query97	3275	3192	3125	3125
query98	213	209	204	204
query99	1595	1314	1297	1297
Total cold run time: 316915 ms
Total hot run time: 197029 ms

@doris-robot

ClickBench: Total hot run time: 31.86 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e6873588156e3c1628779012733d5e1b41c140e6, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.04	0.03
query2	0.07	0.02	0.03
query3	0.24	0.06	0.06
query4	1.62	0.10	0.10
query5	0.52	0.50	0.52
query6	1.14	0.74	0.73
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.58	0.52	0.49
query10	0.54	0.56	0.54
query11	0.15	0.11	0.11
query12	0.14	0.11	0.11
query13	0.64	0.61	0.59
query14	2.72	2.76	2.76
query15	0.90	0.83	0.82
query16	0.38	0.38	0.41
query17	1.07	0.96	1.06
query18	0.25	0.22	0.22
query19	1.97	1.84	2.07
query20	0.01	0.01	0.01
query21	15.37	0.59	0.58
query22	2.42	1.92	2.28
query23	16.92	1.07	0.93
query24	3.42	1.91	0.38
query25	0.21	0.16	0.04
query26	0.43	0.14	0.15
query27	0.05	0.04	0.06
query28	9.81	0.53	0.52
query29	12.61	3.28	3.24
query30	0.26	0.07	0.06
query31	2.86	0.40	0.38
query32	3.23	0.46	0.46
query33	2.98	3.04	3.05
query34	16.85	4.51	4.55
query35	4.61	4.56	4.53
query36	0.71	0.50	0.47
query37	0.09	0.07	0.06
query38	0.05	0.04	0.03
query39	0.04	0.03	0.02
query40	0.17	0.12	0.13
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.27 s
Total hot run time: 31.86 s

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit 42d6afa into branch-3.0 Apr 22, 2025
23 of 24 checks passed
@github-actions github-actions bot deleted the auto-pick-49569-branch-3.0 branch April 22, 2025 03:03