@sollhui
Contributor

@sollhui sollhui commented May 4, 2025

What problem does this PR solve?

Introduce a backend blacklist for the fetch-meta step of load jobs to avoid jitter:

  1. The fetch-meta operation selects one node at random. If a node stays abnormal, fetch-meta operations time out and cause load-speed jitter.

  2. A backend is added to the blacklist when:

  • The fetch-meta RPC to it fails.
  • The retry on another backend succeeds.
  3. A backend is removed from the blacklist when:
  • Its entry expires automatically after two minutes.

Another improvement to the fetch-meta retry: a BE that has already failed in the same request is not chosen again.
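
The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the code from this PR: `BackendBlacklist`, `fetch_meta`, and the `rpc` callback are hypothetical names. The sketch assumes a backend is blacklisted only when its RPC failed and a retry on another backend then succeeded. Falling back to blacklisted backends when no other candidate remains is also an assumption.

```python
import random
import time


class BackendBlacklist:
    """Hypothetical helper: remembers failed backends for a fixed time."""

    def __init__(self, ttl_seconds=120.0, clock=time.monotonic):
        self._ttl = ttl_seconds      # two-minute expiration from the description
        self._clock = clock          # injectable so tests can control time
        self._expires_at = {}        # backend id -> expiry timestamp

    def add(self, backend_id):
        self._expires_at[backend_id] = self._clock() + self._ttl

    def contains(self, backend_id):
        expiry = self._expires_at.get(backend_id)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            # Entry expired: drop it so the backend can be chosen again.
            del self._expires_at[backend_id]
            return False
        return True


def fetch_meta(backends, blacklist, rpc, rng=random):
    """Pick a backend at random and retry on others until one succeeds."""
    failed = []  # backends that already failed within this request
    while True:
        candidates = [b for b in backends
                      if b not in failed and not blacklist.contains(b)]
        if not candidates:
            # Assumption: if every untried backend is blacklisted, try them anyway.
            candidates = [b for b in backends if b not in failed]
        if not candidates:
            raise RuntimeError("fetch meta failed on all backends")
        backend = rng.choice(candidates)
        try:
            result = rpc(backend)
        except Exception:
            # Broad catch keeps the sketch short; never retry this backend
            # within the same request.
            failed.append(backend)
            continue
        # A retry succeeded, so the earlier failures point at those backends.
        for bad in failed:
            blacklist.add(bad)
        return result
```

Blacklisting only after a retry succeeds avoids penalizing every backend when the request itself is at fault, which is one plausible reading of the two conditions listed above.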

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@sollhui
Contributor Author

sollhui commented May 4, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 33555 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3513a7cf474d7b9d52b99e45ad1e5865006c2ba8, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	26123	4988	4959	4959
q2	2055	272	175	175
q3	10407	1280	701	701
q4	10225	1003	517	517
q5	7507	2350	2276	2276
q6	182	160	135	135
q7	909	741	624	624
q8	9317	1262	1131	1131
q9	6906	5049	5036	5036
q10	6843	2287	1883	1883
q11	483	281	265	265
q12	345	379	210	210
q13	17795	3796	3043	3043
q14	223	223	211	211
q15	522	472	497	472
q16	424	433	376	376
q17	599	853	356	356
q18	7400	7244	7049	7049
q19	1559	977	537	537
q20	333	336	213	213
q21	3940	3338	2436	2436
q22	1023	966	950	950
Total cold run time: 115120 ms
Total hot run time: 33555 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5092	5039	5016	5016
q2	235	324	236	236
q3	2132	2636	2304	2304
q4	1351	1805	1382	1382
q5	4368	4402	4379	4379
q6	217	167	133	133
q7	2028	1945	1766	1766
q8	2583	2559	2551	2551
q9	7275	7267	6945	6945
q10	3014	3183	2724	2724
q11	579	486	476	476
q12	679	752	597	597
q13	3488	3898	3259	3259
q14	279	313	271	271
q15	519	485	479	479
q16	463	470	433	433
q17	1134	1529	1373	1373
q18	7831	7593	7407	7407
q19	827	828	877	828
q20	1978	2015	1819	1819
q21	5042	4759	4780	4759
q22	1096	1024	998	998
Total cold run time: 52210 ms
Total hot run time: 50135 ms

@doris-robot

TPC-DS: Total hot run time: 192319 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3513a7cf474d7b9d52b99e45ad1e5865006c2ba8, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1403	1102	1066	1066
query2	6298	1804	1816	1804
query3	11002	4519	4571	4519
query4	54574	26187	23345	23345
query5	5058	532	440	440
query6	323	210	208	208
query7	4870	518	294	294
query8	314	257	241	241
query9	5253	2565	2562	2562
query10	417	341	260	260
query11	14922	15147	14848	14848
query12	162	114	112	112
query13	1021	526	402	402
query14	10147	6374	6311	6311
query15	214	198	175	175
query16	7159	691	498	498
query17	1088	752	594	594
query18	1567	420	329	329
query19	202	208	176	176
query20	132	125	121	121
query21	210	132	107	107
query22	4353	4574	4297	4297
query23	34264	33494	33338	33338
query24	6580	2474	2418	2418
query25	474	482	399	399
query26	719	270	158	158
query27	2504	501	339	339
query28	2927	2121	2122	2121
query29	581	568	446	446
query30	271	230	192	192
query31	897	863	778	778
query32	74	65	64	64
query33	477	382	324	324
query34	801	886	532	532
query35	799	846	766	766
query36	941	1000	879	879
query37	112	100	85	85
query38	4213	4276	4143	4143
query39	1511	1451	1409	1409
query40	215	123	108	108
query41	53	54	53	53
query42	134	113	113	113
query43	499	511	492	492
query44	1306	801	803	801
query45	181	178	170	170
query46	853	1034	668	668
query47	1858	1879	1778	1778
query48	399	422	314	314
query49	671	519	432	432
query50	660	757	408	408
query51	4191	4265	4162	4162
query52	111	103	96	96
query53	229	261	199	199
query54	602	629	551	551
query55	83	79	81	79
query56	319	296	307	296
query57	1176	1205	1112	1112
query58	268	288	272	272
query59	2653	2763	2661	2661
query60	329	314	307	307
query61	132	131	130	130
query62	761	746	703	703
query63	226	192	188	188
query64	2066	1060	716	716
query65	4341	4271	4204	4204
query66	737	411	304	304
query67	15721	15534	15699	15534
query68	7442	877	501	501
query69	530	299	258	258
query70	1237	1096	1106	1096
query71	480	317	297	297
query72	5919	4799	4876	4799
query73	1339	628	351	351
query74	9294	9136	8595	8595
query75	3910	3201	2679	2679
query76	4129	1183	753	753
query77	657	448	284	284
query78	10132	10188	9254	9254
query79	2598	806	571	571
query80	649	511	446	446
query81	487	259	215	215
query82	628	129	99	99
query83	258	246	244	244
query84	303	110	86	86
query85	797	354	364	354
query86	371	290	276	276
query87	4341	4360	4326	4326
query88	3548	2205	2185	2185
query89	407	312	274	274
query90	1804	202	208	202
query91	157	144	115	115
query92	70	61	58	58
query93	2343	930	579	579
query94	644	402	304	304
query95	373	294	286	286
query96	488	555	275	275
query97	3173	3254	3116	3116
query98	235	206	199	199
query99	1371	1400	1291	1291
Total cold run time: 299722 ms
Total hot run time: 192319 ms

@doris-robot

ClickBench: Total hot run time: 29.15 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 3513a7cf474d7b9d52b99e45ad1e5865006c2ba8, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.03	0.03
query2	0.12	0.11	0.10
query3	0.26	0.20	0.19
query4	1.59	0.20	0.10
query5	0.56	0.55	0.59
query6	1.17	0.72	0.72
query7	0.02	0.02	0.02
query8	0.04	0.03	0.04
query9	0.59	0.52	0.52
query10	0.56	0.57	0.56
query11	0.15	0.11	0.11
query12	0.15	0.11	0.12
query13	0.62	0.59	0.60
query14	0.78	0.81	0.79
query15	0.87	0.85	0.84
query16	0.37	0.39	0.38
query17	0.99	1.03	1.06
query18	0.21	0.19	0.20
query19	1.94	1.81	1.76
query20	0.01	0.02	0.01
query21	15.39	0.91	0.56
query22	0.74	1.24	0.60
query23	14.96	1.39	0.62
query24	6.97	1.18	1.00
query25	0.50	0.38	0.10
query26	0.51	0.15	0.14
query27	0.05	0.06	0.05
query28	10.18	0.91	0.45
query29	12.54	3.96	3.26
query30	0.25	0.10	0.08
query31	2.82	0.60	0.38
query32	3.23	0.55	0.46
query33	3.03	3.12	3.01
query34	15.73	5.10	4.49
query35	4.48	4.50	4.49
query36	0.68	0.49	0.50
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.03
query40	0.16	0.14	0.14
query41	0.08	0.03	0.02
query42	0.04	0.02	0.04
query43	0.04	0.03	0.03
Total cold run time: 103.58 s
Total hot run time: 29.15 s

@sollhui
Contributor Author

sollhui commented May 6, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 33544 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d0ca34d5f121e0e1cba4ecaa5dcf2d7d14949b15, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	25797	5173	5037	5037
q2	2075	283	181	181
q3	10391	1238	659	659
q4	10228	1008	530	530
q5	7537	2235	2363	2235
q6	180	167	131	131
q7	898	804	638	638
q8	9327	1290	1073	1073
q9	6890	5075	5016	5016
q10	6850	2270	1876	1876
q11	470	277	263	263
q12	345	351	222	222
q13	17777	3692	3060	3060
q14	245	230	210	210
q15	529	489	483	483
q16	417	420	365	365
q17	593	844	359	359
q18	7450	7115	7012	7012
q19	2049	985	567	567
q20	353	343	226	226
q21	4163	3246	2439	2439
q22	1022	989	962	962
Total cold run time: 115586 ms
Total hot run time: 33544 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5227	5046	5088	5046
q2	228	323	228	228
q3	2129	2575	2257	2257
q4	1315	1771	1357	1357
q5	4405	4411	4456	4411
q6	212	171	127	127
q7	2005	1888	1787	1787
q8	2580	2527	2509	2509
q9	7206	7069	7003	7003
q10	3056	3214	2730	2730
q11	573	505	511	505
q12	666	777	630	630
q13	3557	3896	3235	3235
q14	285	323	272	272
q15	515	476	473	473
q16	430	484	446	446
q17	1150	1559	1342	1342
q18	7677	7577	7355	7355
q19	825	787	837	787
q20	2001	2048	1866	1866
q21	4995	4733	4699	4699
q22	1078	1030	1006	1006
Total cold run time: 52115 ms
Total hot run time: 50071 ms

@doris-robot

TPC-DS: Total hot run time: 191777 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d0ca34d5f121e0e1cba4ecaa5dcf2d7d14949b15, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1410	1108	1034	1034
query2	6344	1803	1826	1803
query3	11013	4597	4400	4400
query4	54775	24885	23050	23050
query5	5061	593	472	472
query6	340	207	200	200
query7	4879	517	287	287
query8	306	261	239	239
query9	5219	2578	2578	2578
query10	445	330	255	255
query11	14908	15047	14822	14822
query12	166	108	99	99
query13	1022	525	411	411
query14	11004	6333	6473	6333
query15	211	198	166	166
query16	7182	652	529	529
query17	1097	741	611	611
query18	1629	411	343	343
query19	193	185	166	166
query20	121	133	122	122
query21	204	126	108	108
query22	4298	4452	4237	4237
query23	34097	33265	33521	33265
query24	6025	2453	2480	2453
query25	486	477	404	404
query26	708	291	152	152
query27	1870	488	342	342
query28	2624	2124	2084	2084
query29	580	559	460	460
query30	269	217	186	186
query31	831	868	777	777
query32	71	61	64	61
query33	446	357	312	312
query34	790	857	528	528
query35	789	859	762	762
query36	934	980	902	902
query37	122	107	81	81
query38	4334	4265	4243	4243
query39	1478	1427	1484	1427
query40	207	119	109	109
query41	55	55	54	54
query42	123	109	106	106
query43	500	509	480	480
query44	1374	818	833	818
query45	180	173	171	171
query46	869	1034	651	651
query47	1838	1871	1765	1765
query48	387	416	322	322
query49	675	539	410	410
query50	684	707	411	411
query51	4236	4212	4137	4137
query52	108	113	101	101
query53	243	263	181	181
query54	606	597	514	514
query55	88	89	85	85
query56	317	300	321	300
query57	1155	1191	1102	1102
query58	264	264	266	264
query59	2729	2825	2802	2802
query60	337	322	314	314
query61	130	131	127	127
query62	731	776	688	688
query63	232	196	190	190
query64	1987	1054	727	727
query65	4373	4247	4212	4212
query66	736	397	304	304
query67	15769	15372	15385	15372
query68	7181	879	509	509
query69	536	300	264	264
query70	1140	1097	1056	1056
query71	504	373	286	286
query72	5781	4733	4590	4590
query73	1562	624	345	345
query74	9307	9063	8932	8932
query75	3958	3223	2721	2721
query76	4191	1218	779	779
query77	667	375	284	284
query78	10216	9992	9133	9133
query79	2751	810	568	568
query80	663	521	451	451
query81	482	258	216	216
query82	525	123	96	96
query83	252	247	231	231
query84	302	103	83	83
query85	770	364	312	312
query86	370	314	281	281
query87	4460	4361	4308	4308
query88	3501	2211	2217	2211
query89	404	326	348	326
query90	1788	204	212	204
query91	147	143	113	113
query92	75	64	61	61
query93	2136	956	569	569
query94	669	417	304	304
query95	369	288	285	285
query96	485	559	278	278
query97	3130	3192	3088	3088
query98	226	206	195	195
query99	1417	1397	1277	1277
Total cold run time: 299179 ms
Total hot run time: 191777 ms

@doris-robot

ClickBench: Total hot run time: 29.05 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d0ca34d5f121e0e1cba4ecaa5dcf2d7d14949b15, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.03
query2	0.12	0.11	0.11
query3	0.25	0.20	0.19
query4	1.59	0.19	0.20
query5	0.60	0.58	0.59
query6	1.21	0.70	0.71
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.59	0.52	0.52
query10	0.57	0.57	0.56
query11	0.16	0.11	0.11
query12	0.14	0.11	0.11
query13	0.61	0.59	0.59
query14	0.77	0.80	0.81
query15	0.87	0.86	0.84
query16	0.39	0.37	0.37
query17	1.07	1.04	1.01
query18	0.20	0.20	0.19
query19	1.88	1.76	1.79
query20	0.01	0.01	0.02
query21	15.39	0.88	0.55
query22	0.75	1.16	0.84
query23	14.72	1.39	0.62
query24	7.66	1.18	0.68
query25	0.51	0.12	0.21
query26	0.54	0.16	0.15
query27	0.06	0.05	0.04
query28	9.61	0.88	0.44
query29	12.56	4.00	3.27
query30	0.26	0.08	0.06
query31	2.83	0.60	0.37
query32	3.21	0.54	0.45
query33	3.08	2.99	3.00
query34	15.77	5.10	4.45
query35	4.51	4.50	4.49
query36	0.67	0.49	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.02
query40	0.17	0.14	0.13
query41	0.08	0.03	0.03
query42	0.03	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.75 s
Total hot run time: 29.05 s

@sollhui
Contributor Author

sollhui commented May 8, 2025

run buildall

@sollhui
Contributor Author

sollhui commented May 8, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 33676 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b4d9dce7e3747193b2c69c4c3f1ac9851772e8bd, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	25896	4990	4986	4986
q2	2058	286	180	180
q3	10411	1254	695	695
q4	10226	994	511	511
q5	7513	2329	2288	2288
q6	181	159	132	132
q7	902	740	637	637
q8	9324	1230	1085	1085
q9	6738	5056	5062	5056
q10	6833	2307	1881	1881
q11	475	277	277	277
q12	350	359	211	211
q13	17785	3677	3062	3062
q14	221	231	221	221
q15	544	485	481	481
q16	433	425	372	372
q17	614	860	362	362
q18	7393	7071	7185	7071
q19	1894	933	574	574
q20	339	330	219	219
q21	3776	3461	2415	2415
q22	1021	1023	960	960
Total cold run time: 114927 ms
Total hot run time: 33676 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5155	5053	5064	5053
q2	231	316	227	227
q3	2141	2668	2317	2317
q4	1343	1774	1317	1317
q5	4499	4396	4426	4396
q6	215	170	131	131
q7	1967	1934	1753	1753
q8	2591	2454	2464	2454
q9	7212	7237	6853	6853
q10	3057	3188	2740	2740
q11	560	515	493	493
q12	701	755	577	577
q13	3482	3919	3250	3250
q14	301	291	272	272
q15	526	502	476	476
q16	455	515	442	442
q17	1133	1552	1383	1383
q18	7722	7592	7363	7363
q19	798	839	990	839
q20	1991	2039	1854	1854
q21	5050	4814	4712	4712
q22	1061	1047	1023	1023
Total cold run time: 52191 ms
Total hot run time: 49925 ms

@doris-robot

TPC-DS: Total hot run time: 192259 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b4d9dce7e3747193b2c69c4c3f1ac9851772e8bd, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1403	1096	1058	1058
query2	6224	1781	1762	1762
query3	10995	4737	4446	4446
query4	53969	24577	23056	23056
query5	5189	582	457	457
query6	371	208	200	200
query7	4968	501	297	297
query8	331	256	249	249
query9	5997	2583	2574	2574
query10	454	321	287	287
query11	15142	15074	14951	14951
query12	159	111	105	105
query13	1100	511	421	421
query14	10209	6360	6216	6216
query15	209	216	197	197
query16	7057	671	446	446
query17	1086	746	598	598
query18	1560	417	330	330
query19	213	196	174	174
query20	134	125	116	116
query21	205	172	107	107
query22	4276	4467	4297	4297
query23	34355	33467	33550	33467
query24	6659	2460	2463	2460
query25	476	474	407	407
query26	709	274	154	154
query27	2237	498	342	342
query28	2918	2124	2120	2120
query29	581	565	425	425
query30	271	229	184	184
query31	816	879	791	791
query32	72	70	64	64
query33	456	357	305	305
query34	787	863	527	527
query35	853	852	730	730
query36	938	1009	921	921
query37	111	106	77	77
query38	4227	4266	4342	4266
query39	1497	1420	1418	1418
query40	210	125	110	110
query41	56	52	57	52
query42	135	110	111	110
query43	476	522	500	500
query44	1345	817	821	817
query45	179	176	173	173
query46	843	1030	671	671
query47	1904	1865	1827	1827
query48	398	410	300	300
query49	717	519	414	414
query50	657	692	414	414
query51	4274	4187	4166	4166
query52	107	111	106	106
query53	228	254	191	191
query54	605	584	535	535
query55	83	96	83	83
query56	307	323	292	292
query57	1154	1184	1149	1149
query58	260	257	259	257
query59	2755	2699	2663	2663
query60	336	330	308	308
query61	136	159	133	133
query62	730	739	687	687
query63	234	192	191	191
query64	1875	1030	733	733
query65	4400	4247	4226	4226
query66	722	397	301	301
query67	15689	15821	15270	15270
query68	6986	895	514	514
query69	535	303	271	271
query70	1170	1081	1050	1050
query71	507	320	301	301
query72	5800	4959	4893	4893
query73	1376	660	351	351
query74	8867	9097	8700	8700
query75	4015	3195	2734	2734
query76	4298	1180	741	741
query77	686	371	292	292
query78	10028	10078	9234	9234
query79	2194	860	562	562
query80	585	502	437	437
query81	490	258	219	219
query82	466	123	97	97
query83	257	251	225	225
query84	296	108	80	80
query85	795	374	320	320
query86	370	304	280	280
query87	4377	4411	4276	4276
query88	3752	2233	2270	2233
query89	406	320	271	271
query90	1853	220	213	213
query91	147	145	124	124
query92	85	62	58	58
query93	1734	950	581	581
query94	676	418	305	305
query95	384	302	290	290
query96	489	554	270	270
query97	3170	3216	3190	3190
query98	227	204	197	197
query99	1705	1390	1269	1269
Total cold run time: 298755 ms
Total hot run time: 192259 ms

@doris-robot

ClickBench: Total hot run time: 29.22 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit b4d9dce7e3747193b2c69c4c3f1ac9851772e8bd, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.13	0.10	0.12
query3	0.25	0.19	0.18
query4	1.59	0.20	0.11
query5	0.57	0.56	0.55
query6	1.17	0.71	0.71
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.56	0.52	0.51
query10	0.57	0.58	0.55
query11	0.15	0.10	0.11
query12	0.14	0.12	0.12
query13	0.62	0.59	0.59
query14	0.78	0.81	0.80
query15	0.87	0.84	0.89
query16	0.38	0.37	0.40
query17	1.01	1.02	1.00
query18	0.21	0.20	0.19
query19	1.92	1.82	1.82
query20	0.02	0.01	0.01
query21	15.43	0.89	0.55
query22	0.77	1.17	0.76
query23	14.83	1.42	0.60
query24	6.95	1.69	0.98
query25	0.50	0.14	0.08
query26	0.71	0.18	0.14
query27	0.05	0.05	0.05
query28	10.12	0.89	0.44
query29	12.52	3.89	3.26
query30	0.27	0.09	0.06
query31	2.82	0.59	0.38
query32	3.23	0.55	0.46
query33	3.06	3.05	3.02
query34	15.86	5.03	4.52
query35	4.49	4.49	4.51
query36	0.69	0.49	0.48
query37	0.08	0.06	0.06
query38	0.06	0.04	0.04
query39	0.03	0.03	0.03
query40	0.17	0.15	0.13
query41	0.08	0.02	0.02
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 103.84 s
Total hot run time: 29.22 s

liaoxin01 previously approved these changes May 8, 2025
Contributor

@liaoxin01 liaoxin01 left a comment

LGTM

@github-actions
Contributor

github-actions bot commented May 8, 2025

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels May 8, 2025
@github-actions
Contributor

github-actions bot commented May 8, 2025

PR approved by anyone and no changes requested.

@sollhui
Contributor Author

sollhui commented May 12, 2025

run buildall

@github-actions github-actions bot removed the approved label May 12, 2025
@sollhui
Contributor Author

sollhui commented May 14, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 33884 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e6183d20c13e1abd4fa49e9a32a7ef49cbaf92d4, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	26535	5071	5089	5071
q2	2079	310	193	193
q3	10355	1306	706	706
q4	10235	997	536	536
q5	7514	2323	2368	2323
q6	184	161	134	134
q7	917	748	632	632
q8	9330	1337	1114	1114
q9	6880	4975	5014	4975
q10	6868	2348	1916	1916
q11	479	284	281	281
q12	355	359	222	222
q13	17761	3669	3053	3053
q14	243	230	215	215
q15	544	495	484	484
q16	431	436	388	388
q17	598	867	364	364
q18	7558	7208	7111	7111
q19	1452	972	575	575
q20	336	332	219	219
q21	4012	3234	2398	2398
q22	1033	990	974	974
Total cold run time: 115699 ms
Total hot run time: 33884 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5163	5271	5091	5091
q2	244	328	227	227
q3	2185	2659	2353	2353
q4	1403	1808	1350	1350
q5	4519	4441	4450	4441
q6	221	172	128	128
q7	2023	1994	1780	1780
q8	2625	2554	2572	2554
q9	7197	7144	7034	7034
q10	3049	3202	2779	2779
q11	580	510	515	510
q12	720	780	630	630
q13	3471	3913	3366	3366
q14	287	304	274	274
q15	541	482	460	460
q16	460	485	452	452
q17	1195	1552	1399	1399
q18	7746	7587	7478	7478
q19	870	814	853	814
q20	1912	2008	1829	1829
q21	5007	4481	4484	4481
q22	1063	1066	1005	1005
Total cold run time: 52481 ms
Total hot run time: 50435 ms

@doris-robot

TPC-DS: Total hot run time: 194463 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e6183d20c13e1abd4fa49e9a32a7ef49cbaf92d4, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1395	1105	1055	1055
query2	6326	1870	1837	1837
query3	10953	4465	4570	4465
query4	57451	25764	23365	23365
query5	5189	479	479	479
query6	394	202	186	186
query7	5273	500	294	294
query8	329	254	245	245
query9	7152	2655	2675	2655
query10	421	334	264	264
query11	15139	15011	14844	14844
query12	171	112	102	102
query13	1274	539	420	420
query14	10034	6272	6337	6272
query15	211	220	172	172
query16	7047	627	488	488
query17	1076	751	574	574
query18	1527	402	307	307
query19	198	194	171	171
query20	122	122	121	121
query21	205	124	113	113
query22	4600	4609	4640	4609
query23	34129	33667	33562	33562
query24	6633	2433	2461	2433
query25	474	476	433	433
query26	675	273	152	152
query27	2232	508	345	345
query28	2917	2208	2184	2184
query29	618	596	460	460
query30	279	236	196	196
query31	861	869	789	789
query32	79	65	70	65
query33	471	387	323	323
query34	761	854	545	545
query35	803	844	740	740
query36	958	997	897	897
query37	110	106	83	83
query38	4286	4322	4351	4322
query39	1504	1461	1465	1461
query40	224	123	114	114
query41	62	64	58	58
query42	131	125	114	114
query43	520	514	502	502
query44	1360	846	863	846
query45	189	177	173	173
query46	907	1033	660	660
query47	1867	1885	1855	1855
query48	400	427	323	323
query49	693	507	426	426
query50	654	690	414	414
query51	4263	4276	4208	4208
query52	112	106	111	106
query53	228	256	190	190
query54	597	599	541	541
query55	97	81	82	81
query56	318	300	293	293
query57	1222	1203	1142	1142
query58	274	270	257	257
query59	2794	2881	2819	2819
query60	329	356	351	351
query61	137	125	120	120
query62	745	742	681	681
query63	240	191	193	191
query64	1412	979	703	703
query65	4342	4248	4257	4248
query66	733	393	358	358
query67	15961	15751	15561	15561
query68	7378	897	533	533
query69	556	301	271	271
query70	1185	1131	1111	1111
query71	495	333	299	299
query72	5731	4916	5089	4916
query73	1210	699	362	362
query74	8955	9097	8868	8868
query75	3702	3252	2741	2741
query76	4255	1207	753	753
query77	634	366	285	285
query78	10017	10083	9317	9317
query79	2297	818	582	582
query80	662	501	435	435
query81	475	262	219	219
query82	449	132	98	98
query83	405	252	234	234
query84	291	107	83	83
query85	858	352	302	302
query86	403	312	271	271
query87	4461	4494	4395	4395
query88	3572	2323	2299	2299
query89	398	308	282	282
query90	1915	204	218	204
query91	144	141	114	114
query92	72	58	60	58
query93	1803	944	588	588
query94	653	420	297	297
query95	363	292	283	283
query96	499	569	281	281
query97	3201	3216	3143	3143
query98	238	216	201	201
query99	1433	1445	1275	1275
Total cold run time: 303798 ms
Total hot run time: 194463 ms

@doris-robot

ClickBench: Total hot run time: 29.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e6183d20c13e1abd4fa49e9a32a7ef49cbaf92d4, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.03
query2	0.12	0.11	0.11
query3	0.27	0.19	0.19
query4	1.59	0.19	0.19
query5	0.59	0.59	0.58
query6	1.19	0.73	0.72
query7	0.02	0.01	0.01
query8	0.04	0.03	0.04
query9	0.57	0.52	0.51
query10	0.56	0.58	0.57
query11	0.16	0.12	0.11
query12	0.15	0.12	0.12
query13	0.63	0.60	0.60
query14	0.78	0.83	0.80
query15	0.88	0.86	0.85
query16	0.38	0.38	0.38
query17	1.04	1.04	1.08
query18	0.23	0.21	0.20
query19	1.92	1.82	1.76
query20	0.01	0.01	0.01
query21	15.41	0.87	0.53
query22	0.75	1.16	0.70
query23	14.95	1.39	0.58
query24	6.68	1.36	0.96
query25	0.49	0.25	0.07
query26	0.49	0.17	0.15
query27	0.05	0.05	0.05
query28	9.42	0.89	0.45
query29	12.95	4.07	3.33
query30	0.25	0.09	0.07
query31	2.81	0.59	0.39
query32	3.23	0.55	0.47
query33	3.07	3.09	3.05
query34	15.82	5.05	4.49
query35	4.53	4.52	4.50
query36	0.65	0.51	0.48
query37	0.09	0.06	0.06
query38	0.06	0.03	0.03
query39	0.03	0.03	0.02
query40	0.16	0.14	0.13
query41	0.09	0.03	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.23 s
Total hot run time: 29.39 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/3) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 55.81% (14901/26698)
Line Coverage 44.61% (131928/295753)
Region Coverage 43.66% (66323/151924)
Branch Coverage 38.26% (33979/88800)

@github-actions github-actions bot removed the approved label May 15, 2025
@sollhui
Contributor Author

sollhui commented May 15, 2025

run buildall

@sollhui
Contributor Author

sollhui commented May 15, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34042 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f725a4b43f6f95b20bc2b1b808be312e1e29682c, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	26553	5046	5326	5046
q2	2089	278	190	190
q3	10389	1241	683	683
q4	10218	1003	515	515
q5	7529	2400	2370	2370
q6	180	164	138	138
q7	923	741	608	608
q8	9297	1273	1136	1136
q9	6809	5052	5077	5052
q10	6822	2313	1902	1902
q11	477	288	266	266
q12	351	352	219	219
q13	17765	3722	3115	3115
q14	232	237	211	211
q15	528	489	488	488
q16	419	435	384	384
q17	607	875	354	354
q18	7688	7283	7257	7257
q19	1909	964	573	573
q20	319	338	220	220
q21	3576	3170	2345	2345
q22	1023	970	988	970
Total cold run time: 115703 ms
Total hot run time: 34042 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5205	5110	5077	5077
q2	235	323	229	229
q3	2171	2668	2327	2327
q4	1339	1861	1424	1424
q5	4486	4440	4422	4422
q6	220	167	126	126
q7	1975	1949	1792	1792
q8	2576	2669	2525	2525
q9	7251	7035	7122	7035
q10	3007	3205	2807	2807
q11	595	514	510	510
q12	688	784	609	609
q13	3524	3950	3339	3339
q14	302	295	283	283
q15	528	480	480	480
q16	453	479	451	451
q17	1187	1584	1390	1390
q18	7709	7505	7418	7418
q19	873	838	1019	838
q20	1956	1986	1868	1868
q21	5035	4513	4372	4372
q22	1100	1046	1044	1044
Total cold run time: 52415 ms
Total hot run time: 50366 ms

@doris-robot

TPC-DS: Total hot run time: 194342 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f725a4b43f6f95b20bc2b1b808be312e1e29682c, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1414	1098	1134	1098
query2	6187	1926	1882	1882
query3	11028	4517	4556	4517
query4	55793	25787	23386	23386
query5	5168	479	485	479
query6	348	207	210	207
query7	4918	500	298	298
query8	337	261	248	248
query9	6163	2683	2683	2683
query10	453	347	276	276
query11	15012	14969	14796	14796
query12	164	111	105	105
query13	1068	535	442	442
query14	10191	6169	6576	6169
query15	216	188	195	188
query16	7160	658	482	482
query17	1074	730	585	585
query18	1595	408	306	306
query19	197	195	162	162
query20	124	119	122	119
query21	207	132	108	108
query22	4373	4467	4363	4363
query23	34254	33661	33521	33521
query24	6621	2469	2460	2460
query25	505	503	443	443
query26	718	280	163	163
query27	2279	514	350	350
query28	3089	2170	2160	2160
query29	595	612	454	454
query30	278	222	194	194
query31	860	880	798	798
query32	82	66	67	66
query33	482	382	322	322
query34	844	869	555	555
query35	790	829	752	752
query36	931	988	894	894
query37	109	95	71	71
query38	4268	4332	4233	4233
query39	1575	1480	1447	1447
query40	218	124	107	107
query41	54	56	55	55
query42	131	118	109	109
query43	524	539	504	504
query44	1332	822	838	822
query45	186	181	167	167
query46	860	1057	647	647
query47	1866	1846	1794	1794
query48	399	423	327	327
query49	693	550	450	450
query50	686	726	416	416
query51	4234	4297	4453	4297
query52	115	112	104	104
query53	234	263	183	183
query54	622	598	532	532
query55	87	88	86	86
query56	303	316	303	303
query57	1171	1217	1146	1146
query58	269	266	256	256
query59	2718	2888	2719	2719
query60	346	333	317	317
query61	128	131	121	121
query62	749	776	668	668
query63	228	193	194	193
query64	1546	1025	672	672
query65	4329	4222	4224	4222
query66	718	411	306	306
query67	15948	15901	15674	15674
query68	7732	899	541	541
query69	539	319	287	287
query70	1203	1132	1123	1123
query71	497	335	294	294
query72	5469	4827	4982	4827
query73	1417	695	369	369
query74	9002	9335	9006	9006
query75	3798	3275	2765	2765
query76	4243	1206	781	781
query77	632	454	288	288
query78	10159	10123	9334	9334
query79	3310	832	585	585
query80	675	532	445	445
query81	481	261	223	223
query82	496	131	95	95
query83	349	250	232	232
query84	294	110	83	83
query85	803	370	320	320
query86	416	316	290	290
query87	4480	4576	4454	4454
query88	3327	2305	2315	2305
query89	405	311	287	287
query90	1975	212	210	210
query91	145	145	115	115
query92	73	62	54	54
query93	1822	943	585	585
query94	673	418	283	283
query95	369	292	287	287
query96	515	572	283	283
query97	3208	3228	3109	3109
query98	233	218	197	197
query99	1465	1390	1306	1306
Total cold run time: 302369 ms
Total hot run time: 194342 ms

@doris-robot

ClickBench: Total hot run time: 29.36 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f725a4b43f6f95b20bc2b1b808be312e1e29682c, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.04	0.03
query2	0.13	0.11	0.11
query3	0.26	0.19	0.20
query4	1.59	0.20	0.20
query5	0.60	0.59	0.60
query6	1.20	0.73	0.73
query7	0.02	0.02	0.01
query8	0.04	0.04	0.04
query9	0.58	0.54	0.53
query10	0.57	0.59	0.58
query11	0.15	0.11	0.12
query12	0.15	0.12	0.12
query13	0.61	0.60	0.59
query14	0.80	0.80	0.81
query15	0.89	0.86	0.85
query16	0.40	0.39	0.38
query17	1.04	1.00	1.06
query18	0.23	0.22	0.22
query19	1.90	1.80	1.81
query20	0.02	0.02	0.01
query21	15.40	0.91	0.55
query22	0.76	1.20	1.03
query23	14.69	1.38	0.64
query24	7.12	1.36	0.45
query25	0.49	0.30	0.11
query26	0.56	0.15	0.12
query27	0.05	0.05	0.05
query28	9.59	0.89	0.47
query29	12.65	3.95	3.31
query30	0.25	0.09	0.07
query31	2.81	0.59	0.39
query32	3.23	0.54	0.47
query33	3.03	3.06	3.18
query34	15.76	5.03	4.49
query35	4.48	4.60	4.47
query36	0.67	0.50	0.48
query37	0.09	0.07	0.06
query38	0.06	0.04	0.03
query39	0.04	0.03	0.02
query40	0.19	0.14	0.13
query41	0.08	0.03	0.02
query42	0.03	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 103.28 s
Total hot run time: 29.36 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/3) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 55.81% (14901/26698)
Line Coverage 44.60% (131918/295753)
Region Coverage 43.65% (66318/151924)
Branch Coverage 38.26% (33978/88800)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (3/3) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.28% (20823/26264)
Line Coverage 72.49% (214346/295689)
Region Coverage 70.67% (126090/178430)
Branch Coverage 64.41% (65320/101412)

Contributor

@liaoxin01 liaoxin01 left a comment

LGTM

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved label May 16, 2025
Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit 17a6e67 into apache:master May 19, 2025
24 of 26 checks passed
github-actions bot pushed a commit that referenced this pull request May 19, 2025
…eta to avoid jitter (#50587)

dataroaring pushed a commit that referenced this pull request May 22, 2025
… job fetch meta to avoid jitter #50587 (#51043)

Cherry-picked from #50587

Co-authored-by: hui lai <laihui@selectdb.com>
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…eta to avoid jitter (apache#50587)

@gavinchou gavinchou mentioned this pull request Jun 11, 2025

Labels

approved (Indicates a PR has been approved by one committer.), dev/2.1.x, dev/2.1.x-conflict, dev/3.0.6-merged, reviewed, usercase (Important user case type label)

5 participants