@zy-kkk
Member

@zy-kkk zy-kkk commented Nov 20, 2025

What problem does this PR solve?

For Hive tables with a massive number of partitions (10K+), INSERT operations are extremely slow because:

  • FE fetches all partition metadata directly from HMS (expensive RPC calls)
  • The full table cache is invalidated after each INSERT (unnecessary)

Problem Summary:

  1. Use cache for partition metadata in INSERT
    • FE now fetches partition info from the cache instead of querying HMS directly when preparing an INSERT
    • Avoids expensive HMS RPC calls for every INSERT operation
  2. Selective cache refresh after commit
    • Only invalidates the affected partitions instead of the full table cache
    • Based on partition update info from BE (NEW/APPEND/OVERWRITE)
    • Significantly reduces cache invalidation overhead
  3. Handle cache inconsistency gracefully
    • When BE marks a partition as NEW but it already exists in HMS (cache miss)
    • FE detects this by checking HMS and treats it as APPEND instead of failing
    • Prevents AlreadyExistsException errors

For tables with partitions:

  • Before: HMS calls per INSERT + full cache invalidation
  • After: cache lookup + selective partition refresh
  • Expected speedup: 10x-100x for the partition metadata fetching phase
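
The selective refresh described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `UpdateMode`, `PartitionUpdate`, and `SelectivePartitionCache` are hypothetical stand-ins, and marking only the partition list as stale for NEW partitions is an assumption made for the sketch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the update mode that BE reports per partition. */
enum UpdateMode { NEW, APPEND, OVERWRITE }

/** Hypothetical stand-in for one partition update, e.g. name = "dt=2025-11-20". */
final class PartitionUpdate {
    final String name;
    final UpdateMode mode;

    PartitionUpdate(String name, UpdateMode mode) {
        this.name = name;
        this.mode = mode;
    }
}

/** Toy per-table partition cache: partition name -> cached metadata. */
final class SelectivePartitionCache {
    private final Map<String, String> partitions = new HashMap<>();
    private boolean partitionListStale = false;

    void put(String name, String metadata) { partitions.put(name, metadata); }
    boolean contains(String name) { return partitions.containsKey(name); }
    boolean isPartitionListStale() { return partitionListStale; }

    /** Old behaviour: drop every cached entry for the table. */
    void invalidateAll() {
        partitions.clear();
        partitionListStale = true;
    }

    /** New behaviour: touch only what the INSERT actually changed. */
    void refreshAffected(List<PartitionUpdate> updates) {
        for (PartitionUpdate update : updates) {
            switch (update.mode) {
                case NEW:
                    // A new partition changes the table's partition list (assumption).
                    partitionListStale = true;
                    break;
                case APPEND:
                case OVERWRITE:
                    // Existing partition: only its own cached entry is stale.
                    partitions.remove(update.name);
                    break;
                default:
                    throw new IllegalStateException("unknown mode " + update.mode);
            }
        }
    }
}
```

With 10K cached partitions and an INSERT that appends to one of them, `refreshAffected` drops a single entry, whereas `invalidateAll` forces every later reader to reload the whole table.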

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Nov 20, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zy-kkk
Member Author

zy-kkk commented Nov 20, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34362 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f37442a801d36436009ff8fdb6ea6bd9a9f56a09, data reload: false

------ Round 1 ----------------------------------
q1	17606	5039	4916	4916
q2	2037	320	200	200
q3	10266	1285	700	700
q4	10225	903	364	364
q5	7483	2400	2342	2342
q6	183	166	135	135
q7	950	767	626	626
q8	9354	1362	1113	1113
q9	7179	5364	5332	5332
q10	6900	2232	1792	1792
q11	494	292	280	280
q12	373	366	234	234
q13	17790	3661	3073	3073
q14	239	237	223	223
q15	581	522	505	505
q16	990	1007	943	943
q17	597	863	371	371
q18	7909	7277	7211	7211
q19	1101	947	546	546
q20	366	352	232	232
q21	3716	2547	2262	2262
q22	1078	1024	962	962
Total cold run time: 107417 ms
Total hot run time: 34362 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4968	4945	4943	4943
q2	338	420	313	313
q3	2202	2673	2314	2314
q4	1373	1780	1341	1341
q5	4220	4380	4529	4380
q6	211	165	124	124
q7	2031	1978	1882	1882
q8	2819	2630	2589	2589
q9	7646	7647	7531	7531
q10	3037	3232	2876	2876
q11	594	540	503	503
q12	687	783	648	648
q13	3772	3943	3334	3334
q14	286	306	278	278
q15	542	508	508	508
q16	1079	1102	1063	1063
q17	1205	1497	1429	1429
q18	7980	7685	7586	7586
q19	792	789	868	789
q20	1980	1966	1838	1838
q21	4769	4399	4370	4370
q22	1063	1041	998	998
Total cold run time: 53594 ms
Total hot run time: 51637 ms

@doris-robot

TPC-DS: Total hot run time: 188373 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f37442a801d36436009ff8fdb6ea6bd9a9f56a09, data reload: false

query1	1041	413	414	413
query2	6573	1692	1727	1692
query3	6755	230	222	222
query4	26026	22867	22984	22867
query5	4457	669	513	513
query6	337	234	220	220
query7	4648	505	311	311
query8	320	280	243	243
query9	8675	2935	2902	2902
query10	486	358	320	320
query11	15617	15030	14805	14805
query12	174	126	117	117
query13	1668	582	450	450
query14	10201	9197	9032	9032
query15	195	185	184	184
query16	7154	671	491	491
query17	1000	798	663	663
query18	1978	411	321	321
query19	210	199	178	178
query20	131	130	126	126
query21	216	134	116	116
query22	4016	4185	4016	4016
query23	33834	33064	32958	32958
query24	8211	2396	2375	2375
query25	633	533	460	460
query26	1241	270	167	167
query27	2759	496	362	362
query28	4430	2242	2216	2216
query29	858	656	556	556
query30	300	221	197	197
query31	910	820	755	755
query32	96	91	84	84
query33	620	417	364	364
query34	799	872	525	525
query35	829	838	777	777
query36	943	995	894	894
query37	135	128	102	102
query38	3523	3681	3501	3501
query39	1499	1442	1421	1421
query40	234	144	133	133
query41	69	67	64	64
query42	136	122	121	121
query43	461	488	467	467
query44	1257	813	802	802
query45	183	184	174	174
query46	892	992	662	662
query47	1736	1814	1704	1704
query48	411	448	335	335
query49	770	509	435	435
query50	667	690	414	414
query51	3853	4058	4082	4058
query52	119	122	113	113
query53	258	267	216	216
query54	358	356	330	330
query55	98	98	99	98
query56	378	377	363	363
query57	1163	1189	1114	1114
query58	313	295	289	289
query59	2530	2604	2540	2540
query60	384	386	362	362
query61	197	184	196	184
query62	786	716	693	693
query63	233	201	205	201
query64	4675	1260	871	871
query65	3998	3915	3934	3915
query66	1182	454	348	348
query67	15435	15186	14886	14886
query68	8266	956	642	642
query69	491	334	299	299
query70	1314	1312	1304	1304
query71	470	350	349	349
query72	6052	4882	4830	4830
query73	672	578	371	371
query74	8896	8835	8932	8835
query75	3875	3222	2777	2777
query76	3725	1163	768	768
query77	812	432	338	338
query78	9573	9433	8821	8821
query79	2397	904	621	621
query80	644	607	537	537
query81	514	260	226	226
query82	510	168	142	142
query83	262	272	249	249
query84	301	117	91	91
query85	945	493	455	455
query86	396	323	299	299
query87	3696	3686	3602	3602
query88	3845	2244	2237	2237
query89	378	321	304	304
query90	1880	240	230	230
query91	166	174	146	146
query92	93	78	74	74
query93	1862	1011	691	691
query94	728	428	332	332
query95	448	344	347	344
query96	481	570	283	283
query97	2898	2967	2874	2874
query98	262	225	215	215
query99	1620	1404	1279	1279
Total cold run time: 274746 ms
Total hot run time: 188373 ms

@doris-robot

ClickBench: Total hot run time: 27.48 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f37442a801d36436009ff8fdb6ea6bd9a9f56a09, data reload: false

query1	0.05	0.05	0.05
query2	0.10	0.05	0.05
query3	0.26	0.08	0.08
query4	1.60	0.11	0.11
query5	0.27	0.25	0.25
query6	1.18	0.64	0.65
query7	0.03	0.02	0.03
query8	0.06	0.04	0.04
query9	0.58	0.53	0.52
query10	0.57	0.58	0.58
query11	0.15	0.11	0.11
query12	0.15	0.12	0.12
query13	0.63	0.60	0.59
query14	1.02	0.99	1.01
query15	0.85	0.82	0.82
query16	0.40	0.41	0.39
query17	1.04	1.04	1.02
query18	0.21	0.20	0.20
query19	1.83	1.82	1.84
query20	0.01	0.01	0.01
query21	15.46	0.22	0.14
query22	5.06	0.06	0.04
query23	15.67	0.27	0.10
query24	3.12	0.65	0.41
query25	0.07	0.06	0.06
query26	0.15	0.14	0.13
query27	0.06	0.05	0.05
query28	3.53	1.17	0.95
query29	12.58	3.84	3.21
query30	0.28	0.14	0.11
query31	2.81	0.58	0.40
query32	3.23	0.55	0.47
query33	2.95	3.11	3.06
query34	15.66	5.15	4.55
query35	4.56	4.56	4.57
query36	0.69	0.50	0.49
query37	0.10	0.06	0.07
query38	0.07	0.04	0.04
query39	0.03	0.02	0.02
query40	0.16	0.16	0.13
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 97.4 s
Total hot run time: 27.48 s

Copilot AI left a comment

Pull Request Overview

This PR optimizes Hive INSERT performance on partitioned tables by leveraging cache instead of direct HMS queries and implementing selective cache refresh. The key problem addressed is that for tables with 10K+ partitions, INSERT operations are extremely slow due to expensive HMS RPC calls and full table cache invalidation.

Key changes:

  • Modified partition metadata retrieval during INSERT to use cache (HiveTableSink.java) instead of direct HMS queries
  • Implemented selective partition cache refresh after commit based on BE update info (NEW/APPEND/OVERWRITE)
  • Added cache miss detection and graceful handling when BE marks partition as NEW but it already exists in HMS

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| test_hive_partitions.groovy | Adds test case for cache miss scenario where Hive creates a partition that Doris then writes to |
| HiveTableSink.java | Changes partition retrieval from HMS client to cache-based approach with profiling |
| HiveInsertExecutor.java | Adds partition update tracking and selective cache refresh after commit |
| BaseExternalTableInsertExecutor.java | Introduces doAfterCommit() hook with default full table refresh behavior |
| HiveMetaStoreCache.java | Implements refreshAffectedPartitions() for selective partition cache invalidation |
| HMSTransaction.java | Adds HMS existence check for NEW partitions to handle cache misses gracefully |
| SummaryProfile.java | Adds profiling metrics for sink partition value setting operation |
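
The `doAfterCommit()` hook mentioned in the summary reads like a template-method pattern. A minimal sketch of that shape, under assumptions: the class names below are hypothetical, the real executors carry far more state, and the fallback to a full refresh when no partitions were touched is a guess rather than confirmed behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base executor: by default the whole table is refreshed after commit. */
abstract class SketchInsertExecutor {
    /** Records what happened, so the control flow is easy to inspect. */
    final List<String> actions = new ArrayList<>();

    final void execute() {
        actions.add("commit");
        doAfterCommit();
    }

    /** Default hook: full table refresh. Subclasses may narrow it. */
    protected void doAfterCommit() {
        actions.add("refresh-table");
    }
}

/** Executor that keeps the default behaviour. */
class SketchDefaultExecutor extends SketchInsertExecutor {
}

/** Hive-style executor that refreshes only the partitions the INSERT touched. */
class SketchHiveExecutor extends SketchInsertExecutor {
    private final List<String> touchedPartitions;

    SketchHiveExecutor(List<String> touchedPartitions) {
        this.touchedPartitions = touchedPartitions;
    }

    @Override
    protected void doAfterCommit() {
        if (touchedPartitions.isEmpty()) {
            // Assumption: with no partition info, fall back to the full refresh.
            super.doAfterCommit();
            return;
        }
        for (String partition : touchedPartitions) {
            actions.add("refresh-partition:" + partition);
        }
    }
}
```

Keeping the full refresh as the default means other external table executors behave as before unless they override the hook.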

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 87.50% (98/112) 🎉
Increment coverage report
Complete coverage report

boolean hasNewPartitions = false;

for (org.apache.doris.thrift.THivePartitionUpdate update : partitionUpdates) {
    List<String> partitionValues = HiveUtil.toPartitionValues(update.getName());
Contributor

Move this line to case OVERWRITE:


// Refresh partition values cache if new partitions were created
if (hasNewPartitions) {
    invalidatePartitionValuesCache(nameMapping);
Contributor

Can we just modify the cached value instead of invalidating all of it?
Otherwise, an insert that creates a new partition always invalidates the whole cache.
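
One way to read this suggestion is to merge the new partition names into the cached value in place. The sketch below only illustrates that idea: `PartitionValuesCacheSketch` and its methods are hypothetical, and the real partition values cache holds richer structures than a set of names.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy cache of partition names per table (hypothetical, for illustration only). */
final class PartitionValuesCacheSketch {
    private final Map<String, Set<String>> namesByTable = new HashMap<>();

    /** Simulates loading the full partition list for a table. */
    void load(String table, Collection<String> names) {
        namesByTable.put(table, new TreeSet<>(names));
    }

    /** Returns the cached names, or null when the table is not cached. */
    Set<String> get(String table) {
        return namesByTable.get(table);
    }

    /** Invalidation: the next reader has to reload the whole list. */
    void invalidate(String table) {
        namesByTable.remove(table);
    }

    /** In-place alternative: merge only the newly created partitions. */
    void addPartitions(String table, Collection<String> newNames) {
        Set<String> cached = namesByTable.get(table);
        if (cached != null) {
            cached.addAll(newNames);
        }
        // Not cached: nothing to patch, the next read loads a fresh list anyway.
    }
}
```

The trade-off is that an in-place update must produce exactly what a reload would, so any derived structures kept next to the names need the same patch.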

    LOG.info("Partition {} already exists in HMS (Doris cache miss), treating as APPEND",
            pu.getName());
    insertExistsPartitions.add(Pair.of(pu, hivePartitionStatistics));
} else {
Contributor

This else block is the same as case OVERWRITE.
Extract a method for it.
Take care of addPartition() and dropPartition().
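
The refactoring asked for here could look roughly like the sketch below. It is an illustration under assumptions, not the actual HMSTransaction code: `PartitionCommitPlanner` is hypothetical, metastore calls are replaced by strings in a list, and it assumes OVERWRITE drops the old partition before adding the replacement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical planner that decides which metastore operation each update needs. */
final class PartitionCommitPlanner {
    enum Mode { NEW, APPEND, OVERWRITE }

    /** Recorded operations in order, e.g. "add:dt=1". */
    final List<String> ops = new ArrayList<>();
    private final Set<String> existingInMetastore;

    PartitionCommitPlanner(Set<String> existingInMetastore) {
        this.existingInMetastore = existingInMetastore;
    }

    void plan(String partition, Mode mode) {
        switch (mode) {
            case APPEND:
                ops.add("append:" + partition);
                break;
            case NEW:
                if (existingInMetastore.contains(partition)) {
                    // Cache miss: the partition was created elsewhere, so append instead of failing.
                    ops.add("append:" + partition);
                } else {
                    createPartition(partition, false);
                }
                break;
            case OVERWRITE:
                createPartition(partition, true);
                break;
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    /** Shared by NEW and OVERWRITE so the two branches cannot drift apart. */
    private void createPartition(String partition, boolean dropFirst) {
        if (dropFirst) {
            ops.add("drop:" + partition); // assumption: OVERWRITE drops before adding
        }
        ops.add("add:" + partition);
    }
}
```

The `dropFirst` flag is the only difference between the two callers, which is the point of the reviewer's warning about addPartition() and dropPartition().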

if (partitionUpdates != null && !partitionUpdates.isEmpty()) {
    HiveMetaStoreCache cache = Env.getCurrentEnv().getExtMetaCacheMgr()
            .getMetaStoreCache((HMSExternalCatalog) ((HMSExternalTable) table).getCatalog());
    cache.refreshAffectedPartitions((HMSExternalTable) table, partitionUpdates);
Contributor

Previously, Env.getCurrentEnv().getRefreshManager().handleRefreshTable() wrote an edit log entry, so that non-master FEs would get the latest partition info.
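
The concern is that a refresh applied only on the master leaves the other frontends with stale caches. A toy model of the replay mechanism, assuming a simple list-based log: `FrontendSketch` and `EditLogSketch` are hypothetical and do not mirror the real edit log classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy frontend holding a set of cached partition names. */
final class FrontendSketch {
    final Set<String> cached = new HashSet<>();

    /** Drops the given partitions from this node's cache. */
    void applyRefresh(List<String> partitions) {
        cached.removeAll(partitions);
    }
}

/** Toy edit log: the master appends entries, followers replay them in order. */
final class EditLogSketch {
    private final List<List<String>> entries = new ArrayList<>();

    void append(List<String> refreshedPartitions) {
        // Copy the list so later changes by the caller do not alter the log.
        entries.add(new ArrayList<>(refreshedPartitions));
    }

    void replayOn(FrontendSketch follower) {
        for (List<String> entry : entries) {
            follower.applyRefresh(entry);
        }
    }
}
```

If the master calls only `applyRefresh` and skips `append`, a follower keeps serving its old entry, which is the regression this comment points at.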

for (org.apache.hadoop.hive.metastore.api.Partition partition : hivePartitions) {

// Get partitions from cache instead of HMS client (similar to HiveScanNode)
HiveMetaStoreCache cache = Env.getCurrentEnv().getExtMetaCacheMgr()
Contributor

Move this line to if (targetTable.isPartitionedTable())

// Get all partition values from cache
List<Type> partitionColumnTypes = targetTable.getPartitionColumnTypes(
        MvccUtil.getSnapshotFromContext(targetTable));
HiveMetaStoreCache.HivePartitionValues partitionValues =
Contributor

Rebase and use targetTable.getHivePartitionValues() directly

@zy-kkk zy-kkk force-pushed the hms_sink_partition branch from 2908fe8 to 6c6b4ca Compare November 24, 2025 03:59
@zy-kkk
Member Author

zy-kkk commented Nov 24, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 34900 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6c6b4ca7ab1660349b505067613960c32684a28c, data reload: false

------ Round 1 ----------------------------------
q1	17612	5129	4900	4900
q2	2014	320	208	208
q3	10247	1277	736	736
q4	10224	905	373	373
q5	7543	2348	2323	2323
q6	182	173	141	141
q7	920	797	628	628
q8	9361	1338	1059	1059
q9	7053	5490	5497	5490
q10	6872	2250	1813	1813
q11	495	319	269	269
q12	338	361	227	227
q13	17802	3649	3013	3013
q14	224	233	209	209
q15	579	499	516	499
q16	1020	982	943	943
q17	594	856	367	367
q18	7338	7370	7707	7370
q19	1539	989	603	603
q20	356	364	237	237
q21	4012	3352	2434	2434
q22	1125	1058	1071	1058
Total cold run time: 107450 ms
Total hot run time: 34900 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5291	5151	5088	5088
q2	328	399	314	314
q3	2390	2938	2483	2483
q4	1426	1882	1368	1368
q5	4527	4466	4413	4413
q6	211	163	130	130
q7	1997	1941	1801	1801
q8	2593	2490	2505	2490
q9	7596	7619	7404	7404
q10	2866	3073	2653	2653
q11	563	518	478	478
q12	616	697	585	585
q13	3275	3686	3024	3024
q14	259	301	258	258
q15	537	508	482	482
q16	1012	1031	999	999
q17	1096	1455	1322	1322
q18	7313	7157	7069	7069
q19	743	762	797	762
q20	1906	1934	1860	1860
q21	4778	4517	4224	4224
q22	1086	1035	980	980
Total cold run time: 52409 ms
Total hot run time: 50187 ms

@doris-robot

TPC-DS: Total hot run time: 187095 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6c6b4ca7ab1660349b505067613960c32684a28c, data reload: false

query1	1057	440	390	390
query2	6563	1713	1683	1683
query3	6755	221	220	220
query4	25875	23097	22923	22923
query5	4395	614	470	470
query6	330	239	216	216
query7	4662	490	294	294
query8	313	247	244	244
query9	8708	2625	2605	2605
query10	480	336	292	292
query11	15475	14943	14863	14863
query12	177	117	111	111
query13	1677	577	443	443
query14	10828	9159	9125	9125
query15	201	184	172	172
query16	7247	662	487	487
query17	1233	740	657	657
query18	1974	463	318	318
query19	212	204	173	173
query20	136	120	119	119
query21	218	134	115	115
query22	3916	4164	4010	4010
query23	33759	33073	32864	32864
query24	8523	2401	2352	2352
query25	670	521	432	432
query26	1240	270	167	167
query27	2745	490	370	370
query28	4415	2205	2188	2188
query29	844	627	481	481
query30	304	228	205	205
query31	910	799	733	733
query32	86	77	82	77
query33	601	381	319	319
query34	800	856	515	515
query35	798	831	739	739
query36	944	989	880	880
query37	117	112	88	88
query38	3525	3527	3495	3495
query39	1467	1418	1392	1392
query40	225	129	118	118
query41	65	62	63	62
query42	125	119	110	110
query43	486	484	470	470
query44	1263	769	779	769
query45	187	180	180	180
query46	878	1002	646	646
query47	1739	1769	1712	1712
query48	415	438	341	341
query49	809	505	425	425
query50	674	701	426	426
query51	4032	3905	3860	3860
query52	114	114	111	111
query53	260	269	201	201
query54	322	304	295	295
query55	90	89	86	86
query56	334	340	343	340
query57	1176	1189	1127	1127
query58	294	286	279	279
query59	2519	2621	2509	2509
query60	355	373	358	358
query61	200	196	194	194
query62	793	706	669	669
query63	234	199	201	199
query64	4609	1153	851	851
query65	4010	3935	3934	3934
query66	1186	440	327	327
query67	15144	15076	14907	14907
query68	5793	920	629	629
query69	491	320	291	291
query70	1368	1294	1225	1225
query71	429	333	320	320
query72	6004	5051	4944	4944
query73	666	618	373	373
query74	8943	9037	8636	8636
query75	3313	3351	2856	2856
query76	3314	1155	758	758
query77	531	408	322	322
query78	9396	9540	8871	8871
query79	2393	809	621	621
query80	752	621	524	524
query81	519	265	230	230
query82	386	165	135	135
query83	277	271	262	262
query84	260	124	102	102
query85	1017	498	447	447
query86	383	315	310	310
query87	3750	3733	3651	3651
query88	3651	2226	2250	2226
query89	379	333	294	294
query90	1989	244	225	225
query91	178	164	134	134
query92	81	67	63	63
query93	2642	999	677	677
query94	698	446	338	338
query95	414	324	318	318
query96	476	582	280	280
query97	2890	2960	2853	2853
query98	241	209	209	209
query99	1337	1403	1238	1238
Total cold run time: 271679 ms
Total hot run time: 187095 ms

@doris-robot

ClickBench: Total hot run time: 27.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6c6b4ca7ab1660349b505067613960c32684a28c, data reload: false

query1	0.06	0.05	0.04
query2	0.09	0.05	0.05
query3	0.26	0.09	0.08
query4	1.61	0.12	0.11
query5	0.27	0.25	0.26
query6	1.17	0.64	0.64
query7	0.03	0.02	0.02
query8	0.05	0.04	0.04
query9	0.57	0.53	0.52
query10	0.58	0.57	0.58
query11	0.16	0.11	0.11
query12	0.14	0.11	0.13
query13	0.61	0.61	0.62
query14	1.00	0.99	0.99
query15	0.85	0.83	0.83
query16	0.38	0.38	0.40
query17	0.99	1.02	1.03
query18	0.21	0.20	0.20
query19	1.91	1.78	1.80
query20	0.02	0.01	0.02
query21	15.47	0.19	0.14
query22	5.03	0.08	0.05
query23	15.66	0.26	0.10
query24	2.61	0.89	0.31
query25	0.07	0.07	0.06
query26	0.13	0.14	0.13
query27	0.07	0.06	0.05
query28	4.08	1.15	0.93
query29	12.56	3.93	3.29
query30	0.28	0.14	0.12
query31	2.82	0.58	0.38
query32	3.24	0.56	0.47
query33	2.98	3.04	3.03
query34	15.80	5.18	4.54
query35	4.53	4.57	4.60
query36	0.66	0.49	0.49
query37	0.10	0.06	0.07
query38	0.06	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.14	0.14
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 97.49 s
Total hot run time: 27.39 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 26.27% (31/118) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 84.75% (100/118) 🎉
Increment coverage report
Complete coverage report

@zy-kkk
Member Author

zy-kkk commented Nov 25, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 33961 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f988f83353e0a7e0ee3d3447cef9e208469e17e8, data reload: false

------ Round 1 ----------------------------------
q1	17731	5081	4926	4926
q2	2134	309	212	212
q3	10141	1296	711	711
q4	10230	948	367	367
q5	7554	2285	2411	2285
q6	188	166	133	133
q7	930	821	636	636
q8	9587	1374	1061	1061
q9	7161	5357	5274	5274
q10	6866	2242	1795	1795
q11	496	313	289	289
q12	343	354	218	218
q13	17774	3688	3027	3027
q14	232	238	233	233
q15	589	517	511	511
q16	1042	1020	958	958
q17	591	839	358	358
q18	7423	7175	6998	6998
q19	1411	959	530	530
q20	351	343	233	233
q21	3671	3141	2272	2272
q22	1047	1019	934	934
Total cold run time: 107492 ms
Total hot run time: 33961 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5235	4996	4925	4925
q2	336	396	297	297
q3	2168	2751	2271	2271
q4	1363	1756	1334	1334
q5	4230	4551	4519	4519
q6	219	176	137	137
q7	2041	1914	1862	1862
q8	2631	2541	2584	2541
q9	7510	7562	7488	7488
q10	2983	3332	2798	2798
q11	625	540	494	494
q12	703	779	615	615
q13	3470	3925	3514	3514
q14	283	316	278	278
q15	543	499	512	499
q16	1121	1108	1082	1082
q17	1160	1560	1436	1436
q18	7805	7639	7762	7639
q19	807	858	991	858
q20	2013	2112	1835	1835
q21	4739	4286	4281	4281
q22	1079	1090	1016	1016
Total cold run time: 53064 ms
Total hot run time: 51719 ms

@doris-robot

TPC-DS: Total hot run time: 184561 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f988f83353e0a7e0ee3d3447cef9e208469e17e8, data reload: false

query1	1099	408	393	393
query2	6594	1605	1614	1605
query3	6757	222	219	219
query4	25099	22692	22784	22692
query5	4391	605	470	470
query6	331	237	226	226
query7	4668	490	302	302
query8	306	260	257	257
query9	8752	2627	2620	2620
query10	544	352	317	317
query11	15422	14900	14621	14621
query12	226	118	114	114
query13	1699	585	462	462
query14	10640	8968	8825	8825
query15	211	205	186	186
query16	7251	682	524	524
query17	1268	724	613	613
query18	1988	422	322	322
query19	209	201	181	181
query20	125	122	119	119
query21	242	140	114	114
query22	3940	3991	3781	3781
query23	33051	32033	32078	32033
query24	8497	2430	2410	2410
query25	618	526	457	457
query26	1236	273	160	160
query27	2770	508	352	352
query28	4402	2190	2172	2172
query29	825	612	478	478
query30	332	235	206	206
query31	866	742	636	636
query32	87	73	68	68
query33	582	367	344	344
query34	787	864	535	535
query35	815	846	776	776
query36	930	959	866	866
query37	139	106	84	84
query38	3322	3367	3271	3271
query39	1459	1416	1405	1405
query40	233	128	119	119
query41	65	64	63	63
query42	127	115	114	114
query43	467	483	446	446
query44	1242	767	755	755
query45	211	197	188	188
query46	893	988	652	652
query47	1689	1757	1658	1658
query48	398	433	329	329
query49	795	498	434	434
query50	641	682	410	410
query51	3892	3930	3958	3930
query52	111	116	100	100
query53	241	263	187	187
query54	308	293	273	273
query55	86	86	82	82
query56	324	310	329	310
query57	1152	1187	1102	1102
query58	284	277	274	274
query59	2438	2507	2414	2414
query60	349	362	336	336
query61	167	157	162	157
query62	799	718	667	667
query63	226	190	208	190
query64	4591	1234	890	890
query65	4056	3953	3999	3953
query66	1170	435	324	324
query67	15267	15266	14848	14848
query68	4710	914	658	658
query69	525	343	305	305
query70	1300	1240	1157	1157
query71	399	341	316	316
query72	6169	5090	5072	5072
query73	653	581	378	378
query74	8534	8503	8615	8503
query75	3320	3364	2848	2848
query76	3257	1120	725	725
query77	516	410	342	342
query78	9747	9747	8848	8848
query79	2286	833	603	603
query80	1658	579	521	521
query81	567	271	241	241
query82	488	160	131	131
query83	349	264	264	264
query84	253	120	103	103
query85	935	490	439	439
query86	470	301	306	301
query87	3487	3505	3330	3330
query88	2827	2274	2278	2274
query89	388	330	293	293
query90	1838	224	220	220
query91	180	172	145	145
query92	80	65	59	59
query93	2087	975	656	656
query94	791	443	339	339
query95	491	407	401	401
query96	496	576	278	278
query97	2888	2963	2872	2872
query98	231	212	211	211
query99	1276	1397	1307	1307
Total cold run time: 267898 ms
Total hot run time: 184561 ms

@doris-robot

ClickBench: Total hot run time: 27.74 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f988f83353e0a7e0ee3d3447cef9e208469e17e8, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.05	0.04
query3	0.26	0.09	0.08
query4	1.60	0.12	0.11
query5	0.27	0.26	0.26
query6	1.20	0.64	0.66
query7	0.03	0.03	0.02
query8	0.06	0.04	0.04
query9	0.58	0.52	0.52
query10	0.58	0.56	0.57
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.62	0.60	0.60
query14	1.02	0.99	1.00
query15	0.84	0.86	0.83
query16	0.38	0.38	0.40
query17	1.01	1.00	1.02
query18	0.22	0.20	0.20
query19	1.94	1.82	1.80
query20	0.02	0.01	0.03
query21	15.44	0.20	0.13
query22	5.10	0.07	0.05
query23	15.65	0.26	0.10
query24	2.34	0.91	0.65
query25	0.07	0.06	0.07
query26	0.15	0.14	0.13
query27	0.06	0.06	0.06
query28	4.90	1.13	0.94
query29	12.60	3.97	3.18
query30	0.28	0.13	0.12
query31	2.81	0.57	0.38
query32	3.24	0.55	0.46
query33	3.07	3.10	3.10
query34	15.90	5.14	4.58
query35	4.58	4.57	4.63
query36	0.68	0.50	0.50
query37	0.10	0.07	0.07
query38	0.06	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.14	0.14
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 98.51 s
Total hot run time: 27.74 s

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 84.75% (100/118) 🎉
Increment coverage report
Complete coverage report

@zy-kkk zy-kkk force-pushed the hms_sink_partition branch from f988f83 to 992bca6 Compare November 27, 2025 07:23
@zy-kkk
Member Author

zy-kkk commented Nov 27, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 35248 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 992bca6e7a2be03c9ca5d5a8c575bce69f0a4c33, data reload: false

------ Round 1 ----------------------------------
q1	17624	5100	4942	4942
q2	2078	307	207	207
q3	10229	1279	734	734
q4	10221	909	374	374
q5	7524	2272	2395	2272
q6	185	174	144	144
q7	932	806	665	665
q8	9381	1395	1201	1201
q9	7236	5408	5367	5367
q10	6920	2239	1828	1828
q11	508	310	289	289
q12	374	371	232	232
q13	17788	3723	3052	3052
q14	242	235	215	215
q15	577	527	514	514
q16	1045	1058	979	979
q17	610	876	383	383
q18	7341	7222	7001	7001
q19	1210	950	586	586
q20	369	346	235	235
q21	4112	3336	3079	3079
q22	1017	1009	949	949
Total cold run time: 107523 ms
Total hot run time: 35248 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5087	5055	5021	5021
q2	328	387	315	315
q3	2200	2723	2314	2314
q4	1346	1784	1316	1316
q5	4357	4525	4545	4525
q6	202	183	143	143
q7	2083	2032	1883	1883
q8	2666	2689	2578	2578
q9	7555	7426	7586	7426
q10	3061	3262	2862	2862
q11	598	528	494	494
q12	743	807	622	622
q13	3576	4193	3235	3235
q14	282	299	265	265
q15	531	517	523	517
q16	1135	1160	1160	1160
q17	1217	1606	1429	1429
q18	7938	7754	7520	7520
q19	877	942	1079	942
q20	1991	2022	1836	1836
q21	4809	4228	4388	4228
q22	1068	1033	1028	1028
Total cold run time: 53650 ms
Total hot run time: 51659 ms

@doris-robot

TPC-DS: Total hot run time: 182146 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 069ef57cb63b9a043fa2a08fcf29006c4e147c38, data reload: false

query1	1044	443	390	390
query2	6567	1197	1192	1192
query3	6759	230	221	221
query4	25072	23500	22907	22907
query5	4441	646	497	497
query6	345	240	224	224
query7	4644	501	307	307
query8	320	265	248	248
query9	8706	2624	2641	2624
query10	518	348	326	326
query11	15261	15135	15112	15112
query12	183	125	121	121
query13	1697	573	448	448
query14	9316	5969	6041	5969
query15	210	200	190	190
query16	7317	705	533	533
query17	1263	789	657	657
query18	2022	442	352	352
query19	214	211	195	195
query20	129	127	125	125
query21	220	138	121	121
query22	3801	3981	3856	3856
query23	32812	32114	31876	31876
query24	8649	2426	2417	2417
query25	662	566	497	497
query26	1248	281	175	175
query27	2735	488	351	351
query28	4340	2174	2161	2161
query29	844	681	545	545
query30	318	242	211	211
query31	869	767	622	622
query32	88	86	78	78
query33	641	398	390	390
query34	815	877	539	539
query35	803	840	729	729
query36	881	927	838	838
query37	119	107	89	89
query38	3849	3843	3762	3762
query39	1462	1403	1402	1402
query40	228	128	119	119
query41	68	65	66	65
query42	131	109	114	109
query43	448	458	432	432
query44	1302	766	751	751
query45	198	193	185	185
query46	879	1003	644	644
query47	1677	1727	1645	1645
query48	395	428	329	329
query49	759	500	426	426
query50	646	685	413	413
query51	4033	3911	4012	3911
query52	118	112	105	105
query53	247	263	203	203
query54	312	302	276	276
query55	104	99	93	93
query56	345	329	314	314
query57	1157	1168	1106	1106
query58	287	284	276	276
query59	2386	2409	2274	2274
query60	356	358	343	343
query61	162	161	160	160
query62	786	698	681	681
query63	232	200	201	200
query64	4559	1229	907	907
query65	4093	3987	3962	3962
query66	1170	445	343	343
query67	15593	15076	14809	14809
query68	8373	966	640	640
query69	524	346	307	307
query70	1100	1035	1022	1022
query71	485	351	319	319
query72	5829	5002	4677	4677
query73	706	572	345	345
query74	8904	8883	8631	8631
query75	3616	3036	2560	2560
query76	3750	1148	744	744
query77	806	416	324	324
query78	9467	9687	8804	8804
query79	1546	841	583	583
query80	639	596	507	507
query81	503	270	243	243
query82	422	142	117	117
query83	270	269	249	249
query84	257	110	101	101
query85	915	484	450	450
query86	341	303	281	281
query87	4166	4080	4041	4041
query88	2902	2329	2323	2323
query89	396	332	294	294
query90	1942	234	230	230
query91	173	169	138	138
query92	85	73	66	66
query93	1157	976	666	666
query94	716	443	338	338
query95	520	426	417	417
query96	517	527	291	291
query97	2633	2711	2566	2566
query98	235	226	209	209
query99	1400	1370	1285	1285
Total cold run time: 269150 ms
Total hot run time: 182146 ms

@doris-robot

ClickBench: Total hot run time: 27.2 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 069ef57cb63b9a043fa2a08fcf29006c4e147c38, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.05	0.05
query3	0.25	0.08	0.08
query4	1.61	0.11	0.11
query5	0.27	0.24	0.26
query6	1.16	0.65	0.63
query7	0.03	0.03	0.03
query8	0.06	0.05	0.05
query9	0.58	0.51	0.50
query10	0.56	0.56	0.55
query11	0.16	0.11	0.11
query12	0.15	0.12	0.12
query13	0.61	0.62	0.60
query14	0.99	0.99	0.98
query15	0.80	0.79	0.79
query16	0.42	0.39	0.40
query17	0.96	1.02	1.01
query18	0.25	0.24	0.22
query19	1.83	1.87	1.77
query20	0.01	0.01	0.01
query21	15.47	0.26	0.14
query22	4.92	0.06	0.05
query23	16.01	0.26	0.10
query24	1.72	0.25	0.36
query25	0.10	0.06	0.05
query26	0.15	0.14	0.13
query27	0.06	0.06	0.05
query28	3.92	1.24	1.03
query29	12.65	3.91	3.24
query30	0.28	0.13	0.11
query31	2.81	0.60	0.40
query32	3.23	0.56	0.47
query33	3.05	3.06	3.02
query34	16.60	5.17	4.50
query35	4.61	4.57	4.55
query36	0.65	0.49	0.48
query37	0.10	0.07	0.07
query38	0.07	0.05	0.03
query39	0.05	0.03	0.03
query40	0.18	0.13	0.14
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.05	0.04	0.03
Total cold run time: 97.66 s
Total hot run time: 27.2 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 30.28% (43/142) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 78.17% (111/142) 🎉
Increment coverage report
Complete coverage report

@morningman morningman merged commit bb4122a into apache:master Dec 1, 2025
27 of 28 checks passed
github-actions bot pushed a commit that referenced this pull request Dec 1, 2025
)

### What problem does this PR solve?

For Hive tables with massive partitions (10K+), INSERT operations are
extremely slow because:
  - FE fetches all partition metadata from HMS directly (expensive RPC
    calls)
  - Full table cache invalidation after each insert (unnecessary)


Problem Summary:

1. **Use cache for partition metadata in INSERT**
  - FE now fetches partition info from cache instead of directly querying
    HMS when preparing INSERT
  - Avoid expensive HMS RPC calls for every INSERT operation

2. **Selective cache refresh after commit**
  - Only invalidate affected partitions instead of full table cache
  - Based on partition update info from BE (NEW/APPEND/OVERWRITE)
  - Significantly reduces cache invalidation overhead

3. **Handle cache inconsistency gracefully**
  - When BE marks partition as NEW but it already exists in HMS (cache
    miss)
  - FE detects this by checking HMS and treats it as APPEND instead of
    failing
  - Prevents `AlreadyExistsException` errors

For tables with partitions:
  - **Before**: HMS calls per INSERT + full cache invalidation
  - **After**: cache lookup + selective partition refresh
  - Expected speedup: 10x-100x for partition metadata fetching phas
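
The commit-time logic described in points 2 and 3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Doris FE code: `PartitionCommitSketch`, `PartitionUpdate`, `Metastore`, `reconcile`, and `partitionsToRefresh` are hypothetical names, and the metastore is reduced to a single existence check.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Doris code base.
class PartitionCommitSketch {
    enum UpdateMode { NEW, APPEND, OVERWRITE }

    // One partition touched by an INSERT, as reported by the BE.
    static final class PartitionUpdate {
        final String name;
        final UpdateMode mode;

        PartitionUpdate(String name, UpdateMode mode) {
            this.name = name;
            this.mode = mode;
        }

        @Override
        public String toString() {
            return name + ":" + mode;
        }
    }

    // Stand-in for HMS, reduced to the one question the commit needs answered.
    interface Metastore {
        boolean partitionExists(String partitionName);
    }

    // Downgrade NEW to APPEND when the metastore already has the partition,
    // so the commit does not try to create it a second time.
    static List<PartitionUpdate> reconcile(List<PartitionUpdate> updates, Metastore hms) {
        List<PartitionUpdate> result = new ArrayList<>();
        for (PartitionUpdate u : updates) {
            if (u.mode == UpdateMode.NEW && hms.partitionExists(u.name)) {
                result.add(new PartitionUpdate(u.name, UpdateMode.APPEND));
            } else {
                result.add(u);
            }
        }
        return result;
    }

    // Only the partitions named in the updates need their cache entries refreshed.
    static Set<String> partitionsToRefresh(List<PartitionUpdate> updates) {
        Set<String> names = new HashSet<>();
        for (PartitionUpdate u : updates) {
            names.add(u.name);
        }
        return names;
    }

    public static void main(String[] args) {
        Set<String> existingInHms = Set.of("dt=2025-11-19");
        List<PartitionUpdate> reported = List.of(
                new PartitionUpdate("dt=2025-11-19", UpdateMode.NEW),
                new PartitionUpdate("dt=2025-11-20", UpdateMode.NEW));
        List<PartitionUpdate> fixed = reconcile(reported, existingInHms::contains);
        System.out.println(fixed);
        System.out.println(partitionsToRefresh(fixed));
    }
}
```

In this sketch the existence check runs only for partitions reported as NEW, so the extra metastore call is limited to the case where the cache and HMS may disagree.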
@zy-kkk zy-kkk deleted the hms_sink_partition branch December 1, 2025 02:43
yiguolei pushed a commit that referenced this pull request Dec 2, 2025
…g cache #58166 (#58540)

Cherry-picked from #58166

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
morningman pushed a commit that referenced this pull request Dec 3, 2025
…les (#58606)

### Problem

Reproduction Steps: Create a Hive Catalog, create an unpartitioned
table, then insert data. The following failure occurs.

```
copy file failed: software.amazon.awssdk.services.s3.model.NoSuchKeyException: The specified key does not exist. (Service: S3, Status Code: 404,
```

The BE mistakenly treats non-partitioned tables as partitioned ones. For
partitioned tables, the system always appends a folder suffix for each
partition, organizing data into partition directories. However,
non-partitioned tables do not require partition information. In this
case, the BE incorrectly added a partition folder suffix for
non-partitioned tables, causing the insert operation to fail.

### Solution
- Skip setting partition information for non-partitioned tables in the
BE.
- Maintain current behavior for partitioned tables, including folder
suffix handling.
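
The rule in the solution above can be illustrated with a short sketch. The actual fix is in the BE; the Java below only models the decision, and `WritePathSketch` and `targetDir` are hypothetical names. The `key=value` directory suffix is the usual Hive partition layout and is assumed here; the exact suffix the BE builds is not shown in this PR text.

```java
import java.util.List;

// Hypothetical sketch of the path decision; not the BE implementation.
class WritePathSketch {
    // Returns the directory a write should target.
    // Partitioned tables get a "key=value" folder per partition column;
    // non-partitioned tables write straight to the table location.
    static String targetDir(String tableLocation, List<String> partitionKeys,
                            List<String> partitionValues) {
        if (partitionKeys.isEmpty()) {
            // No partition columns: skip the partition suffix entirely.
            return tableLocation;
        }
        if (partitionKeys.size() != partitionValues.size()) {
            throw new IllegalArgumentException("partition keys and values differ in length");
        }
        StringBuilder path = new StringBuilder(tableLocation);
        for (int i = 0; i < partitionKeys.size(); i++) {
            path.append('/').append(partitionKeys.get(i))
                .append('=').append(partitionValues.get(i));
        }
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(targetDir("s3://bucket/db/t", List.of(), List.of()));
        System.out.println(targetDir("s3://bucket/db/t", List.of("dt"), List.of("2025-12-03")));
    }
}
```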

### Result
- Inserts into non-partitioned object storage tables succeed.
- Partitioned tables continue to work as expected.

This issue was introduced in #58166
morningman added a commit that referenced this pull request Dec 6, 2025
…#58748)

### What problem does this PR solve?

Followup #58166
In #58166, the edit log needs to record "modified partitions" and "new
partitions" separately,
so that non-master FEs can correctly update the partition cache.
Otherwise, some new partitions cannot be queried on non-master FEs after
an insert.
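
Why the two lists need to be kept apart can be sketched as below. This is an assumption-laden model, not the Doris edit log format: `PartitionCacheReplaySketch`, `PartitionRefreshLog`, and the two-part cache (a partition name list plus per-partition metadata) are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical follower-side cache; names and structure are illustrative only.
class PartitionCacheReplaySketch {
    // Log entry written by the master FE after an INSERT commits.
    static final class PartitionRefreshLog {
        final List<String> modifiedPartitions;
        final List<String> newPartitions;

        PartitionRefreshLog(List<String> modifiedPartitions, List<String> newPartitions) {
            this.modifiedPartitions = modifiedPartitions;
            this.newPartitions = newPartitions;
        }
    }

    private final Set<String> partitionNames = new HashSet<>();
    private final Map<String, String> metadataByPartition = new HashMap<>();

    // Simulates a partition that was already cached before the INSERT.
    void put(String name, String metadata) {
        partitionNames.add(name);
        metadataByPartition.put(name, metadata);
    }

    // Replay on a non-master FE.
    void replay(PartitionRefreshLog log) {
        // New partitions must be added to the name list, or queries never see them.
        partitionNames.addAll(log.newPartitions);
        // Modified partitions are already listed; drop stale metadata so it reloads.
        for (String name : log.modifiedPartitions) {
            metadataByPartition.remove(name);
        }
    }

    boolean isListed(String name) {
        return partitionNames.contains(name);
    }

    boolean hasCachedMetadata(String name) {
        return metadataByPartition.containsKey(name);
    }
}
```

If the log carried only one undifferentiated list and the replay treated every entry as "modified", the new partition would never be added to the name list in this model, which matches the symptom described above.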
morningman pushed a commit to morningman/doris that referenced this pull request Dec 10, 2025
…che#58166)

morningman pushed a commit to morningman/doris that referenced this pull request Dec 10, 2025
…les (apache#58606)

morningman added a commit to morningman/doris that referenced this pull request Dec 10, 2025
…apache#58748)

nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…che#58166)

nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…les (apache#58606)

nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…apache#58748)

morrySnow pushed a commit that referenced this pull request Dec 15, 2025
…g cache #58166 #58606 #58748 (#58886)

picked from #58166 #58606 #58748

---------

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
Co-authored-by: Calvin Kirs <guoqiang@selectdb.com>
zy-kkk added a commit to zy-kkk/doris that referenced this pull request Dec 25, 2025
morrySnow pushed a commit that referenced this pull request Dec 25, 2025

Labels

approved Indicates a PR has been approved by one committer. dev/3.1.4-merged dev/4.0.2-merged reviewed
