@CalvinKirs
Member

@CalvinKirs CalvinKirs commented Dec 2, 2025

Problem

Reproduction steps: create a Hive catalog, create a non-partitioned table, then insert data. The insert fails with the following error.

copy file failed: software.amazon.awssdk.services.s3.model.NoSuchKeyException: The specified key does not exist. (Service: S3, Status Code: 404,

The BE (Backend) mistakenly treats non-partitioned tables as partitioned ones. For partitioned tables, the BE always appends a folder suffix for each partition, so that data is organized into partition directories. Non-partitioned tables need no partition information, yet the BE still added a partition folder suffix for them, which caused the insert operation to fail.
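
To make the failure mode concrete, here is a minimal Python sketch of how an unwanted folder suffix can turn a valid object key into a missing one. This is not Doris code: `object_key`, the dictionary used as a toy object store, the `warehouse/t1` prefix, and the `p0` suffix are all hypothetical placeholders. The PR does not say which component computes which key, so the sketch only shows that the two keys stop matching.

```python
def object_key(table_prefix: str, file_name: str, partition_suffix: str = "") -> str:
    """Join a table prefix, an optional partition folder suffix, and a file name."""
    parts = [table_prefix.strip("/")]
    if partition_suffix:
        parts.append(partition_suffix.strip("/"))
    parts.append(file_name)
    return "/".join(parts)


# Toy object store: keys map to bytes. A missing key here stands in for
# the S3 404 (NoSuchKey) shown in the error above.
store = {}

# The file is stored under the plain table prefix (non-partitioned layout).
written_key = object_key("warehouse/t1", "part-0.parquet")
store[written_key] = b"rows"

# Another step computes the key with a partition folder suffix that should
# not exist for this table ("p0" is a made-up placeholder).
looked_up_key = object_key("warehouse/t1", "part-0.parquet", partition_suffix="p0")

print(written_key)             # warehouse/t1/part-0.parquet
print(looked_up_key)           # warehouse/t1/p0/part-0.parquet
print(looked_up_key in store)  # False
```

Once one side adds a suffix and the other does not, every lookup misses, which is consistent with a "specified key does not exist" error on copy.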

Solution

  • Skip setting partition information for non-partitioned tables in the BE.
  • Maintain current behavior for partitioned tables, including folder suffix handling.
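
The two bullets above amount to a guard on whether the table has partition columns. A rough sketch of that idea in Python follows, assuming a Hive-style `column=value` folder layout. `WriteRequest` and `partition_folder` are hypothetical names for illustration and do not correspond to the actual Doris change.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class WriteRequest:
    """Hypothetical description of one insert target."""
    table_location: str
    partition_columns: List[str] = field(default_factory=list)
    partition_values: Dict[str, str] = field(default_factory=dict)


def partition_folder(req: WriteRequest) -> Optional[str]:
    """Return the partition folder for a request, or None if the table is non-partitioned."""
    if not req.partition_columns:
        # Non-partitioned table: set no partition information at all.
        return None
    # Partitioned table: unchanged behavior, one folder level per partition column.
    return "/".join(f"{col}={req.partition_values[col]}" for col in req.partition_columns)


plain = WriteRequest("s3://bucket/t1")
by_day = WriteRequest("s3://bucket/t2", ["dt", "region"], {"dt": "2025-12-02", "region": "us"})

print(partition_folder(plain))   # None
print(partition_folder(by_day))  # dt=2025-12-02/region=us
```

Returning `None` rather than an empty string keeps "no partition" distinct from "partition with an empty name", so callers cannot append a stray separator by accident.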

Result

  • Inserts into non-partitioned object storage tables succeed.
  • Partitioned tables continue to work as expected.

This issue was introduced in #58166.

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@CalvinKirs
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 35012 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 325644a2b8a4c8911e57d9ecdf0fec12740aa402, data reload: false

------ Round 1 ----------------------------------
q1	17657	5148	4968	4968
q2	2031	319	211	211
q3	10223	1318	742	742
q4	10222	893	317	317
q5	7523	2404	2159	2159
q6	182	171	135	135
q7	947	783	635	635
q8	9342	1444	1099	1099
q9	7023	5374	5359	5359
q10	6777	2185	1786	1786
q11	508	311	296	296
q12	345	385	232	232
q13	17776	3681	3037	3037
q14	233	231	221	221
q15	599	519	509	509
q16	893	899	815	815
q17	689	764	562	562
q18	7429	7380	7888	7380
q19	1102	997	630	630
q20	374	380	233	233
q21	4255	3901	2664	2664
q22	1098	1123	1022	1022
Total cold run time: 107228 ms
Total hot run time: 35012 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5170	5295	5186	5186
q2	339	401	312	312
q3	2344	2971	2509	2509
q4	1382	1802	1464	1464
q5	4551	4509	4527	4509
q6	214	170	122	122
q7	2083	1950	1761	1761
q8	2724	2536	2549	2536
q9	7581	7563	7583	7563
q10	3062	3253	2783	2783
q11	591	517	509	509
q12	719	786	566	566
q13	3591	3635	3061	3061
q14	272	279	256	256
q15	536	507	485	485
q16	832	882	839	839
q17	1103	1381	1345	1345
q18	7355	7089	7097	7089
q19	903	816	836	816
q20	1894	1972	1803	1803
q21	4691	4328	4095	4095
q22	1121	1066	991	991
Total cold run time: 53058 ms
Total hot run time: 50600 ms
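
The bot output does not label its columns. Checking the numbers suggests that each row is the query name, one cold run, two hot runs, and the faster of the two hot runs; the reported totals match the sum of the cold column and the sum of the last column. The following Python sketch recomputes totals under that assumed layout. `summarize` is a hypothetical helper, not part of the Doris tooling, and the sample uses three rows copied from Round 1.

```python
from typing import Tuple


def summarize(rows_text: str) -> Tuple[int, int]:
    """Return (cold_total_ms, hot_total_ms) for whitespace-separated result rows.

    Assumed row layout: name, cold run, hot run 1, hot run 2, best hot run.
    """
    cold_total = 0
    hot_total = 0
    for line in rows_text.strip().splitlines():
        name, cold, hot1, hot2, best = line.split()
        cold, hot1, hot2, best = int(cold), int(hot1), int(hot2), int(best)
        if best != min(hot1, hot2):
            raise ValueError(f"{name}: last column is not the faster hot run")
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


sample = """q6 182 171 135 135
q14 233 231 221 221
q18 7429 7380 7888 7380"""

print(summarize(sample))  # (7844, 7736)
```

Running the same function over all 22 rows of a round should reproduce the "Total cold run time" and "Total hot run time" lines printed by the bot.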

@doris-robot

TPC-DS: Total hot run time: 182094 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 325644a2b8a4c8911e57d9ecdf0fec12740aa402, data reload: false

query1	1069	407	395	395
query2	6600	1196	1181	1181
query3	6746	225	227	225
query4	25840	23388	23074	23074
query5	5809	668	492	492
query6	356	246	227	227
query7	4656	535	328	328
query8	310	247	262	247
query9	8740	2646	2644	2644
query10	598	378	312	312
query11	15411	14793	14967	14793
query12	184	124	113	113
query13	1682	568	426	426
query14	10598	5849	5882	5849
query15	210	203	183	183
query16	7643	704	516	516
query17	1426	772	611	611
query18	2069	421	329	329
query19	215	199	174	174
query20	131	125	120	120
query21	216	138	115	115
query22	4119	3905	3868	3868
query23	33029	31959	32047	31959
query24	8221	2446	2451	2446
query25	604	519	459	459
query26	1250	275	171	171
query27	2736	493	364	364
query28	4343	2178	2209	2178
query29	800	620	498	498
query30	317	240	205	205
query31	811	707	639	639
query32	82	74	73	73
query33	590	393	318	318
query34	827	891	553	553
query35	793	818	738	738
query36	895	937	807	807
query37	125	118	92	92
query38	3830	3944	3798	3798
query39	1479	1414	1408	1408
query40	233	134	121	121
query41	68	63	61	61
query42	131	130	114	114
query43	441	462	421	421
query44	1314	757	754	754
query45	196	192	180	180
query46	906	1022	662	662
query47	1706	1721	1682	1682
query48	402	423	337	337
query49	791	492	406	406
query50	686	723	417	417
query51	4011	3870	3860	3860
query52	119	115	109	109
query53	248	272	200	200
query54	329	324	295	295
query55	99	97	91	91
query56	376	339	359	339
query57	1155	1168	1097	1097
query58	299	291	294	291
query59	2278	2447	2264	2264
query60	371	374	362	362
query61	198	197	190	190
query62	783	712	668	668
query63	240	203	209	203
query64	4634	1342	1050	1050
query65	4037	3990	3951	3951
query66	1073	457	357	357
query67	15372	15090	14793	14793
query68	8530	997	638	638
query69	527	340	319	319
query70	1116	1047	998	998
query71	437	352	327	327
query72	6037	4863	4829	4829
query73	684	576	349	349
query74	8761	8835	8574	8574
query75	3050	3082	2459	2459
query76	3164	1149	753	753
query77	522	415	316	316
query78	9514	9699	8826	8826
query79	2045	890	630	630
query80	658	608	511	511
query81	511	269	245	245
query82	198	141	109	109
query83	276	270	244	244
query84	264	126	100	100
query85	915	505	455	455
query86	349	294	276	276
query87	4092	4103	4053	4053
query88	3893	2310	2315	2310
query89	399	330	296	296
query90	2119	224	228	224
query91	174	176	153	153
query92	92	76	66	66
query93	2029	1011	656	656
query94	754	458	360	360
query95	502	430	407	407
query96	538	559	290	290
query97	2615	2716	2604	2604
query98	265	216	218	216
query99	1306	1404	1298	1298
Total cold run time: 274192 ms
Total hot run time: 182094 ms

@doris-robot

ClickBench: Total hot run time: 28.07 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 325644a2b8a4c8911e57d9ecdf0fec12740aa402, data reload: false

query1	0.06	0.04	0.05
query2	0.10	0.05	0.05
query3	0.25	0.09	0.10
query4	1.61	0.11	0.12
query5	0.28	0.25	0.26
query6	1.19	0.67	0.64
query7	0.03	0.02	0.03
query8	0.05	0.04	0.05
query9	0.57	0.50	0.52
query10	0.56	0.57	0.56
query11	0.15	0.10	0.11
query12	0.15	0.11	0.11
query13	0.62	0.59	0.62
query14	1.00	0.99	1.00
query15	0.81	0.80	0.80
query16	0.40	0.39	0.38
query17	1.03	1.05	1.06
query18	0.24	0.21	0.21
query19	1.93	1.86	1.83
query20	0.01	0.01	0.01
query21	15.45	0.29	0.14
query22	4.79	0.05	0.04
query23	16.06	0.28	0.11
query24	0.97	1.03	0.93
query25	0.10	0.05	0.05
query26	0.14	0.13	0.12
query27	0.08	0.05	0.06
query28	4.91	1.23	1.02
query29	12.54	3.98	3.19
query30	0.28	0.14	0.14
query31	2.82	0.63	0.39
query32	3.23	0.56	0.47
query33	2.99	3.13	3.15
query34	16.84	5.20	4.52
query35	4.60	4.58	4.59
query36	0.68	0.49	0.48
query37	0.10	0.07	0.07
query38	0.07	0.04	0.04
query39	0.04	0.03	0.02
query40	0.18	0.14	0.13
query41	0.09	0.03	0.02
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 98.08 s
Total hot run time: 28.07 s

Contributor

@morningman morningman left a comment

LGTM

@github-actions
Contributor

github-actions bot commented Dec 2, 2025

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels Dec 2, 2025
@github-actions
Contributor

github-actions bot commented Dec 2, 2025

PR approved by anyone and no changes requested.

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (1/1) 🎉
Increment coverage report
Complete coverage report

@morningman morningman merged commit 9329b39 into apache:master Dec 3, 2025
37 of 38 checks passed
morningman pushed a commit to morningman/doris that referenced this pull request Dec 10, 2025
…les (apache#58606)

nagisa-kunhah pushed a commit to nagisa-kunhah/doris that referenced this pull request Dec 14, 2025
…les (apache#58606)

morrySnow pushed a commit that referenced this pull request Dec 15, 2025
…g cache #58166 #58606 #58748 (#58886)

picked from #58166 #58606 #58748

---------

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
Co-authored-by: Calvin Kirs <guoqiang@selectdb.com>
zy-kkk added a commit to zy-kkk/doris that referenced this pull request Dec 25, 2025
morrySnow pushed a commit that referenced this pull request Dec 25, 2025
Labels

approved Indicates a PR has been approved by one committer. dev/3.1.4-merged dev/4.0.2-merged reviewed
