Conversation

@csun5285
Contributor

@csun5285 csun5285 commented May 15, 2025

What problem does this PR solve?

  1. When an index is created on a variant column and the inserted variant is null, no index is generated, but an empty index file is still required.
  2. For StreamSinkFileWriter, an explicit close is required to trigger file generation downstream.
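
The two points above can be illustrated with a minimal sketch. It is not the Doris implementation: `SketchFileWriter`, `FakeStorage`, and `write_index_file` are hypothetical names invented for illustration, and the in-memory map only stands in for whatever receives the file downstream. The sketch assumes a writer whose file becomes visible only when `close()` is called, so closing unconditionally is what yields an empty file for null input.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for downstream storage: path -> file contents.
using FakeStorage = std::map<std::string, std::string>;

// Hypothetical writer: data is buffered and the file only appears
// downstream once close() is called, even if nothing was appended.
class SketchFileWriter {
public:
    SketchFileWriter(FakeStorage& storage, std::string path)
            : storage_(storage), path_(std::move(path)) {}

    void append(const std::string& data) { buffer_ += data; }

    void close() {
        if (closed_) return;        // closing twice is a no-op
        storage_[path_] = buffer_;  // publish the (possibly empty) file
        closed_ = true;
    }

private:
    FakeStorage& storage_;
    std::string path_;
    std::string buffer_;
    bool closed_ = false;
};

// Writes one index file for a column; null values contribute no data.
// The writer is closed unconditionally, so a column holding only nulls
// still produces an empty file.
void write_index_file(FakeStorage& storage, const std::string& path,
                      const std::vector<std::optional<std::string>>& values) {
    SketchFileWriter writer(storage, path);
    for (const auto& v : values) {
        if (v.has_value()) writer.append(*v);
    }
    writer.close();
}
```

If `close()` were skipped whenever nothing had been appended, the map would have no entry for that path at all, which corresponds to the missing-file case described in point 1.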

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented May 15, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@csun5285
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34236 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 067074d52b668480c57d9a05b640690b026e6eb8, data reload: false

------ Round 1 ----------------------------------
q1	25713	5053	5045	5045
q2	2064	271	180	180
q3	10575	1297	724	724
q4	10395	1004	560	560
q5	7577	2395	2810	2395
q6	190	167	136	136
q7	977	778	623	623
q8	9459	1302	1156	1156
q9	6888	5060	5176	5060
q10	6877	2349	1917	1917
q11	485	286	264	264
q12	366	379	222	222
q13	19361	3743	3174	3174
q14	232	227	219	219
q15	544	478	498	478
q16	428	424	384	384
q17	586	876	355	355
q18	7453	7300	7067	7067
q19	1369	957	594	594
q20	347	329	223	223
q21	3810	3201	2469	2469
q22	1035	1050	991	991
Total cold run time: 116731 ms
Total hot run time: 34236 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5249	5178	5192	5178
q2	253	337	237	237
q3	2334	2834	2444	2444
q4	1433	1898	1515	1515
q5	4642	4635	4580	4580
q6	229	180	134	134
q7	2083	1984	1748	1748
q8	2629	2620	2581	2581
q9	7250	7153	6903	6903
q10	2993	3170	2788	2788
q11	570	525	510	510
q12	701	761	608	608
q13	3563	3939	3230	3230
q14	311	304	276	276
q15	534	479	494	479
q16	448	463	444	444
q17	1153	1571	1369	1369
q18	7597	7510	7364	7364
q19	855	793	877	793
q20	1965	2087	1881	1881
q21	4736	4545	4354	4354
q22	1124	1078	1023	1023
Total cold run time: 52652 ms
Total hot run time: 50439 ms

@doris-robot

TPC-DS: Total hot run time: 193467 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 067074d52b668480c57d9a05b640690b026e6eb8, data reload: false

query1	1428	1101	1059	1059
query2	6312	1860	1864	1860
query3	10967	4415	4451	4415
query4	54624	25099	23546	23546
query5	5133	481	476	476
query6	359	204	211	204
query7	4936	509	281	281
query8	345	244	238	238
query9	5737	2633	2648	2633
query10	439	329	274	274
query11	15099	14939	14842	14842
query12	162	114	108	108
query13	1055	528	414	414
query14	10186	6341	6321	6321
query15	202	199	178	178
query16	7070	672	499	499
query17	1095	742	593	593
query18	1592	414	361	361
query19	188	195	168	168
query20	126	116	127	116
query21	205	124	104	104
query22	4396	4347	4377	4347
query23	34001	33530	33468	33468
query24	6917	2452	2470	2452
query25	461	462	404	404
query26	751	276	160	160
query27	2422	526	345	345
query28	3013	2139	2151	2139
query29	597	571	447	447
query30	281	225	191	191
query31	856	854	772	772
query32	69	66	62	62
query33	452	364	307	307
query34	790	860	545	545
query35	797	820	759	759
query36	963	1015	915	915
query37	115	102	113	102
query38	4273	4204	4143	4143
query39	1499	1466	1482	1466
query40	215	133	110	110
query41	59	58	56	56
query42	122	111	114	111
query43	503	521	490	490
query44	1369	866	839	839
query45	178	172	175	172
query46	856	1045	646	646
query47	1844	1886	1752	1752
query48	391	462	330	330
query49	708	530	455	455
query50	656	698	414	414
query51	4240	4300	4138	4138
query52	114	111	99	99
query53	239	265	189	189
query54	595	592	531	531
query55	90	88	83	83
query56	317	309	317	309
query57	1159	1201	1109	1109
query58	262	271	266	266
query59	2770	2790	2732	2732
query60	331	347	314	314
query61	124	125	125	125
query62	724	776	693	693
query63	226	199	187	187
query64	2199	1074	715	715
query65	4359	4242	4263	4242
query66	760	402	310	310
query67	16072	15590	15193	15193
query68	7335	903	514	514
query69	558	305	260	260
query70	1188	1125	1125	1125
query71	503	309	295	295
query72	5935	4741	4677	4677
query73	1316	632	349	349
query74	8987	9177	9123	9123
query75	3787	3225	2738	2738
query76	4301	1194	769	769
query77	622	360	300	300
query78	10008	10105	9349	9349
query79	2254	817	579	579
query80	651	501	439	439
query81	508	314	227	227
query82	443	124	98	98
query83	362	245	236	236
query84	285	102	87	87
query85	780	351	312	312
query86	395	298	300	298
query87	4515	4449	4442	4442
query88	3342	2254	2253	2253
query89	404	316	286	286
query90	1923	206	215	206
query91	148	143	110	110
query92	77	59	53	53
query93	1791	1037	572	572
query94	678	409	308	308
query95	371	298	293	293
query96	499	573	284	284
query97	3189	3226	3093	3093
query98	235	216	209	209
query99	1430	1392	1259	1259
Total cold run time: 300215 ms
Total hot run time: 193467 ms

@doris-robot

ClickBench: Total hot run time: 28.89 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 067074d52b668480c57d9a05b640690b026e6eb8, data reload: false

query1	0.03	0.04	0.02
query2	0.13	0.10	0.11
query3	0.24	0.19	0.20
query4	1.59	0.19	0.20
query5	0.58	0.55	0.55
query6	1.19	0.71	0.73
query7	0.02	0.02	0.01
query8	0.04	0.04	0.03
query9	0.57	0.53	0.51
query10	0.56	0.56	0.56
query11	0.15	0.11	0.10
query12	0.14	0.12	0.12
query13	0.62	0.60	0.59
query14	0.78	0.81	0.80
query15	0.87	0.83	0.84
query16	0.39	0.37	0.38
query17	1.06	1.04	1.08
query18	0.22	0.21	0.20
query19	1.94	1.76	1.79
query20	0.01	0.01	0.02
query21	15.40	0.89	0.54
query22	0.76	1.09	0.62
query23	15.08	1.39	0.61
query24	6.78	2.04	0.64
query25	0.48	0.19	0.07
query26	0.62	0.16	0.13
query27	0.05	0.05	0.04
query28	9.26	0.80	0.44
query29	12.51	4.02	3.35
query30	0.25	0.10	0.07
query31	2.83	0.61	0.39
query32	3.23	0.54	0.47
query33	3.05	3.07	3.10
query34	15.81	5.08	4.51
query35	4.51	4.55	4.47
query36	0.70	0.49	0.48
query37	0.08	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.03
query40	0.16	0.14	0.13
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.02
Total cold run time: 102.9 s
Total hot run time: 28.89 s

@csun5285
Contributor Author

run buildall

@csun5285 csun5285 force-pushed the fix_variant_idx_empty branch from 591a069 to b9b873f Compare May 15, 2025 07:58
@csun5285
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33853 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b9b873fdcb4b1a50a422d2019a48a1c95f13c2b3, data reload: false

------ Round 1 ----------------------------------
q1	25892	5066	5014	5014
q2	2069	283	189	189
q3	10399	1241	665	665
q4	10217	991	523	523
q5	7534	2398	2346	2346
q6	178	170	137	137
q7	1005	735	609	609
q8	9315	1298	1084	1084
q9	6766	5137	5080	5080
q10	6845	2300	1925	1925
q11	484	296	263	263
q12	348	373	210	210
q13	18779	3677	3071	3071
q14	230	224	212	212
q15	536	496	482	482
q16	432	430	372	372
q17	577	862	365	365
q18	7420	7180	7341	7180
q19	1225	981	561	561
q20	344	330	220	220
q21	3874	3263	2373	2373
q22	1070	1030	972	972
Total cold run time: 115539 ms
Total hot run time: 33853 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5204	5177	5196	5177
q2	254	328	227	227
q3	2217	2679	2257	2257
q4	1367	1810	1350	1350
q5	4469	4458	4399	4399
q6	225	195	133	133
q7	2008	2087	1854	1854
q8	2684	2708	2648	2648
q9	7632	7384	7191	7191
q10	3118	3313	2824	2824
q11	569	534	508	508
q12	705	815	654	654
q13	3558	3918	3283	3283
q14	277	296	270	270
q15	549	503	489	489
q16	453	490	472	472
q17	1196	1590	1385	1385
q18	7653	7507	7445	7445
q19	827	811	870	811
q20	1996	2054	1861	1861
q21	4820	4373	4409	4373
q22	1095	1049	1028	1028
Total cold run time: 52876 ms
Total hot run time: 50639 ms

@doris-robot

TPC-DS: Total hot run time: 192447 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b9b873fdcb4b1a50a422d2019a48a1c95f13c2b3, data reload: false

query1	1403	1081	1075	1075
query2	6455	1824	1829	1824
query3	10998	4591	4511	4511
query4	55675	25344	23244	23244
query5	5102	535	461	461
query6	368	200	210	200
query7	5010	503	285	285
query8	323	266	235	235
query9	6136	2629	2630	2629
query10	417	326	271	271
query11	14988	14996	14783	14783
query12	162	110	102	102
query13	1081	528	413	413
query14	10080	6335	6199	6199
query15	199	210	187	187
query16	7037	649	479	479
query17	1060	718	579	579
query18	1518	404	312	312
query19	225	188	170	170
query20	129	131	122	122
query21	200	129	110	110
query22	4496	4489	4337	4337
query23	34271	33586	33626	33586
query24	6693	2425	2424	2424
query25	467	499	452	452
query26	680	275	154	154
query27	2165	522	341	341
query28	3015	2182	2159	2159
query29	574	551	430	430
query30	276	213	201	201
query31	860	855	806	806
query32	72	61	64	61
query33	465	357	317	317
query34	767	875	525	525
query35	806	850	740	740
query36	928	995	913	913
query37	119	103	79	79
query38	4258	4374	4179	4179
query39	1524	1436	1472	1436
query40	212	123	107	107
query41	60	54	53	53
query42	131	111	116	111
query43	500	519	476	476
query44	1372	864	842	842
query45	190	172	188	172
query46	895	1034	638	638
query47	1845	1899	1806	1806
query48	415	437	321	321
query49	688	525	435	435
query50	645	698	401	401
query51	4286	4236	4199	4199
query52	120	105	92	92
query53	221	253	189	189
query54	576	571	496	496
query55	84	83	79	79
query56	294	300	309	300
query57	1196	1210	1127	1127
query58	279	268	266	266
query59	2762	2868	2694	2694
query60	345	338	317	317
query61	149	146	150	146
query62	717	741	659	659
query63	227	193	195	193
query64	1545	1116	790	790
query65	4410	4244	4302	4244
query66	704	404	313	313
query67	16102	15716	15372	15372
query68	7295	878	519	519
query69	544	312	267	267
query70	1243	1109	1106	1106
query71	517	326	293	293
query72	5465	4746	4937	4746
query73	1327	667	356	356
query74	9376	9126	8663	8663
query75	3912	3182	2671	2671
query76	4318	1179	753	753
query77	617	357	295	295
query78	9965	10230	9369	9369
query79	1852	823	573	573
query80	620	527	443	443
query81	475	249	227	227
query82	457	126	95	95
query83	331	253	229	229
query84	293	103	89	89
query85	814	339	306	306
query86	339	297	296	296
query87	4401	4472	4381	4381
query88	2908	2284	2288	2284
query89	398	312	288	288
query90	1984	209	219	209
query91	153	144	128	128
query92	71	107	55	55
query93	1120	945	601	601
query94	689	401	308	308
query95	375	293	306	293
query96	494	576	283	283
query97	2725	2748	2664	2664
query98	220	219	202	202
query99	1403	1377	1286	1286
Total cold run time: 298673 ms
Total hot run time: 192447 ms

@doris-robot

ClickBench: Total hot run time: 28.92 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit b9b873fdcb4b1a50a422d2019a48a1c95f13c2b3, data reload: false

query1	0.04	0.03	0.03
query2	0.13	0.10	0.11
query3	0.25	0.19	0.20
query4	1.58	0.20	0.19
query5	0.45	0.45	0.44
query6	1.15	0.65	0.65
query7	0.02	0.01	0.02
query8	0.04	0.03	0.04
query9	0.59	0.51	0.52
query10	0.56	0.57	0.56
query11	0.15	0.11	0.11
query12	0.14	0.11	0.11
query13	0.60	0.60	0.59
query14	0.78	0.80	0.81
query15	0.86	0.84	0.86
query16	0.39	0.38	0.38
query17	1.03	1.03	1.04
query18	0.22	0.20	0.21
query19	1.89	1.78	1.83
query20	0.01	0.01	0.02
query21	15.41	0.92	0.53
query22	0.77	1.16	0.71
query23	14.88	1.39	0.59
query24	7.24	1.30	0.66
query25	0.49	0.14	0.15
query26	0.65	0.16	0.14
query27	0.05	0.05	0.04
query28	9.05	0.85	0.43
query29	12.60	3.96	3.35
query30	0.25	0.09	0.06
query31	2.83	0.58	0.39
query32	3.22	0.54	0.46
query33	3.16	3.10	3.05
query34	15.84	5.08	4.52
query35	4.53	4.49	4.49
query36	0.66	0.49	0.48
query37	0.08	0.07	0.06
query38	0.05	0.04	0.03
query39	0.03	0.02	0.02
query40	0.19	0.14	0.14
query41	0.08	0.02	0.03
query42	0.04	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.02 s
Total hot run time: 28.92 s

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.37% (20847/26264)
Line Coverage 72.61% (214699/295692)
Region Coverage 70.84% (126398/178437)
Branch Coverage 64.50% (65418/101418)

Member

@airborne12 airborne12 left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 19, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Member

@eldenmoon eldenmoon left a comment

LGTM

@eldenmoon eldenmoon merged commit 4757c6e into apache:master May 20, 2025
25 of 27 checks passed
github-actions bot pushed a commit that referenced this pull request May 20, 2025
…variant-type column (#50937)

create empty idx file when creating a index on variant-type column
@eldenmoon eldenmoon added dev/2.1.x usercase Important user case type label labels May 20, 2025
csun5285 added a commit to csun5285/doris that referenced this pull request May 23, 2025
…variant-type column (apache#50937)

create empty idx file when creating a index on variant-type column
dataroaring pushed a commit that referenced this pull request May 27, 2025
airborne12 added a commit that referenced this pull request Jun 3, 2025
Related PR: #50937 

Problem Summary:
This PR fixes the error that occurs when attempting to open an empty
inverted index file by adding a proper empty file check and error
reporting.

Added a condition to verify if the file size is zero.
Sets an error message and returns false to prevent CLucene errors.
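
A minimal sketch of the kind of check this commit message describes, under stated assumptions: `can_open_index_file` and its parameters are hypothetical and are not the function touched by the referenced commit. It only shows the described pattern of testing for a zero file size, setting an error message, and returning false before any index reader is constructed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical pre-open check: reject a zero-length index file up front
// instead of handing it to the index reader.
bool can_open_index_file(std::uint64_t file_size, const std::string& path,
                         std::string* error) {
    if (file_size == 0) {
        if (error != nullptr) {
            *error = "inverted index file is empty: " + path;
        }
        return false;  // caller skips opening the file
    }
    return true;
}
```

A caller would test the return value first and surface the message in `error` when it is false.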
github-actions bot pushed a commit that referenced this pull request Jun 3, 2025
Related PR: #50937 

Problem Summary:
This PR fixes the error that occurs when attempting to open an empty
inverted index file by adding a proper empty file check and error
reporting.

Added a condition to verify if the file size is zero.
Sets an error message and returns false to prevent CLucene errors.
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…variant-type column (apache#50937)

create empty idx file when creating a index on variant-type column
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
)

Related PR: apache#50937 

Problem Summary:
This PR fixes the error that occurs when attempting to open an empty
inverted index file by adding a proper empty file check and error
reporting.

Added a condition to verify if the file size is zero.
Sets an error message and returns false to prevent CLucene errors.
Labels

approved Indicates a PR has been approved by one committer. dev/3.0.6-merged reviewed usercase Important user case type label

7 participants