[Refactor] (inverted index) Refactor Inverted index file writer #41625

Merged: 4 commits, Oct 30, 2024

@csun5285 csun5285 commented Oct 9, 2024

Proposed changes

  1. After a normal segment is flushed, `close_inverted_index` is called directly to write the final composite index file.
  2. During compaction, the segment writer first writes the BKD index while writing normal data. Index compaction then writes the string index. Finally, `close_inverted_index` is called uniformly for all indexes to write the final files.
  3. The rowset writer stores all inverted index file writers in an `InvertedIndexFileCollection`, so they stay alive for the entire write or compaction process.
  4. When the rowset writer generates the final rowset through `build(rowset)`, it retrieves the index file sizes from the `InvertedIndexFileCollection` and records them in the rowset meta.
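
Points 3 and 4 can be pictured with a small, self-contained sketch. It is not the Doris implementation: `ToyIndexFileWriter`, `ToyIndexFileCollection`, and every method name below are hypothetical stand-ins, and the "file size" is just a byte counter. The sketch only shows the ownership idea, where one collection keeps all writers alive until they are closed and their sizes can be summed for the rowset meta.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-in for an inverted index file writer. It counts the
// bytes appended for one segment and exposes the size once it is closed.
class ToyIndexFileWriter {
public:
    void append(int64_t bytes) {
        assert(!_closed && "cannot append after close");
        _bytes += bytes;
    }

    // Plays the role of close_inverted_index: finalize and report the size.
    int64_t close() {
        _closed = true;
        return _bytes;
    }

    bool closed() const { return _closed; }

    // Size is only meaningful after the final file has been written.
    int64_t file_size() const { return _closed ? _bytes : 0; }

private:
    int64_t _bytes = 0;
    bool _closed = false;
};

// Hypothetical collection owned by the rowset writer. It owns every index
// file writer, keyed by segment id, for the whole write or compaction.
class ToyIndexFileCollection {
public:
    // Returns the writer for seg_id, creating it on first use.
    ToyIndexFileWriter* add(int seg_id) {
        auto& slot = _writers[seg_id];
        if (!slot) slot = std::make_unique<ToyIndexFileWriter>();
        return slot.get();
    }

    // Closes every writer that is still open.
    void close_all() {
        for (auto& entry : _writers) {
            if (!entry.second->closed()) entry.second->close();
        }
    }

    // Sum of final index file sizes, as would be recorded in the rowset meta.
    int64_t total_index_size() const {
        int64_t total = 0;
        for (const auto& entry : _writers) total += entry.second->file_size();
        return total;
    }

private:
    std::map<int, std::unique_ptr<ToyIndexFileWriter>> _writers;
};
```

Because the collection owns the writers through `std::unique_ptr`, a later stage such as index compaction can still reach a writer created earlier by the segment writer, which is the lifecycle property the description asks for.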

@csun5285 csun5285 force-pushed the inverted_index_file_writer branch from 30b25d7 to c4a3510 Compare October 9, 2024 11:51

csun5285 commented Oct 9, 2024

run buildall


github-actions bot commented Oct 9, 2024

clang-tidy review says "All clean, LGTM! 👍"


csun5285 commented Oct 9, 2024

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.29% (9637/25845)
Line Coverage: 28.66% (79852/278571)
Region Coverage: 28.09% (41306/147053)
Branch Coverage: 24.71% (21031/85116)
Coverage Report: http://coverage.selectdb-in.cc/coverage/531902a1d9a370a80264a6e1e4b6b907165d10b5_531902a1d9a370a80264a6e1e4b6b907165d10b5/report/index.html

@doris-robot

TPC-H: Total hot run time: 41171 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 531902a1d9a370a80264a6e1e4b6b907165d10b5, data reload: false

------ Round 1 ----------------------------------
q1	17821	7469	7372	7372
q2	2937	163	150	150
q3	11252	1187	1290	1187
q4	10405	761	807	761
q5	7828	2937	2877	2877
q6	243	153	160	153
q7	1003	642	627	627
q8	9453	1912	1999	1912
q9	6610	6430	6435	6430
q10	7018	2291	2327	2291
q11	439	245	259	245
q12	412	226	224	224
q13	17776	2954	2961	2954
q14	252	211	213	211
q15	575	533	541	533
q16	654	581	577	577
q17	986	582	593	582
q18	7275	6768	6724	6724
q19	1382	1071	1004	1004
q20	492	200	198	198
q21	3983	3148	3287	3148
q22	1104	1011	1036	1011
Total cold run time: 109900 ms
Total hot run time: 41171 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7263	7266	7243	7243
q2	337	249	241	241
q3	2883	2807	2761	2761
q4	1931	1714	1733	1714
q5	5428	5514	5470	5470
q6	230	138	147	138
q7	2193	1720	1731	1720
q8	3284	3428	3473	3428
q9	8552	8570	8610	8570
q10	3487	3430	3450	3430
q11	583	487	473	473
q12	792	598	586	586
q13	6293	3006	3016	3006
q14	283	255	272	255
q15	585	514	502	502
q16	679	658	637	637
q17	1799	1579	1582	1579
q18	7897	7531	7400	7400
q19	1670	1425	1527	1425
q20	2048	1848	1814	1814
q21	5363	5177	5112	5112
q22	1111	1043	1025	1025
Total cold run time: 64691 ms
Total hot run time: 58529 ms

@doris-robot

TPC-DS: Total hot run time: 192952 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 531902a1d9a370a80264a6e1e4b6b907165d10b5, data reload: false

query1	973	397	379	379
query2	6516	2063	1976	1976
query3	6708	214	231	214
query4	34133	23937	23551	23551
query5	4442	476	486	476
query6	260	187	173	173
query7	4630	309	311	309
query8	283	232	224	224
query9	9298	2641	2635	2635
query10	465	294	278	278
query11	18190	15425	15609	15425
query12	162	105	110	105
query13	1692	476	464	464
query14	10972	7790	7773	7773
query15	342	176	185	176
query16	8034	471	470	470
query17	1736	563	542	542
query18	2060	293	300	293
query19	355	151	147	147
query20	115	104	108	104
query21	212	107	101	101
query22	4592	4226	4049	4049
query23	35012	34244	34947	34244
query24	11104	2882	2897	2882
query25	634	388	396	388
query26	1380	164	170	164
query27	2805	296	294	294
query28	7809	2431	2414	2414
query29	895	423	409	409
query30	311	156	156	156
query31	1001	813	842	813
query32	97	55	54	54
query33	775	299	289	289
query34	979	503	524	503
query35	864	751	724	724
query36	1108	956	940	940
query37	158	92	95	92
query38	4141	3896	3885	3885
query39	1492	1439	1425	1425
query40	284	98	97	97
query41	47	44	44	44
query42	121	104	98	98
query43	522	490	491	490
query44	1279	828	830	828
query45	192	167	176	167
query46	1152	725	731	725
query47	1923	1832	1832	1832
query48	437	340	343	340
query49	1089	414	398	398
query50	823	424	418	418
query51	7055	6999	6788	6788
query52	104	88	86	86
query53	269	183	183	183
query54	1158	465	470	465
query55	81	77	77	77
query56	261	268	270	268
query57	1259	1151	1107	1107
query58	238	278	312	278
query59	3069	2996	2995	2995
query60	305	289	277	277
query61	126	122	121	121
query62	857	695	670	670
query63	235	200	200	200
query64	4385	750	724	724
query65	3324	3271	3216	3216
query66	1188	339	317	317
query67	16077	15589	15542	15542
query68	3524	601	586	586
query69	460	316	312	312
query70	1203	1132	1046	1046
query71	359	273	284	273
query72	6285	4161	4045	4045
query73	784	346	352	346
query74	9816	9082	9100	9082
query75	3417	2645	2656	2645
query76	2143	924	939	924
query77	474	300	300	300
query78	10469	9632	9515	9515
query79	2545	613	624	613
query80	1191	452	462	452
query81	572	246	250	246
query82	969	142	139	139
query83	234	150	141	141
query84	248	81	82	81
query85	1298	303	283	283
query86	425	300	309	300
query87	4482	4321	4318	4318
query88	3962	2401	2395	2395
query89	407	307	287	287
query90	1909	198	188	188
query91	163	110	122	110
query92	62	51	51	51
query93	1670	537	543	537
query94	834	297	303	297
query95	358	256	257	256
query96	624	282	281	281
query97	3258	3100	3108	3100
query98	216	203	197	197
query99	1546	1315	1318	1315
Total cold run time: 300502 ms
Total hot run time: 192952 ms

@doris-robot

ClickBench: Total hot run time: 31.97 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 531902a1d9a370a80264a6e1e4b6b907165d10b5, data reload: false

query1	0.04	0.03	0.03
query2	0.07	0.02	0.03
query3	0.23	0.07	0.06
query4	1.64	0.10	0.10
query5	0.55	0.50	0.49
query6	1.13	0.73	0.72
query7	0.02	0.01	0.01
query8	0.03	0.03	0.04
query9	0.56	0.51	0.52
query10	0.55	0.54	0.56
query11	0.13	0.11	0.10
query12	0.14	0.11	0.11
query13	0.61	0.60	0.59
query14	2.74	2.73	2.70
query15	0.90	0.82	0.82
query16	0.38	0.37	0.39
query17	1.03	1.06	1.06
query18	0.23	0.22	0.22
query19	1.85	1.75	1.90
query20	0.02	0.01	0.01
query21	15.36	0.58	0.58
query22	2.47	2.53	1.49
query23	17.09	1.09	0.79
query24	2.98	1.35	0.66
query25	0.24	0.14	0.18
query26	0.30	0.14	0.13
query27	0.04	0.03	0.04
query28	10.79	1.08	1.06
query29	12.58	3.23	3.26
query30	0.24	0.06	0.06
query31	2.85	0.38	0.38
query32	3.29	0.45	0.45
query33	2.98	3.00	3.02
query34	17.22	4.44	4.48
query35	4.52	4.53	4.50
query36	0.67	0.48	0.48
query37	0.08	0.06	0.06
query38	0.05	0.04	0.04
query39	0.04	0.02	0.02
query40	0.16	0.12	0.12
query41	0.07	0.02	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 106.94 s
Total hot run time: 31.97 s


csun5285 commented Oct 9, 2024

run buildall


github-actions bot commented Oct 9, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.30% (9639/25845)
Line Coverage: 28.68% (79910/278591)
Region Coverage: 28.10% (41325/147059)
Branch Coverage: 24.72% (21045/85120)
Coverage Report: http://coverage.selectdb-in.cc/coverage/f8a2570bff021f43c7b3f4364367224a55f87c67_f8a2570bff021f43c7b3f4364367224a55f87c67/report/index.html

@doris-robot

TPC-H: Total hot run time: 40780 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f8a2570bff021f43c7b3f4364367224a55f87c67, data reload: false

------ Round 1 ----------------------------------
q1	17573	7610	7215	7215
q2	2012	298	290	290
q3	12251	1068	1210	1068
q4	10563	765	693	693
q5	7759	2883	2746	2746
q6	238	151	152	151
q7	1023	619	633	619
q8	9356	1928	1992	1928
q9	6459	6437	6427	6427
q10	7000	2312	2314	2312
q11	438	261	252	252
q12	409	229	223	223
q13	17768	3011	2983	2983
q14	239	217	214	214
q15	569	526	534	526
q16	646	585	589	585
q17	979	602	543	543
q18	7233	6599	6724	6599
q19	1355	1020	1087	1020
q20	470	201	201	201
q21	4178	3192	3187	3187
q22	1111	998	1018	998
Total cold run time: 109629 ms
Total hot run time: 40780 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7242	7267	7208	7208
q2	319	232	229	229
q3	3033	2922	2936	2922
q4	2103	1812	1811	1811
q5	5816	5756	5754	5754
q6	236	148	150	148
q7	2195	1839	1839	1839
q8	3378	3592	3467	3467
q9	8976	8928	8865	8865
q10	3592	3554	3543	3543
q11	588	493	525	493
q12	849	652	595	595
q13	10765	3166	3213	3166
q14	299	276	268	268
q15	579	517	538	517
q16	698	654	659	654
q17	1844	1617	1580	1580
q18	8245	7679	7622	7622
q19	1728	1487	1448	1448
q20	2085	1874	1866	1866
q21	5652	5494	5502	5494
q22	1131	1074	1073	1073
Total cold run time: 71353 ms
Total hot run time: 60562 ms

@doris-robot

TPC-DS: Total hot run time: 192369 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f8a2570bff021f43c7b3f4364367224a55f87c67, data reload: false

query1	913	421	416	416
query2	6248	2066	1976	1976
query3	8684	195	195	195
query4	34222	23857	23650	23650
query5	3931	482	487	482
query6	274	194	169	169
query7	4188	303	302	302
query8	273	214	216	214
query9	9411	2622	2617	2617
query10	462	280	274	274
query11	17911	15193	15164	15164
query12	152	110	95	95
query13	1591	468	436	436
query14	9788	7629	7786	7629
query15	264	176	178	176
query16	8064	490	497	490
query17	1648	625	602	602
query18	2202	320	323	320
query19	364	153	151	151
query20	120	112	115	112
query21	207	107	124	107
query22	4759	4493	4421	4421
query23	34892	34006	34336	34006
query24	11046	2897	2924	2897
query25	641	423	396	396
query26	1181	167	167	167
query27	2238	302	302	302
query28	7413	2422	2391	2391
query29	841	445	437	437
query30	251	159	153	153
query31	1028	810	803	803
query32	98	56	55	55
query33	756	294	316	294
query34	925	519	526	519
query35	865	719	729	719
query36	1112	959	981	959
query37	152	94	91	91
query38	4038	3842	3875	3842
query39	1490	1422	1461	1422
query40	205	100	99	99
query41	46	44	45	44
query42	114	96	99	96
query43	527	499	464	464
query44	1224	813	802	802
query45	202	161	163	161
query46	1152	734	735	734
query47	1922	1855	1797	1797
query48	437	352	356	352
query49	894	411	414	411
query50	837	413	424	413
query51	7033	6924	6884	6884
query52	100	88	86	86
query53	265	188	187	187
query54	1180	487	494	487
query55	81	78	78	78
query56	278	272	289	272
query57	1237	1167	1154	1154
query58	235	227	277	227
query59	3095	3117	2988	2988
query60	284	274	273	273
query61	106	101	103	101
query62	871	677	664	664
query63	226	195	190	190
query64	3980	669	620	620
query65	3337	3237	3198	3198
query66	734	307	323	307
query67	16131	15465	15660	15465
query68	4596	585	563	563
query69	514	310	332	310
query70	1148	1133	1107	1107
query71	349	280	272	272
query72	7135	4052	3976	3976
query73	782	343	350	343
query74	10209	8975	8960	8960
query75	3415	2703	2732	2703
query76	2983	987	941	941
query77	445	327	313	313
query78	10449	9622	9463	9463
query79	1824	593	598	593
query80	1164	447	473	447
query81	575	240	245	240
query82	694	135	134	134
query83	229	136	136	136
query84	252	77	85	77
query85	1378	296	292	292
query86	421	279	297	279
query87	4415	4460	4267	4267
query88	3363	2411	2369	2369
query89	406	287	284	284
query90	1978	186	187	186
query91	145	108	109	108
query92	81	49	47	47
query93	2499	562	545	545
query94	1029	297	303	297
query95	355	258	261	258
query96	636	281	280	280
query97	3242	3089	3127	3089
query98	214	196	201	196
query99	1786	1311	1289	1289
Total cold run time: 300087 ms
Total hot run time: 192369 ms

@doris-robot

ClickBench: Total hot run time: 32.09 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f8a2570bff021f43c7b3f4364367224a55f87c67, data reload: false

query1	0.04	0.03	0.03
query2	0.06	0.03	0.03
query3	0.23	0.07	0.07
query4	1.65	0.09	0.10
query5	0.53	0.52	0.51
query6	1.13	0.74	0.72
query7	0.02	0.01	0.01
query8	0.05	0.03	0.03
query9	0.58	0.51	0.50
query10	0.58	0.56	0.54
query11	0.14	0.10	0.11
query12	0.15	0.12	0.11
query13	0.61	0.59	0.60
query14	2.83	2.84	2.81
query15	0.89	0.83	0.83
query16	0.37	0.38	0.41
query17	1.05	1.02	1.03
query18	0.20	0.20	0.19
query19	1.95	1.82	2.00
query20	0.01	0.00	0.00
query21	15.35	0.57	0.57
query22	2.50	2.39	1.73
query23	17.23	0.96	0.79
query24	2.75	1.46	0.52
query25	0.21	0.19	0.05
query26	0.35	0.15	0.14
query27	0.04	0.04	0.04
query28	10.99	1.09	1.06
query29	12.54	3.24	3.19
query30	0.25	0.07	0.06
query31	2.87	0.38	0.37
query32	3.28	0.47	0.46
query33	2.97	3.06	3.05
query34	16.96	4.45	4.48
query35	4.55	4.55	4.45
query36	0.67	0.50	0.48
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.16	0.12	0.12
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 107.04 s
Total hot run time: 32.09 s


csun5285 commented Oct 9, 2024

run buildall


github-actions bot commented Oct 9, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.30% (9639/25843)
Line Coverage: 28.67% (79876/278598)
Region Coverage: 28.10% (41322/147067)
Branch Coverage: 24.72% (21041/85128)
Coverage Report: http://coverage.selectdb-in.cc/coverage/6bf48d04b3d46fb65c5c9d4b945309d4162b57be_6bf48d04b3d46fb65c5c9d4b945309d4162b57be/report/index.html

@doris-robot

TPC-H: Total hot run time: 40815 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6bf48d04b3d46fb65c5c9d4b945309d4162b57be, data reload: false

------ Round 1 ----------------------------------
q1	17569	7690	7326	7326
q2	2015	276	276	276
q3	12231	1084	1221	1084
q4	10571	747	707	707
q5	7761	2902	2893	2893
q6	240	157	157	157
q7	1130	637	606	606
q8	9351	1918	1940	1918
q9	6628	6500	6449	6449
q10	6970	2308	2320	2308
q11	438	240	254	240
q12	411	226	225	225
q13	17776	2989	3029	2989
q14	258	207	212	207
q15	577	532	531	531
q16	635	592	580	580
q17	1000	551	540	540
q18	7320	6661	6567	6567
q19	1342	964	891	891
q20	496	206	199	199
q21	4135	3178	3138	3138
q22	1113	984	1006	984
Total cold run time: 109967 ms
Total hot run time: 40815 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7366	7262	7253	7253
q2	334	227	233	227
q3	3083	2945	2933	2933
q4	2110	1933	1844	1844
q5	5792	5777	5769	5769
q6	242	155	149	149
q7	2258	1817	1922	1817
q8	3374	3594	3444	3444
q9	8995	8953	8860	8860
q10	3616	3554	3524	3524
q11	579	485	500	485
q12	790	658	660	658
q13	10994	3170	3145	3145
q14	302	272	268	268
q15	574	545	526	526
q16	709	635	643	635
q17	1897	1626	1614	1614
q18	8403	7730	7584	7584
q19	1761	1455	1371	1371
q20	2125	1895	1925	1895
q21	5517	5433	5389	5389
q22	1134	1046	1036	1036
Total cold run time: 71955 ms
Total hot run time: 60426 ms

@doris-robot

TPC-DS: Total hot run time: 191870 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6bf48d04b3d46fb65c5c9d4b945309d4162b57be, data reload: false

query1	956	389	417	389
query2	6233	2076	2062	2062
query3	8685	194	198	194
query4	34022	23470	23434	23434
query5	3505	485	478	478
query6	271	170	168	168
query7	4211	324	309	309
query8	295	233	228	228
query9	9558	2691	2693	2691
query10	476	277	274	274
query11	17774	15145	15236	15145
query12	152	100	96	96
query13	1594	440	429	429
query14	9958	7613	7606	7606
query15	249	169	171	169
query16	7790	452	506	452
query17	1611	629	576	576
query18	2098	309	309	309
query19	329	187	165	165
query20	126	111	125	111
query21	220	106	108	106
query22	4612	4391	4624	4391
query23	35576	34057	34110	34057
query24	11026	2874	2836	2836
query25	617	422	399	399
query26	1170	163	162	162
query27	2365	294	299	294
query28	7559	2451	2419	2419
query29	807	437	419	419
query30	307	152	151	151
query31	1053	809	783	783
query32	100	54	53	53
query33	764	298	302	298
query34	917	515	497	497
query35	857	761	767	761
query36	1106	962	949	949
query37	158	82	90	82
query38	4029	3901	3872	3872
query39	1481	1421	1429	1421
query40	210	96	96	96
query41	49	44	44	44
query42	117	94	95	94
query43	544	507	481	481
query44	1312	821	819	819
query45	196	167	170	167
query46	1184	716	718	716
query47	1932	1821	1814	1814
query48	438	347	347	347
query49	977	418	412	412
query50	862	404	418	404
query51	7142	6976	6949	6949
query52	97	84	87	84
query53	252	175	180	175
query54	1207	463	476	463
query55	78	76	74	74
query56	278	261	252	252
query57	1259	1192	1148	1148
query58	225	233	246	233
query59	3339	3054	3040	3040
query60	297	253	269	253
query61	97	103	106	103
query62	856	672	679	672
query63	226	190	189	189
query64	4127	640	649	640
query65	3309	3212	3202	3202
query66	771	291	307	291
query67	15882	15321	15336	15321
query68	3931	591	573	573
query69	477	301	293	293
query70	1158	1129	1101	1101
query71	333	274	263	263
query72	7287	4000	3878	3878
query73	788	349	355	349
query74	9751	8979	9003	8979
query75	3418	2674	2676	2674
query76	2742	906	942	906
query77	551	292	307	292
query78	10384	9547	9493	9493
query79	1426	604	631	604
query80	928	462	454	454
query81	584	248	242	242
query82	743	140	139	139
query83	231	139	137	137
query84	262	79	73	73
query85	1284	296	288	288
query86	379	293	293	293
query87	4450	4365	4235	4235
query88	3321	2381	2360	2360
query89	414	286	287	286
query90	1844	184	186	184
query91	141	107	104	104
query92	63	46	47	46
query93	1075	531	550	531
query94	852	298	304	298
query95	346	255	251	251
query96	603	275	277	275
query97	3254	3144	3066	3066
query98	216	196	198	196
query99	1550	1300	1289	1289
Total cold run time: 296666 ms
Total hot run time: 191870 ms

@doris-robot

ClickBench: Total hot run time: 32.17 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6bf48d04b3d46fb65c5c9d4b945309d4162b57be, data reload: false

query1	0.04	0.03	0.04
query2	0.07	0.02	0.03
query3	0.24	0.07	0.06
query4	1.65	0.11	0.10
query5	0.49	0.51	0.50
query6	1.13	0.74	0.72
query7	0.02	0.01	0.02
query8	0.03	0.03	0.04
query9	0.56	0.50	0.50
query10	0.57	0.54	0.53
query11	0.14	0.10	0.10
query12	0.14	0.11	0.10
query13	0.61	0.58	0.59
query14	2.78	2.72	2.72
query15	0.90	0.82	0.85
query16	0.39	0.38	0.37
query17	1.00	1.00	1.06
query18	0.20	0.20	0.20
query19	1.93	1.76	1.98
query20	0.02	0.01	0.01
query21	15.35	0.62	0.62
query22	2.25	2.85	1.55
query23	17.03	1.03	0.91
query24	2.92	0.74	1.40
query25	0.21	0.17	0.09
query26	0.43	0.13	0.13
query27	0.05	0.04	0.04
query28	10.62	1.09	1.06
query29	12.52	3.25	3.25
query30	0.24	0.06	0.06
query31	2.91	0.38	0.37
query32	3.29	0.47	0.45
query33	3.04	3.02	3.04
query34	17.09	4.47	4.47
query35	4.49	4.48	4.50
query36	0.67	0.48	0.50
query37	0.07	0.06	0.06
query38	0.04	0.03	0.04
query39	0.03	0.02	0.02
query40	0.17	0.12	0.12
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.02
Total cold run time: 106.46 s
Total hot run time: 32.17 s

@csun5285 csun5285 force-pushed the inverted_index_file_writer branch from 6bf48d0 to 4f8864b Compare October 10, 2024 04:13
@csun5285

run buildall


clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.24% (9645/25900)
Line Coverage: 28.54% (79930/280078)
Region Coverage: 27.99% (41344/147721)
Branch Coverage: 24.60% (21049/85566)
Coverage Report: http://coverage.selectdb-in.cc/coverage/4f8864b42a9a838776993361e6f616bc5604be5b_4f8864b42a9a838776993361e6f616bc5604be5b/report/index.html


clang-tidy review says "All clean, LGTM! 👍"

@csun5285

run buildall


PR approved by at least one committer and no changes requested.


PR approved by anyone and no changes requested.


@qidaye qidaye left a comment

LGTM

@qidaye qidaye merged commit 4a08bae into apache:master Oct 30, 2024
24 of 29 checks passed
airborne12 pushed a commit that referenced this pull request Nov 4, 2024
### What problem does this PR solve?

Introduced by #41625

Related PR: #41625

Problem Summary:
In #41625, the index compaction process was modified. During index
compaction, the output rowset has not yet been built, so the output
rowset cannot be accessed at this point.
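
The constraint in this summary can be illustrated with a minimal sketch. The types below (`ToyRowset`, `ToyRowsetWriter`) and the function `index_compaction_target` are hypothetical and do not reproduce the actual fix. They only assume that whatever index compaction needs is already known to the writer, so it can be read from there instead of from the rowset that has not been built yet.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical types; the real Doris classes are not reproduced here.
struct ToyRowset {
    std::string rowset_id;
};

struct ToyRowsetWriter {
    std::string rowset_id;            // known to the writer up front
    std::optional<ToyRowset> output;  // stays empty until build() is called

    // Produces the output rowset; only valid to read `output` afterwards.
    void build() { output = ToyRowset{rowset_id}; }
};

// Index compaction reads what it needs from the writer, because the output
// rowset does not exist yet when this step runs.
inline std::string index_compaction_target(const ToyRowsetWriter& writer) {
    assert(!writer.output.has_value() && "index compaction runs before build()");
    return writer.rowset_id;
}
```

Modeling the output as `std::optional` makes the ordering explicit: any code that dereferences `output` before `build()` is visibly wrong rather than silently reading an object that is not there.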

### Check List (For Committer)

- Test

    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason

- Behavior changed:

    - [x] No.
    - [ ] Yes.

- Does this need documentation?

    - [x] No.
    - [ ] Yes.

- Release note

    None

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
airborne12 added a commit that referenced this pull request Nov 4, 2024
…er in segment compaction (#43114)

### What problem does this PR solve?

Related PR: #41625

Problem Summary:
Fix a BUAF problem. The stack trace looks like this:
```
stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
1# 0x00007F5370ABD520 in /lib/x86_64-linux-gnu/libc.so.6
2# pthread_kill at ./nptl/pthread_kill.c:89
3# raise at ../sysdeps/posix/raise.c:27
4# abort at ./stdlib/abort.c:81
5# __gnu_cxx::__verbose_terminate_handler() [clone .cold] at ../../../../libstdc++-v3/libsupc++/vterminate.cc:75
6# __cxxabiv1::__terminate(void (*)()) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
7# 0x000055D9E015B681 in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
8# 0x000055D9E015B7D4 in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
9# 0x000055D9E015BBC6 in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
10# void fmt::v7::detail::buffer::append(char const*, char const*) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
11# char const* fmt::v7::detail::parse_replacement_field, char, fmt::v7::basic_format_context, char> >&>(char const*, char const*, fmt::v7::detail::format_handler, char, fmt::v7::basic_format_context, char> >&) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
12# void fmt::v7::detail::vformat_to(fmt::v7::detail::buffer&, fmt::v7::basic_string_view, fmt::v7::basic_format_args::type>, fmt::v7::type_identity::type> >, fmt::v7::detail::locale_ref) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
13# fmt::v7::detail::vformat[abi:cxx11](fmt::v7::basic_string_view, fmt::v7::format_args) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
14# doris::segment_v2::InvertedIndexDescriptor::get_temporary_index_path[abi:cxx11](std::basic_string_view >, std::basic_string_view >, long, long, std::basic_string_view >) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_desc.cpp:35
15# doris::segment_v2::InvertedIndexFileWriter::open(doris::TabletIndex const*) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_file_writer.cpp:45
16# doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)7>::init_bkd_index() at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:146
17# doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)7>::init() at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:116
18# doris::segment_v2::InvertedIndexColumnWriter::create(doris::Field const*, std::unique_ptr >, doris::segment_v2::InvertedIndexFileWriter, doris::TabletIndex const*) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:698
19# doris::segment_v2::ScalarColumnWriter::init() at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/column_writer.cpp:483
20# doris::segment_v2::SegmentWriter::_create_column_writer(unsigned int, doris::TabletColumn const&, std::shared_ptr const&) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:258
21# doris::segment_v2::SegmentWriter::_create_writers(std::shared_ptr const&, std::vector > const&) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:307
22# doris::segment_v2::SegmentWriter::init(std::vector > const&, bool) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:276
23# doris::SegcompactionWorker::_do_compact_segments(std::shared_ptr, std::allocator > > >) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
24# doris::SegcompactionWorker::compact_segments(std::shared_ptr, std::allocator > > >) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segcompaction.cpp:354
25# doris::StorageEngine::_handle_seg_compaction(std::shared_ptr, std::shared_ptr, std::allocator > > >, unsigned long) at /home/zcp/repo_center/doris_master/doris/be/src/olap/olap_server.cpp:1121
26# std::_Function_handler, std::shared_ptr, std::allocator > > >)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
27# doris::ThreadPool::dispatch_thread() in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
28# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_master/doris/be/src/util/thread.cpp:499
29# start_thread at ./nptl/pthread_create.c:442
30# 0x00007F5370BA1850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
172.20.50.120 last coredump sql: last SQL query not found
```

### Check List (For Committer)

- Test

    - [ ] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason

- Behavior changed:

    - [x] No.
    - [ ] Yes.

- Does this need documentation?

    - [x] No.
    - [ ] Yes.

- Release note

    None

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
airborne12 added a commit that referenced this pull request Nov 4, 2024
### What problem does this PR solve?

Related PR: #41625 

Problem Summary:
Refine the index compaction test case; it does not need so much data.

### Check List (For Committer)

- Test <!-- At least one of them must be included. -->

    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:

    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?

    - [x] No.
    - [ ] Yes. <!-- Add document PR link here, e.g.
    apache/doris-website#1214 -->

- Release note

    <!-- bugfix, feat, behavior changed need a release note -->
    <!-- Add one line release note for this PR. -->
    None

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
csun5285 added a commit to csun5285/doris that referenced this pull request Nov 8, 2024
…he#41625)

## Proposed changes

1. After a normal segment is flushed, `close_inverted_index` is called
directly to write the final composite file.
2. During compaction, the `segment writer` first writes the `bkd index`
while writing normal data. Next, `index compaction` writes the
`string index`. Finally, `close_inverted_index` is called uniformly for
all indexes to write the final files.
3. The rowset writer uses `InvertedIndexFileCollection` to store all
inverted index file writers, so that they stay alive throughout the
entire writing or compaction process.
4. When the rowset writer generates the final rowset through
`build(rowset)`, it can retrieve the index file sizes from the
`InvertedIndexFileCollection` and record them in the rowset meta.
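
The lifecycle described in points 3 and 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: `SketchIndexFileWriter`, `SketchIndexFileCollection`, and all of their methods are hypothetical names, and writing the composite file is reduced to recording a byte count.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an inverted index file writer (not the Doris class).
class SketchIndexFileWriter {
public:
    // Steps 1 and 2: the segment writer and index compaction add index data.
    void add_index_bytes(int64_t bytes) {
        assert(!_closed && "no index data may be added after close()");
        _pending_bytes += bytes;
    }

    // Step 3: write the final composite file once; later calls are no-ops.
    void close() {
        if (_closed) {
            return;
        }
        _file_size = _pending_bytes; // a real writer would write to storage here
        _closed = true;
    }

    bool closed() const { return _closed; }
    int64_t file_size() const { return _file_size; }

private:
    int64_t _pending_bytes = 0;
    int64_t _file_size = 0;
    bool _closed = false;
};

// Hypothetical collection owned by the rowset writer. It keeps every file
// writer alive until the rowset is built, keyed by segment id.
class SketchIndexFileCollection {
public:
    // Takes ownership of the writer; returns nullptr if the id is already used.
    SketchIndexFileWriter* add(int32_t segment_id,
                               std::unique_ptr<SketchIndexFileWriter> writer) {
        std::lock_guard<std::mutex> lock(_mutex);
        auto result = _writers.emplace(segment_id, std::move(writer));
        return result.second ? result.first->second.get() : nullptr;
    }

    // Closes every writer so that all final index files are written.
    void close_all() {
        std::lock_guard<std::mutex> lock(_mutex);
        for (auto& entry : _writers) {
            entry.second->close();
        }
    }

    // Sums the final file sizes, e.g. to record them in the rowset meta.
    int64_t total_index_size() const {
        std::lock_guard<std::mutex> lock(_mutex);
        int64_t total = 0;
        for (const auto& entry : _writers) {
            total += entry.second->file_size();
        }
        return total;
    }

private:
    mutable std::mutex _mutex;
    std::unordered_map<int32_t, std::unique_ptr<SketchIndexFileWriter>> _writers;
};
```

In this sketch the collection, rather than each segment writer, owns the file writers, so a writer created while flushing a segment is still valid when a later index compaction step or the final rowset build needs it.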
airborne12 added a commit that referenced this pull request Nov 11, 2024
…e in segment compaction (#43477)

Related PR: #41625

Problem Summary:
Fix coredump when doing segcompaction after refactor of inverted index
file writer.

Co-authored-by: airborne12 <jiangkai@selectdb.com>
github-actions bot pushed a commit that referenced this pull request Nov 11, 2024
…e in segment compaction (#43477)

airborne12 pushed a commit that referenced this pull request Nov 11, 2024
…) (#43528)

pick from master #41625

---------

Co-authored-by: csun5285 <sunchenyang@selectdb.com>
csun5285 added a commit to csun5285/doris that referenced this pull request Nov 11, 2024
### What problem does this PR solve?

Introduced by apache#41625

Related PR: apache#41625

Problem Summary:
In apache#41625, the index compaction process was modified. During index
compaction, the output rowset has not yet been built, so the output
rowset cannot be accessed at this point.

### Check List (For Committer)

- Test <!-- At least one of them must be included. -->

    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:

    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?

    - [x] No.
    - [ ] Yes. <!-- Add document PR link here, e.g.
    apache/doris-website#1214 -->

- Release note

    <!-- bugfix, feat, behavior changed need a release note -->
    <!-- Add one line release note for this PR. -->
    None

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
airborne12 added a commit to airborne12/apache-doris that referenced this pull request Nov 11, 2024
…er in segment compaction (apache#43114)

### What problem does this PR solve?
Related PR: apache#41625

Problem Summary:
Fix a BUAF problem. The stack trace looks like this:
```
stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
1# 0x00007F5370ABD520 in /lib/x86_64-linux-gnu/libc.so.6
2# pthread_kill at ./nptl/pthread_kill.c:89
3# raise at ../sysdeps/posix/raise.c:27
4# abort at ./stdlib/abort.c:81
5# __gnu_cxx::__verbose_terminate_handler() [clone .cold] at ../../../../libstdc++-v3/libsupc++/vterminate.cc:75
6# __cxxabiv1::__terminate(void (*)()) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
7# 0x000055D9E015B681 in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
8# 0x000055D9E015B7D4 in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
9# 0x000055D9E015BBC6 in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
10# void fmt::v7::detail::buffer::append(char const*, char const*) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
11# char const* fmt::v7::detail::parse_replacement_field, char, fmt::v7::basic_format_context, char> >&>(char const*, char const*, fmt::v7::detail::format_handler, char, fmt::v7::basic_format_context, char> >&) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
12# void fmt::v7::detail::vformat_to(fmt::v7::detail::buffer&, fmt::v7::basic_string_view, fmt::v7::basic_format_args::type>, fmt::v7::type_identity::type> >, fmt::v7::detail::locale_ref) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
13# fmt::v7::detail::vformat[abi:cxx11](fmt::v7::basic_string_view, fmt::v7::format_args) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
14# doris::segment_v2::InvertedIndexDescriptor::get_temporary_index_path[abi:cxx11](std::basic_string_view >, std::basic_string_view >, long, long, std::basic_string_view >) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_desc.cpp:35
15# doris::segment_v2::InvertedIndexFileWriter::open(doris::TabletIndex const*) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_file_writer.cpp:45
16# doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)7>::init_bkd_index() at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:146
17# doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)7>::init() at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:116
18# doris::segment_v2::InvertedIndexColumnWriter::create(doris::Field const*, std::unique_ptr >, doris::segment_v2::InvertedIndexFileWriter, doris::TabletIndex const*) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:698
19# doris::segment_v2::ScalarColumnWriter::init() at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/column_writer.cpp:483
20# doris::segment_v2::SegmentWriter::_create_column_writer(unsigned int, doris::TabletColumn const&, std::shared_ptr const&) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:258
21# doris::segment_v2::SegmentWriter::_create_writers(std::shared_ptr const&, std::vector > const&) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:307
22# doris::segment_v2::SegmentWriter::init(std::vector > const&, bool) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:276
23# doris::SegcompactionWorker::_do_compact_segments(std::shared_ptr, std::allocator > > >) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
24# doris::SegcompactionWorker::compact_segments(std::shared_ptr, std::allocator > > >) at /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segcompaction.cpp:354
25# doris::StorageEngine::_handle_seg_compaction(std::shared_ptr, std::shared_ptr, std::allocator > > >, unsigned long) at /home/zcp/repo_center/doris_master/doris/be/src/olap/olap_server.cpp:1121
26# std::_Function_handler, std::shared_ptr, std::allocator > > >)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
27# doris::ThreadPool::dispatch_thread() in /mnt/hdd01/ci/master-deploy/be/lib/doris_be
28# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_master/doris/be/src/util/thread.cpp:499
29# start_thread at ./nptl/pthread_create.c:442
30# 0x00007F5370BA1850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
172.20.50.120 last coredump sql: last SQL query not found
```

### Check List (For Committer)

- Test <!-- At least one of them must be included. -->

    - [ ] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:

    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?

    - [x] No.
    - [ ] Yes. <!-- Add document PR link here, e.g.
    apache/doris-website#1214 -->

- Release note

    <!-- bugfix, feat, behavior changed need a release note -->
    <!-- Add one line release note for this PR. -->
    None

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
airborne12 added a commit to airborne12/apache-doris that referenced this pull request Nov 11, 2024
…e in segment compaction (apache#43477)

Related PR: apache#41625

Problem Summary:
Fix coredump when doing segcompaction after refactor of inverted index
file writer.

Co-authored-by: airborne12 <jiangkai@selectdb.com>
yiguolei pushed a commit that referenced this pull request Nov 16, 2024
…43962)

### What problem does this PR solve?
**Problem:**
After a restore from another cluster, rowsets get different index_id
values, which makes index compaction during base compaction always fail.

**Fix:**
On the master branch, PR #41625 already fixes this.
This PR picks it to branch-2.1.
csun5285 pushed a commit to csun5285/doris that referenced this pull request Nov 21, 2024
### What problem does this PR solve?
Related PR: apache#41625 

Problem Summary:
Refine the index compaction test case; it does not need so much data.

### Check List (For Committer)

- Test <!-- At least one of them must be included. -->

    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:

    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?

    - [x] No.
    - [ ] Yes. <!-- Add document PR link here, e.g.
    apache/doris-website#1214 -->

- Release note

    <!-- bugfix, feat, behavior changed need a release note -->
    <!-- Add one line release note for this PR. -->
    None

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.3-merged reviewed
5 participants