[fix](jdbc catalog) Fix Memory Leak by Enabling Weak References in HikariCP #39582

Merged
merged 1 commit into from
Aug 23, 2024

@zy-kkk zy-kkk commented Aug 19, 2024

This PR fixes a memory leak caused by HikariCP FastList objects that are retained by ThreadLocal variables. In long-running JNI threads these thread-local values are not easily garbage collected, so the lists stay reachable. As a mitigation, the system property com.zaxxer.hikari.useWeakReferences is set to true, so that WeakReference is used for the ThreadLocal objects and the garbage collector can reclaim the memory more effectively.
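As a minimal sketch of the mitigation described above, the property can be set programmatically with the standard System.setProperty call. The class and method names here are hypothetical and are not taken from the Doris code base; the sketch also assumes the property has to be in place before the first connection pool is created, which the PR description does not state explicitly.

```java
final class HikariWeakRefSketch {
    // Property name taken from the PR description.
    static final String USE_WEAK_REFERENCES = "com.zaxxer.hikari.useWeakReferences";

    /** Sets the property process-wide and reports whether it now reads as true. */
    static boolean enableWeakReferences() {
        System.setProperty(USE_WEAK_REFERENCES, "true");
        // Boolean.getBoolean returns true only if the property exists and equals "true".
        return Boolean.getBoolean(USE_WEAK_REFERENCES);
    }

    public static void main(String[] args) {
        // Assumption: this runs before any connection pool is constructed.
        boolean enabled = enableWeakReferences();
        System.out.println(USE_WEAK_REFERENCES + "=" + enabled);
    }
}
```

The same property could alternatively be passed at JVM startup as -Dcom.zaxxer.hikari.useWeakReferences=true, which avoids any ordering concern inside the process.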
Enabling this setting costs some performance, but fixing the resource leak matters more.
Performance before and after enabling the setting:
Before:
10 concurrency: 0.02-0.05
100 concurrency: 0.18-0.4
After:
10 concurrency: 0.02-0.07
100 concurrency: 0.18-0.7
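To put the reported ranges in relative terms, the following sketch computes how much the upper bound of each range grows. The class and method names are illustrative only, and because the PR does not state the unit of the measurements, only ratios are computed.

```java
final class OverheadSketch {
    /** Relative increase of `after` over `before`, as a fraction (0.4 means +40%). */
    static double relativeIncrease(double before, double after) {
        return (after - before) / before;
    }

    public static void main(String[] args) {
        // Upper bounds of the ranges reported in the PR description.
        System.out.printf("10 concurrency:  +%.0f%%%n", 100 * relativeIncrease(0.05, 0.07));
        System.out.printf("100 concurrency: +%.0f%%%n", 100 * relativeIncrease(0.4, 0.7));
    }
}
```

The lower bounds are unchanged at both concurrency levels, while the upper bound rises by about 40% at 10 concurrency and about 75% at 100 concurrency.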

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@zy-kkk
Member Author

zy-kkk commented Aug 19, 2024

run buildall

morningman
morningman previously approved these changes Aug 19, 2024
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the "approved" label (indicates a PR has been approved by one committer) Aug 19, 2024
Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 38143 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7c1bae980f5f45e65e89fde188e6920842b8553e, data reload: false

------ Round 1 ----------------------------------
q1	17896	4358	4276	4276
q2	2056	223	224	223
q3	10425	1151	1084	1084
q4	10175	749	782	749
q5	7786	2832	2748	2748
q6	260	162	162	162
q7	998	671	645	645
q8	9409	2085	2062	2062
q9	7091	6542	6530	6530
q10	7072	2259	2172	2172
q11	517	278	279	278
q12	444	264	263	263
q13	17792	3051	2995	2995
q14	294	256	259	256
q15	560	512	537	512
q16	542	410	408	408
q17	979	703	766	703
q18	7519	6837	6757	6757
q19	5599	1091	1014	1014
q20	690	366	358	358
q21	3911	3028	2922	2922
q22	1128	1047	1026	1026
Total cold run time: 113143 ms
Total hot run time: 38143 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4591	4329	4312	4312
q2	423	315	319	315
q3	2863	2654	2639	2639
q4	1917	1620	1679	1620
q5	5637	5684	5626	5626
q6	234	152	155	152
q7	2226	1784	1811	1784
q8	3291	3403	3454	3403
q9	8703	8560	8828	8560
q10	3511	3320	3312	3312
q11	618	521	528	521
q12	829	668	673	668
q13	16461	3125	3174	3125
q14	339	309	306	306
q15	568	522	508	508
q16	501	446	461	446
q17	1806	1558	1507	1507
q18	8827	8035	7984	7984
q19	4642	1648	1474	1474
q20	2106	1896	1899	1896
q21	13844	5354	5261	5261
q22	1167	1068	1082	1068
Total cold run time: 85104 ms
Total hot run time: 56487 ms

@zy-kkk zy-kkk marked this pull request as draft August 19, 2024 16:45
@github-actions github-actions bot removed the "approved" label (indicates a PR has been approved by one committer) Aug 21, 2024
@zy-kkk zy-kkk changed the title from "[fix](jdbc catalog) Fix HikariDataSource Resource Leak in JdbcExecutor" to "[fix](jdbc catalog) Fix Memory Leak by Enabling Weak References in HikariCP" Aug 21, 2024
@zy-kkk zy-kkk marked this pull request as ready for review August 21, 2024 16:10
@zy-kkk
Member Author

zy-kkk commented Aug 21, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 39052 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 52dad197127dc7dc6e5800807c2e8c1450b232c1, data reload: false

------ Round 1 ----------------------------------
q1	18273	4516	4449	4449
q2	2908	212	217	212
q3	11626	1179	1182	1179
q4	10299	794	787	787
q5	7845	2940	2868	2868
q6	276	160	161	160
q7	1033	678	657	657
q8	9394	2152	2163	2152
q9	7210	6582	6585	6582
q10	7041	2235	2259	2235
q11	487	271	279	271
q12	426	269	266	266
q13	17890	3072	3093	3072
q14	319	251	258	251
q15	548	512	513	512
q16	517	404	407	404
q17	999	721	732	721
q18	7437	6817	6929	6817
q19	1452	1038	1099	1038
q20	706	368	349	349
q21	3871	2986	3178	2986
q22	1131	1088	1084	1084
Total cold run time: 111688 ms
Total hot run time: 39052 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4351	4368	4331	4331
q2	400	305	297	297
q3	2927	2717	2608	2608
q4	1954	1696	1700	1696
q5	5422	5418	5419	5418
q6	234	146	145	145
q7	2125	1792	1806	1792
q8	3265	3374	3348	3348
q9	8486	8422	8471	8422
q10	3491	3240	3175	3175
q11	639	540	535	535
q12	853	642	665	642
q13	14540	3046	3071	3046
q14	325	295	297	295
q15	588	533	537	533
q16	513	466	461	461
q17	1814	1551	1498	1498
q18	7850	7439	7409	7409
q19	1739	1721	1648	1648
q20	2074	1859	1864	1859
q21	5590	5212	5312	5212
q22	1186	1092	1067	1067
Total cold run time: 70366 ms
Total hot run time: 55437 ms

@doris-robot

TPC-DS: Total hot run time: 192655 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 52dad197127dc7dc6e5800807c2e8c1450b232c1, data reload: false

query1	955	389	382	382
query2	6740	2056	1974	1974
query3	6679	232	242	232
query4	34140	23269	23245	23245
query5	4368	723	702	702
query6	315	215	212	212
query7	4612	334	327	327
query8	485	441	430	430
query9	8651	2561	2556	2556
query10	513	344	346	344
query11	16031	15121	15252	15121
query12	206	140	138	138
query13	1707	475	463	463
query14	10617	7448	6734	6734
query15	291	192	191	191
query16	8070	473	496	473
query17	1820	599	588	588
query18	2175	344	359	344
query19	363	165	168	165
query20	145	138	139	138
query21	252	142	142	142
query22	4402	4205	4187	4187
query23	34167	33423	33793	33423
query24	11189	2955	2985	2955
query25	661	427	419	419
query26	1209	185	195	185
query27	2757	305	303	303
query28	7340	2120	2112	2112
query29	868	461	450	450
query30	339	189	188	188
query31	1047	850	845	845
query32	123	85	86	85
query33	821	362	349	349
query34	910	524	526	524
query35	890	774	787	774
query36	1128	953	967	953
query37	175	106	108	106
query38	4019	3866	3967	3866
query39	1532	1466	1448	1448
query40	243	158	157	157
query41	140	136	165	136
query42	149	126	158	126
query43	562	508	524	508
query44	1245	802	808	802
query45	228	201	201	201
query46	1138	803	824	803
query47	1982	1833	1859	1833
query48	416	349	347	347
query49	1182	591	597	591
query50	867	485	480	480
query51	7301	7038	7203	7038
query52	119	109	108	108
query53	297	238	239	238
query54	1014	525	504	504
query55	91	89	93	89
query56	335	318	319	318
query57	1254	1162	1099	1099
query58	323	298	324	298
query59	3126	2951	2996	2951
query60	358	332	329	329
query61	174	148	146	146
query62	898	719	721	719
query63	273	229	225	225
query64	5978	2371	1842	1842
query65	3280	3213	3192	3192
query66	1550	666	675	666
query67	15699	15452	15327	15327
query68	4935	600	600	600
query69	472	320	322	320
query70	1234	1214	1125	1125
query71	487	325	323	323
query72	6548	2365	2101	2101
query73	805	364	369	364
query74	9430	9019	8969	8969
query75	3573	2781	2826	2781
query76	2813	1057	1025	1025
query77	629	445	457	445
query78	11429	9644	9141	9141
query79	1979	574	567	567
query80	965	621	606	606
query81	621	269	262	262
query82	583	156	161	156
query83	286	216	214	214
query84	260	101	97	97
query85	777	362	349	349
query86	492	316	328	316
query87	4427	4291	4314	4291
query88	4029	2550	2549	2549
query89	424	330	326	326
query90	1951	244	238	238
query91	157	128	127	127
query92	91	77	75	75
query93	1139	565	558	558
query94	798	317	336	317
query95	399	301	303	301
query96	612	294	285	285
query97	3251	3129	3136	3129
query98	246	225	224	224
query99	1687	1343	1328	1328
Total cold run time: 300463 ms
Total hot run time: 192655 ms

@doris-robot

ClickBench: Total hot run time: 31.81 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 52dad197127dc7dc6e5800807c2e8c1450b232c1, data reload: false

query1	0.05	0.05	0.04
query2	0.08	0.04	0.04
query3	0.22	0.05	0.05
query4	1.66	0.07	0.07
query5	0.54	0.50	0.49
query6	1.12	0.74	0.72
query7	0.02	0.01	0.01
query8	0.06	0.05	0.04
query9	0.55	0.49	0.48
query10	0.54	0.56	0.55
query11	0.16	0.13	0.12
query12	0.15	0.13	0.12
query13	0.62	0.59	0.58
query14	0.76	0.79	0.77
query15	0.85	0.83	0.84
query16	0.36	0.35	0.36
query17	1.05	1.02	0.97
query18	0.22	0.21	0.20
query19	1.86	1.76	1.76
query20	0.02	0.01	0.02
query21	15.40	0.67	0.66
query22	4.06	6.34	2.58
query23	18.32	1.36	1.32
query24	2.26	0.22	0.22
query25	0.17	0.09	0.09
query26	0.26	0.19	0.17
query27	0.09	0.08	0.07
query28	13.14	1.03	1.01
query29	12.67	3.43	3.43
query30	0.42	0.20	0.19
query31	2.80	0.40	0.40
query32	3.26	0.48	0.48
query33	2.94	2.98	3.01
query34	17.20	4.41	4.36
query35	4.44	4.42	4.44
query36	0.67	0.47	0.52
query37	0.21	0.17	0.18
query38	0.18	0.16	0.16
query39	0.07	0.06	0.07
query40	0.18	0.16	0.16
query41	0.12	0.08	0.07
query42	0.08	0.07	0.07
query43	0.07	0.07	0.07
Total cold run time: 109.9 s
Total hot run time: 31.81 s

@zy-kkk
Member Author

zy-kkk commented Aug 22, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 38892 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 52dad197127dc7dc6e5800807c2e8c1450b232c1, data reload: false

------ Round 1 ----------------------------------
q1	17866	5114	4409	4409
q2	2060	221	226	221
q3	11706	997	1229	997
q4	10521	781	768	768
q5	7794	2905	2869	2869
q6	266	158	162	158
q7	1028	663	664	663
q8	9411	2138	2135	2135
q9	7341	6601	6590	6590
q10	7089	2257	2262	2257
q11	497	284	281	281
q12	438	271	268	268
q13	17779	3056	3067	3056
q14	302	258	256	256
q15	571	537	533	533
q16	558	428	414	414
q17	1004	703	787	703
q18	7457	6878	6810	6810
q19	1458	1114	1148	1114
q20	714	365	378	365
q21	3995	2992	3035	2992
q22	1166	1045	1033	1033
Total cold run time: 111021 ms
Total hot run time: 38892 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4446	4389	4382	4382
q2	429	315	322	315
q3	2910	2711	2729	2711
q4	1967	1668	1706	1668
q5	5692	5738	5721	5721
q6	247	152	166	152
q7	2299	1864	1855	1855
q8	3336	3478	3463	3463
q9	8845	8873	8833	8833
q10	3605	3374	3450	3374
q11	645	539	546	539
q12	900	671	705	671
q13	17100	3231	3283	3231
q14	332	301	302	301
q15	561	526	542	526
q16	527	467	475	467
q17	1863	1569	1598	1569
q18	8238	7939	7791	7791
q19	4211	1646	1731	1646
q20	2164	1921	1909	1909
q21	5664	5471	5507	5471
q22	1181	1096	1087	1087
Total cold run time: 77162 ms
Total hot run time: 57682 ms

@doris-robot

TPC-DS: Total hot run time: 198387 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 52dad197127dc7dc6e5800807c2e8c1450b232c1, data reload: false

query1	1324	911	875	875
query2	6581	1992	2022	1992
query3	10641	3910	3970	3910
query4	59064	23864	23557	23557
query5	5948	741	735	735
query6	481	214	214	214
query7	5800	342	345	342
query8	520	456	438	438
query9	9246	2568	2552	2552
query10	626	354	346	346
query11	18108	15162	15446	15162
query12	212	148	144	144
query13	1644	462	460	460
query14	11568	7623	7394	7394
query15	279	192	195	192
query16	7622	512	495	495
query17	1233	620	631	620
query18	2003	353	353	353
query19	300	175	175	175
query20	160	140	136	136
query21	252	144	144	144
query22	4543	4360	4291	4291
query23	34413	33986	33864	33864
query24	6048	3058	3013	3013
query25	579	431	435	431
query26	705	196	191	191
query27	1740	310	311	310
query28	3940	2175	2155	2155
query29	747	478	472	472
query30	232	196	192	192
query31	1027	901	838	838
query32	110	77	81	77
query33	508	344	348	344
query34	893	530	520	520
query35	876	753	787	753
query36	1093	952	971	952
query37	167	109	105	105
query38	3896	3935	3874	3874
query39	1559	1470	1472	1470
query40	238	157	157	157
query41	144	139	144	139
query42	143	122	119	119
query43	545	491	516	491
query44	1124	791	812	791
query45	224	199	199	199
query46	1138	778	807	778
query47	1992	1894	1835	1835
query48	416	346	347	346
query49	933	594	589	589
query50	880	500	495	495
query51	7308	7091	7160	7091
query52	118	111	109	109
query53	297	223	226	223
query54	620	514	521	514
query55	91	88	91	88
query56	340	325	329	325
query57	1234	1131	1138	1131
query58	303	327	327	327
query59	3021	2840	2637	2637
query60	353	332	336	332
query61	153	153	154	153
query62	810	716	704	704
query63	271	231	225	225
query64	4470	2412	1928	1928
query65	3251	3222	3229	3222
query66	1014	691	686	686
query67	15301	15176	15486	15176
query68	4516	597	615	597
query69	478	321	324	321
query70	1236	1169	1141	1141
query71	422	316	324	316
query72	6577	2385	2156	2156
query73	791	420	372	372
query74	9260	8814	8962	8814
query75	3467	2791	2789	2789
query76	2013	1041	1147	1041
query77	688	474	440	440
query78	9735	9102	9093	9093
query79	1075	574	568	568
query80	864	624	647	624
query81	583	268	272	268
query82	304	156	157	156
query83	273	221	216	216
query84	289	103	103	103
query85	957	363	367	363
query86	377	324	320	320
query87	4475	4359	4331	4331
query88	3198	2552	2561	2552
query89	436	326	322	322
query90	2094	239	239	239
query91	155	129	129	129
query92	86	79	77	77
query93	1097	558	557	557
query94	833	335	330	330
query95	412	316	304	304
query96	606	285	288	285
query97	3261	3137	3121	3121
query98	232	231	233	231
query99	1563	1337	1313	1313
Total cold run time: 315773 ms
Total hot run time: 198387 ms

@doris-robot

ClickBench: Total hot run time: 31.42 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 52dad197127dc7dc6e5800807c2e8c1450b232c1, data reload: false

query1	0.05	0.05	0.04
query2	0.08	0.05	0.04
query3	0.23	0.05	0.05
query4	1.66	0.08	0.07
query5	0.53	0.49	0.53
query6	1.13	0.73	0.72
query7	0.02	0.02	0.02
query8	0.06	0.05	0.06
query9	0.54	0.49	0.48
query10	0.55	0.54	0.55
query11	0.17	0.12	0.12
query12	0.15	0.13	0.13
query13	0.64	0.59	0.60
query14	0.80	0.80	0.78
query15	0.85	0.83	0.83
query16	0.38	0.38	0.37
query17	0.99	1.04	0.97
query18	0.21	0.21	0.21
query19	1.92	1.82	1.78
query20	0.01	0.01	0.01
query21	15.42	0.68	0.67
query22	4.27	7.30	2.05
query23	18.29	1.39	1.40
query24	2.11	0.24	0.24
query25	0.15	0.09	0.09
query26	0.28	0.19	0.19
query27	0.10	0.09	0.08
query28	13.23	1.03	1.00
query29	12.64	3.49	3.45
query30	0.44	0.25	0.19
query31	2.81	0.40	0.39
query32	3.24	0.48	0.48
query33	3.00	2.98	2.98
query34	17.08	4.38	4.35
query35	4.45	4.41	4.44
query36	0.66	0.51	0.48
query37	0.21	0.16	0.17
query38	0.17	0.16	0.16
query39	0.06	0.06	0.05
query40	0.19	0.15	0.14
query41	0.11	0.07	0.07
query42	0.07	0.06	0.07
query43	0.07	0.05	0.06
Total cold run time: 110.02 s
Total hot run time: 31.42 s

@morningman morningman merged commit 2c154c6 into apache:master Aug 23, 2024
29 of 31 checks passed
@github-actions github-actions bot added the "approved" label (indicates a PR has been approved by one committer) Aug 23, 2024
Contributor

PR approved by at least one committer and no changes requested.

@zy-kkk zy-kkk deleted the fix_hikari_lead branch August 23, 2024 06:54
zy-kkk added a commit to zy-kkk/doris that referenced this pull request Aug 23, 2024
…kariCP (apache#39582)

zy-kkk added a commit to zy-kkk/doris that referenced this pull request Aug 23, 2024
…kariCP (apache#39582)

zy-kkk added a commit that referenced this pull request Aug 23, 2024
…rences in HikariCP (#39835)

pick (#39582)

morningman pushed a commit that referenced this pull request Aug 23, 2024
dataroaring pushed a commit that referenced this pull request Aug 26, 2024
…kariCP (#39582)

@yiguolei yiguolei mentioned this pull request Sep 5, 2024
3 tasks
5 participants