@BePPPower
Contributor

@BePPPower BePPPower commented Feb 17, 2025

Problem Summary:

Each Export task is deleted after execution, in the finally block of TaskHandler#onTransientTaskHandle(). Therefore, when removing a task in TransientTaskManager#cancelMemoryTask, we must first check that the task still exists. Otherwise a NullPointerException may be thrown, which causes the Export Job status update to fail and leaves the job stuck on EXPORTING.

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Feb 17, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@BePPPower
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31573 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 13e149df25405cd9383f4614d5f0687394857ef2, data reload: false

------ Round 1 ----------------------------------
q1	17582	5286	5166	5166
q2	2060	302	165	165
q3	10402	1288	718	718
q4	10226	998	541	541
q5	7547	2373	2300	2300
q6	183	171	136	136
q7	902	720	598	598
q8	9300	1272	1059	1059
q9	5600	4614	4692	4614
q10	6820	2333	1874	1874
q11	476	281	262	262
q12	348	352	220	220
q13	17757	3715	3118	3118
q14	248	237	207	207
q15	518	469	453	453
q16	634	612	576	576
q17	563	872	333	333
q18	6970	6247	6256	6247
q19	1196	933	546	546
q20	302	328	187	187
q21	2743	2137	1947	1947
q22	363	339	306	306
Total cold run time: 102740 ms
Total hot run time: 31573 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5159	5195	5168	5168
q2	230	331	239	239
q3	2162	2701	2318	2318
q4	1472	1870	1449	1449
q5	4214	4122	4164	4122
q6	206	162	124	124
q7	1849	1805	1727	1727
q8	2619	2720	2598	2598
q9	7130	7241	7203	7203
q10	2922	3194	2801	2801
q11	589	501	488	488
q12	678	761	608	608
q13	3464	3858	3266	3266
q14	278	294	281	281
q15	529	467	445	445
q16	653	686	654	654
q17	1123	1612	1332	1332
q18	7504	7447	7270	7270
q19	804	840	927	840
q20	1965	2059	1895	1895
q21	5298	4836	4839	4836
q22	628	631	549	549
Total cold run time: 51476 ms
Total hot run time: 50213 ms

@doris-robot

TPC-DS: Total hot run time: 190806 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 13e149df25405cd9383f4614d5f0687394857ef2, data reload: false

query1	1302	946	955	946
query2	6143	1854	1859	1854
query3	11110	4765	4696	4696
query4	25940	24023	23448	23448
query5	5470	676	493	493
query6	316	209	211	209
query7	3994	506	308	308
query8	299	243	244	243
query9	8507	2521	2526	2521
query10	479	334	241	241
query11	15461	15178	14974	14974
query12	162	108	104	104
query13	1566	512	379	379
query14	9012	7126	6432	6432
query15	215	189	170	170
query16	7637	694	494	494
query17	1096	732	586	586
query18	2049	414	343	343
query19	203	216	181	181
query20	127	121	124	121
query21	214	132	121	121
query22	4440	4490	4342	4342
query23	34603	33480	33632	33480
query24	7009	2438	2455	2438
query25	517	468	407	407
query26	778	276	155	155
query27	2238	503	340	340
query28	3769	2448	2409	2409
query29	613	571	434	434
query30	223	186	162	162
query31	915	879	788	788
query32	76	66	62	62
query33	518	386	303	303
query34	1084	875	505	505
query35	821	838	797	797
query36	988	1000	902	902
query37	125	104	69	69
query38	4387	4372	4382	4372
query39	1520	1485	1468	1468
query40	203	124	106	106
query41	58	50	52	50
query42	137	114	106	106
query43	505	530	476	476
query44	1291	814	815	814
query45	196	178	173	173
query46	892	1071	676	676
query47	1840	1838	1793	1793
query48	390	446	321	321
query49	724	529	427	427
query50	705	757	436	436
query51	4338	4321	4367	4321
query52	119	106	96	96
query53	239	268	187	187
query54	503	515	410	410
query55	92	85	85	85
query56	286	260	273	260
query57	1182	1199	1157	1157
query58	266	257	255	255
query59	2706	2950	2697	2697
query60	282	277	255	255
query61	116	115	118	115
query62	778	724	654	654
query63	225	198	192	192
query64	2676	983	651	651
query65	3220	3137	3147	3137
query66	796	410	318	318
query67	15916	15827	15261	15261
query68	4359	803	560	560
query69	471	301	272	272
query70	1213	1088	1114	1088
query71	394	288	265	265
query72	5196	3562	3788	3562
query73	721	725	346	346
query74	8870	9163	8709	8709
query75	3178	3152	2716	2716
query76	3035	1200	760	760
query77	466	402	276	276
query78	10024	10123	9302	9302
query79	1651	816	583	583
query80	1697	535	461	461
query81	536	276	233	233
query82	413	127	105	105
query83	277	185	160	160
query84	235	104	74	74
query85	783	348	306	306
query86	408	338	275	275
query87	4395	4711	4547	4547
query88	2859	2183	2191	2183
query89	388	318	279	279
query90	1582	193	196	193
query91	137	140	110	110
query92	62	56	61	56
query93	1207	1024	587	587
query94	674	406	302	302
query95	355	274	259	259
query96	488	548	270	270
query97	2802	2888	2741	2741
query98	216	208	216	208
query99	1322	1427	1292	1292
Total cold run time: 267021 ms
Total hot run time: 190806 ms

@doris-robot

ClickBench: Total hot run time: 30.36 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 13e149df25405cd9383f4614d5f0687394857ef2, data reload: false

query1	0.04	0.04	0.03
query2	0.07	0.05	0.04
query3	0.24	0.06	0.07
query4	1.62	0.11	0.10
query5	0.41	0.41	0.39
query6	1.18	0.67	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.59	0.53	0.52
query10	0.58	0.58	0.57
query11	0.15	0.10	0.10
query12	0.14	0.11	0.11
query13	0.63	0.60	0.60
query14	2.77	2.72	2.73
query15	0.92	0.87	0.87
query16	0.38	0.37	0.36
query17	1.01	1.04	1.04
query18	0.21	0.19	0.20
query19	1.96	1.79	2.00
query20	0.02	0.02	0.01
query21	15.34	0.89	0.54
query22	0.75	1.26	0.70
query23	14.81	1.36	0.60
query24	10.59	1.52	0.49
query25	0.57	0.20	0.11
query26	0.96	0.19	0.15
query27	0.05	0.05	0.05
query28	6.35	0.78	0.43
query29	12.58	3.89	3.24
query30	0.26	0.09	0.06
query31	2.82	0.58	0.39
query32	3.23	0.55	0.47
query33	2.99	3.02	3.11
query34	15.68	5.12	4.54
query35	4.55	4.55	4.59
query36	0.68	0.50	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.02
query40	0.18	0.14	0.13
query41	0.08	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 105.7 s
Total hot run time: 30.36 s


public void cancelMemoryTask(Long taskId) throws JobException {
try {
if (taskExecutorMap.containsKey(taskId)) {
Contributor

I suggest getting the task from the map first, then checking whether it is null before doing the cancel.
The task may be removed after containsKey() and before get(),
because these two calls are not atomic when used together.
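
A minimal sketch of the pattern suggested above, not the actual Doris code. `SketchTask` and `CancelSketch` are hypothetical stand-ins, and the map is assumed to be a `ConcurrentHashMap`; only the name `taskExecutorMap` comes from this PR.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the real task type; only cancel() matters here.
class SketchTask {
    final AtomicBoolean cancelled = new AtomicBoolean(false);

    void cancel() {
        cancelled.set(true);
    }
}

class CancelSketch {
    // Assumption: the real taskExecutorMap is a concurrent map keyed by task id.
    static final ConcurrentHashMap<Long, SketchTask> taskExecutorMap = new ConcurrentHashMap<>();

    // Racy: another thread may remove the entry between containsKey() and get().
    static boolean cancelRacy(Long taskId) {
        if (taskExecutorMap.containsKey(taskId)) {
            taskExecutorMap.get(taskId).cancel(); // get() may return null here
            return true;
        }
        return false;
    }

    // Safer: one lookup, then a null check on the local reference.
    static boolean cancelSafe(Long taskId) {
        SketchTask task = taskExecutorMap.get(taskId);
        if (task == null) {
            return false; // task already finished and was removed; nothing to cancel
        }
        task.cancel();
        return true;
    }

    public static void main(String[] args) {
        taskExecutorMap.put(1L, new SketchTask());
        System.out.println(cancelSafe(1L)); // task present
        System.out.println(cancelSafe(2L)); // task absent, no exception
    }
}
```

Holding the result of a single `get()` in a local variable means a concurrent removal can no longer turn the cancel call into a null dereference.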

private void exportExportJob() {
private void exportExportJob() throws JobException {
if (getState() == ExportJobState.CANCELLED || getState() == ExportJobState.FINISHED) {
throw new JobException("export job has been cancelled or finished.");
Contributor

Print the job state in error msg.
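
One way this could look, as a sketch only. `ExportStateSketch`, `checkCanExport`, and the nested `JobException` are hypothetical stand-ins for the real classes; the `PENDING` constant and the exact message wording are assumptions.

```java
class ExportStateSketch {
    // CANCELLED and FINISHED appear in the diff above; the other constants are assumed.
    enum ExportJobState { PENDING, EXPORTING, FINISHED, CANCELLED }

    // Hypothetical stand-in for the project's JobException.
    static class JobException extends Exception {
        JobException(String message) {
            super(message);
        }
    }

    // Rejects terminal states and reports which state the job was actually in.
    static void checkCanExport(ExportJobState state) throws JobException {
        if (state == ExportJobState.CANCELLED || state == ExportJobState.FINISHED) {
            throw new JobException(
                    "export job has been cancelled or finished, current state: " + state);
        }
    }

    public static void main(String[] args) {
        try {
            checkCanExport(ExportJobState.FINISHED);
        } catch (JobException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Reading the state once into a local variable before the check also keeps the value in the message consistent with the value that was tested.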

@BePPPower
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31625 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2dee6c4ede2529604d933dd2fd259a1f48616782, data reload: false

------ Round 1 ----------------------------------
q1	17602	5243	5064	5064
q2	2046	301	187	187
q3	10398	1259	740	740
q4	10274	1008	553	553
q5	8217	2384	2364	2364
q6	188	166	132	132
q7	902	760	598	598
q8	9321	1296	1116	1116
q9	4826	4671	4720	4671
q10	6841	2326	1880	1880
q11	474	281	252	252
q12	344	345	227	227
q13	18019	3724	3086	3086
q14	227	235	204	204
q15	494	475	459	459
q16	624	601	576	576
q17	574	845	371	371
q18	6581	6341	6201	6201
q19	1210	945	528	528
q20	303	307	191	191
q21	2726	2110	1915	1915
q22	352	328	310	310
Total cold run time: 102543 ms
Total hot run time: 31625 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5090	5062	5067	5062
q2	229	323	227	227
q3	2128	2657	2311	2311
q4	1441	1832	1334	1334
q5	4224	4114	4135	4114
q6	201	163	122	122
q7	1870	1794	1671	1671
q8	2617	2695	2551	2551
q9	7237	7077	7190	7077
q10	3010	3203	2769	2769
q11	625	537	510	510
q12	699	766	645	645
q13	3399	3904	3288	3288
q14	275	290	264	264
q15	511	448	460	448
q16	639	679	637	637
q17	1112	1591	1350	1350
q18	7583	7483	7406	7406
q19	776	815	799	799
q20	1992	2004	1849	1849
q21	5374	4884	4767	4767
q22	608	630	564	564
Total cold run time: 51640 ms
Total hot run time: 49765 ms

@doris-robot

TPC-DS: Total hot run time: 189967 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2dee6c4ede2529604d933dd2fd259a1f48616782, data reload: false

query1	1345	952	935	935
query2	6184	1814	1796	1796
query3	11002	4534	4455	4455
query4	54535	24818	23500	23500
query5	4806	538	483	483
query6	316	181	181	181
query7	4893	506	285	285
query8	284	226	224	224
query9	5351	2532	2495	2495
query10	421	310	264	264
query11	15090	15051	14958	14958
query12	154	106	108	106
query13	1021	521	375	375
query14	10126	6735	6274	6274
query15	223	198	187	187
query16	7103	646	473	473
query17	1046	714	545	545
query18	1569	412	305	305
query19	197	185	157	157
query20	121	122	129	122
query21	204	132	111	111
query22	4546	4773	4472	4472
query23	34128	33446	33520	33446
query24	5813	2458	2403	2403
query25	439	463	412	412
query26	677	278	159	159
query27	1802	498	336	336
query28	2771	2458	2431	2431
query29	590	578	467	467
query30	218	193	156	156
query31	892	862	819	819
query32	78	66	68	66
query33	455	406	306	306
query34	758	871	516	516
query35	779	841	763	763
query36	952	1016	997	997
query37	123	97	69	69
query38	4295	4272	4182	4182
query39	1479	1444	1474	1444
query40	206	118	102	102
query41	50	48	51	48
query42	128	102	102	102
query43	494	513	501	501
query44	1316	812	792	792
query45	175	169	163	163
query46	895	1066	666	666
query47	1865	1874	1741	1741
query48	387	424	313	313
query49	694	547	426	426
query50	743	756	414	414
query51	4314	4347	4283	4283
query52	112	105	91	91
query53	230	258	190	190
query54	471	501	413	413
query55	82	76	81	76
query56	244	282	261	261
query57	1169	1193	1118	1118
query58	256	240	237	237
query59	2663	2751	2608	2608
query60	286	292	289	289
query61	155	139	137	137
query62	743	770	700	700
query63	235	195	196	195
query64	1591	1047	671	671
query65	3374	3217	3117	3117
query66	787	388	297	297
query67	15893	15505	15327	15327
query68	5625	769	513	513
query69	524	379	266	266
query70	1199	1138	1135	1135
query71	441	296	254	254
query72	6360	3556	3716	3556
query73	1335	750	355	355
query74	9008	8950	8846	8846
query75	3225	3136	2706	2706
query76	3838	1175	734	734
query77	547	364	283	283
query78	10173	10145	9420	9420
query79	2343	806	593	593
query80	665	532	443	443
query81	505	267	239	239
query82	218	120	92	92
query83	176	166	155	155
query84	286	92	76	76
query85	752	338	302	302
query86	348	282	302	282
query87	4406	4525	4323	4323
query88	3467	2223	2202	2202
query89	399	319	297	297
query90	1820	194	194	194
query91	137	142	108	108
query92	74	57	63	57
query93	2244	1021	586	586
query94	660	421	298	298
query95	343	271	260	260
query96	524	565	273	273
query97	2782	2819	2746	2746
query98	232	197	200	197
query99	1356	1405	1223	1223
Total cold run time: 292356 ms
Total hot run time: 189967 ms

@doris-robot

ClickBench: Total hot run time: 30.03 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2dee6c4ede2529604d933dd2fd259a1f48616782, data reload: false

query1	0.03	0.03	0.03
query2	0.08	0.04	0.03
query3	0.25	0.07	0.07
query4	1.61	0.11	0.10
query5	0.42	0.42	0.41
query6	1.18	0.66	0.67
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.60	0.50	0.51
query10	0.59	0.58	0.57
query11	0.15	0.11	0.10
query12	0.14	0.10	0.11
query13	0.62	0.60	0.60
query14	2.68	2.71	2.71
query15	0.92	0.85	0.84
query16	0.38	0.37	0.37
query17	1.04	1.06	1.05
query18	0.21	0.20	0.20
query19	1.88	1.80	2.04
query20	0.02	0.01	0.01
query21	15.36	0.85	0.53
query22	0.76	1.19	0.64
query23	14.95	1.37	0.61
query24	9.81	3.30	0.42
query25	0.32	0.10	0.12
query26	0.97	0.19	0.14
query27	0.05	0.04	0.04
query28	6.23	0.78	0.43
query29	12.54	3.99	3.28
query30	0.26	0.09	0.06
query31	2.81	0.58	0.38
query32	3.22	0.55	0.46
query33	3.05	2.98	2.99
query34	15.79	5.10	4.46
query35	4.52	4.58	4.50
query36	0.64	0.49	0.48
query37	0.09	0.06	0.06
query38	0.06	0.04	0.04
query39	0.03	0.03	0.03
query40	0.18	0.15	0.13
query41	0.07	0.02	0.03
query42	0.03	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 104.64 s
Total hot run time: 30.03 s

Contributor

@morningman morningman left a comment

LGTM

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 19, 2025
@github-actions
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit 4d90ef1 into apache:master Feb 19, 2025
28 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 19, 2025
…on EXPORTING. (#47974)

Problem Summary:

Each Export task will be deleted after execution in the finally block of
the TaskHandler#onTransientTaskHandle() method. Therefore, when removing
a task in the TransientTaskManager#cancelMemoryTask method, there should
be a check to see if the task exists. Otherwise, a null pointer
exception may occur, causing the Export Job status update to fail.
yiguolei pushed a commit that referenced this pull request Feb 19, 2025
…stays stuck on EXPORTING. #47974 (#48060)

Cherry-picked from #47974

Co-authored-by: Tiewei Fang <fangtiewei@selectdb.com>
lzyy2024 pushed a commit to lzyy2024/doris that referenced this pull request Feb 21, 2025
…on EXPORTING. (apache#47974)

Problem Summary:

Each Export task will be deleted after execution in the finally block of
the TaskHandler#onTransientTaskHandle() method. Therefore, when removing
a task in the TransientTaskManager#cancelMemoryTask method, there should
be a check to see if the task exists. Otherwise, a null pointer
exception may occur, causing the Export Job status update to fail.
dataroaring pushed a commit that referenced this pull request Feb 24, 2025
…stays stuck on EXPORTING. #47974 (#48059)

Cherry-picked from #47974

Co-authored-by: Tiewei Fang <fangtiewei@selectdb.com>
dataroaring pushed a commit that referenced this pull request May 26, 2025
### What problem does this PR solve?

Related PR: #47974

Problem Summary:

When a task is cancelled, it should be removed from `taskExecutorMap`.
Otherwise, if `tryPublishTask()` fails during `addMemoryTask`, the task
will be leaked in `taskExecutorMap`.
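
A rough sketch of cancel-with-removal, assuming `taskExecutorMap` is a `ConcurrentHashMap`. `RemoveOnCancelSketch`, `Cancellable`, and `cancelAndRemove` are hypothetical names, and this is not necessarily how the follow-up commit implements it.

```java
import java.util.concurrent.ConcurrentHashMap;

class RemoveOnCancelSketch {
    // Hypothetical minimal task interface.
    interface Cancellable {
        void cancel();
    }

    // Assumption: the real map is concurrent and keyed by task id.
    static final ConcurrentHashMap<Long, Cancellable> taskExecutorMap = new ConcurrentHashMap<>();

    // remove() returns the previous value or null, so lookup and removal
    // happen in one atomic step and the entry cannot be leaked.
    static boolean cancelAndRemove(Long taskId) {
        Cancellable task = taskExecutorMap.remove(taskId);
        if (task == null) {
            return false; // never registered, or already removed elsewhere
        }
        task.cancel();
        return true;
    }

    public static void main(String[] args) {
        taskExecutorMap.put(7L, () -> System.out.println("task 7 cancelled"));
        System.out.println(cancelAndRemove(7L));
        System.out.println(taskExecutorMap.containsKey(7L));
    }
}
```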
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…on EXPORTING. (apache#47974)

Problem Summary:

Each Export task will be deleted after execution in the finally block of
the TaskHandler#onTransientTaskHandle() method. Therefore, when removing
a task in the TransientTaskManager#cancelMemoryTask method, there should
be a check to see if the task exists. Otherwise, a null pointer
exception may occur, causing the Export Job status update to fail.
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
### What problem does this PR solve?

Related PR: apache#47974

Problem Summary:

When a task is cancelled, it should be removed from `taskExecutorMap`.
Otherwise, if `tryPublishTask()` fails during `addMemoryTask`, the task
will be leaked in `taskExecutorMap`.

Labels

approved Indicates a PR has been approved by one committer. dev/2.1.9-merged dev/3.0.5-merged reviewed

8 participants