[fix](pipelien) should not finalize probe when wake up early in SetProbeSinkOperatorX (#46706) #46831

Merged
merged 1 commit into from
Jan 13, 2025

Conversation

mrhhsg
Member

@mrhhsg mrhhsg commented Jan 12, 2025

*** Query id: 80819fcc223e4a45-b46155de6e0c4eee ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1736352810 (unix time) try "date -d @1736352810" if you are using GNU date ***
*** Current BE git commitID: 08683cbaf5 ***
*** SIGSEGV address not mapped to object (@0x38) received by PID 8736 (TID 11549 OR 0x7f8dd0922640) from PID 56; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris_branch-3.0/doris/be/src/common/signal_handler.h:421
 1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 3# 0x00007F92019CA520 in /lib/x86_64-linux-gnu/libc.so.6
 4# auto doris::pipeline::SetProbeSinkOperatorX::_refresh_hash_table(doris::pipeline::SetProbeSinkLocalState&)::{lambda(auto:1&&)#1}::operator(), HashTableNoState>, DefaultHash, HashTableGrower<10ul>, Allocator > >&>(doris::vectorized::MethodSerialized, HashTableNoState>, DefaultHash, HashTableGrower<10ul>, Allocator > >&) const at /root/doris_branch-3.0/doris/be/src/pipeline/exec/set_probe_sink_operator.cpp:213
 5# doris::pipeline::SetProbeSinkOperatorX::_finalize_probe(doris::pipeline::SetProbeSinkLocalState&) at /root/doris_branch-3.0/doris/be/src/pipeline/exec/set_probe_sink_operator.cpp:184
 6# doris::pipeline::SetProbeSinkOperatorX::sink(doris::RuntimeState*, doris::vectorized::Block*, bool) at /root/doris_branch-3.0/doris/be/src/pipeline/exec/set_probe_sink_operator.cpp:98
 7# doris::pipeline::PipelineTask::execute(bool*) at /root/doris_branch-3.0/doris/be/src/pipeline/pipeline_task.cpp:387
 8# doris::pipeline::TaskScheduler::_do_work(unsigned long) at /root/doris_branch-3.0/doris/be/src/pipeline/task_scheduler.cpp:138
 9# doris::ThreadPool::dispatch_thread() in /mnt/ssd01/doris-branch40preview/NEREIDS_ASAN/be/lib/doris_be
10# doris::Thread::supervise_thread(void*) at /root/doris_branch-3.0/doris/be/src/util/thread.cpp:499
11# start_thread at ./nptl/pthread_create.c:442
12# 0x00007F9201AAE850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
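
Reading the trace together with the PR title: `sink` calls `_finalize_probe`, which calls `_refresh_hash_table`, and the fault address `@0x38` is a small offset, which is consistent with accessing a member through a null pointer. The title says the probe should not be finalized when the task is woken up early. Below is a minimal sketch of that kind of guard. Every type and name in it (`HashTable`, `SharedState`, `LocalState`, `wake_up_early`, `SinkResult`, `finalize_probe`) is a hypothetical stand-in chosen for illustration; it is not the Doris code and not the actual patch.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the hash table built by the build side.
struct HashTable {
    std::vector<int> keys;
};

// Hypothetical state shared between the build side and the probe side.
struct SharedState {
    std::unique_ptr<HashTable> hash_table;  // stays null until the build side has run
    bool probe_finalized = false;
};

// Hypothetical per-task state of the probe sink.
struct LocalState {
    SharedState* shared = nullptr;
    bool wake_up_early = false;  // assumed flag: task was woken before its inputs were ready
};

enum class SinkResult { Continue, Finalized, SkippedEarlyWakeUp };

// Touches the hash table, so it is only safe once the build side has produced it.
void finalize_probe(LocalState& local) {
    assert(local.shared != nullptr && local.shared->hash_table != nullptr);
    local.shared->hash_table->keys.shrink_to_fit();
    local.shared->probe_finalized = true;
}

// On end of stream, finalize only if the task was not woken up early.
SinkResult sink(LocalState& local, bool eos) {
    if (!eos) {
        return SinkResult::Continue;
    }
    if (local.wake_up_early) {
        // The hash table may never have been built; skip finalization entirely.
        return SinkResult::SkippedEarlyWakeUp;
    }
    finalize_probe(local);
    return SinkResult::Finalized;
}
```

In this sketch, a task that reaches end of stream after an early wake-up returns without touching the shared hash table, so a null `hash_table` is never dereferenced.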

What problem does this PR solve?

Issue Number: close #xxx

pick #46706
Related PR: #46706

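Problem Summary: `SetProbeSinkOperatorX::sink` called `_finalize_probe` even when the task had been woken up early, and `_refresh_hash_table` then crashed with the SIGSEGV shown in the stack trace above. The probe should not be finalized in that case.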

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…obeSinkOperatorX (apache#46706)

@Thearas
Contributor

Thearas commented Jan 12, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg
Member Author

mrhhsg commented Jan 12, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 40903 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 65e5d9df783e3f56ff518b90dcffdacadde9d2c5, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17582	7428	7404	7404
q2	2072	184	171	171
q3	11227	1042	1192	1042
q4	10579	755	753	753
q5	7756	2879	2801	2801
q6	241	148	146	146
q7	987	608	602	602
q8	9363	1944	1986	1944
q9	6644	6392	6366	6366
q10	7004	2326	2321	2321
q11	465	269	266	266
q12	399	219	217	217
q13	17759	2976	3027	2976
q14	240	208	207	207
q15	585	542	515	515
q16	702	605	601	601
q17	962	622	534	534
q18	7304	6568	6606	6568
q19	1380	1130	1032	1032
q20	471	204	195	195
q21	4021	3269	3253	3253
q22	1104	989	1018	989
Total cold run time: 108847 ms
Total hot run time: 40903 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	7257	7254	7199	7199
q2	327	223	226	223
q3	2931	2995	2971	2971
q4	2088	1884	1780	1780
q5	5781	5676	5718	5676
q6	231	147	146	146
q7	2205	1854	1809	1809
q8	3373	3561	3431	3431
q9	8938	8874	8798	8798
q10	3564	3583	3559	3559
q11	622	489	510	489
q12	826	638	652	638
q13	11080	3152	3199	3152
q14	328	275	275	275
q15	581	523	540	523
q16	712	672	653	653
q17	1839	1639	1599	1599
q18	8165	7727	7735	7727
q19	1680	1512	1577	1512
q20	2098	1884	1844	1844
q21	5626	5432	5542	5432
q22	1107	1036	1073	1036
Total cold run time: 71359 ms
Total hot run time: 60472 ms
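
The totals in these tables can be reproduced from the rows. In Round 1, the first numeric column sums to the reported "Total cold run time" and the last column sums to the reported "Total hot run time". The last column appears to be the smaller of the two middle columns, so the sketch below assumes each row holds one cold run followed by two hot runs. `Row`, `Totals`, `summarize`, and `sample_rows` are hypothetical names, and the sample data is only the first three Round 1 rows, not the full table.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One benchmark row: a cold run followed by two hot runs (assumed column meaning).
struct Row {
    std::string query;
    long cold_ms;
    long hot1_ms;
    long hot2_ms;
};

struct Totals {
    long cold_ms = 0;
    long hot_ms = 0;
};

// Sum the cold runs, and sum the better (smaller) of the two hot runs per query.
Totals summarize(const std::vector<Row>& rows) {
    Totals totals;
    for (const Row& row : rows) {
        assert(row.cold_ms >= 0 && row.hot1_ms >= 0 && row.hot2_ms >= 0);
        totals.cold_ms += row.cold_ms;
        totals.hot_ms += std::min(row.hot1_ms, row.hot2_ms);
    }
    return totals;
}

// First three rows of TPC-H Round 1 from the table above.
std::vector<Row> sample_rows() {
    return {
        {"q1", 17582, 7428, 7404},
        {"q2", 2072, 184, 171},
        {"q3", 11227, 1042, 1192},
    };
}
```

For the three sample rows, `summarize(sample_rows())` gives a cold total of 30881 ms and a hot total of 8617 ms.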

@doris-robot

TPC-DS: Total hot run time: 198806 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 65e5d9df783e3f56ff518b90dcffdacadde9d2c5, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1275	926	925	925
query2	6250	2079	2077	2077
query3	10984	4359	4352	4352
query4	65627	34562	23843	23843
query5	4911	465	461	461
query6	304	181	177	177
query7	4932	322	311	311
query8	301	242	226	226
query9	6283	2699	2689	2689
query10	433	266	255	255
query11	15510	15242	15842	15242
query12	160	106	110	106
query13	1064	451	437	437
query14	10888	7701	7218	7218
query15	213	191	181	181
query16	7060	490	488	488
query17	1093	630	597	597
query18	1844	342	326	326
query19	210	165	179	165
query20	126	121	120	120
query21	207	101	106	101
query22	4899	4563	4285	4285
query23	34926	34409	35064	34409
query24	6148	3003	3027	3003
query25	542	415	446	415
query26	678	172	176	172
query27	1841	361	363	361
query28	4146	2503	2467	2467
query29	727	483	422	422
query30	243	163	170	163
query31	994	805	884	805
query32	67	55	58	55
query33	469	300	284	284
query34	902	514	529	514
query35	841	732	721	721
query36	1090	962	976	962
query37	122	69	75	69
query38	4115	4093	4028	4028
query39	1562	1491	1477	1477
query40	208	99	101	99
query41	51	46	59	46
query42	113	101	100	100
query43	524	495	479	479
query44	1202	840	843	840
query45	189	168	173	168
query46	1191	747	725	725
query47	1992	1889	1885	1885
query48	490	392	379	379
query49	709	393	394	393
query50	871	437	433	433
query51	7315	7432	7154	7154
query52	99	88	87	87
query53	262	184	183	183
query54	569	445	452	445
query55	79	78	76	76
query56	263	254	249	249
query57	1218	1123	1157	1123
query58	214	200	209	200
query59	3142	3240	3098	3098
query60	271	261	250	250
query61	108	111	108	108
query62	757	660	675	660
query63	216	191	193	191
query64	1360	688	679	679
query65	3268	3198	3229	3198
query66	629	303	299	299
query67	15908	15596	15596	15596
query68	3688	595	584	584
query69	440	283	261	261
query70	1198	1093	1096	1093
query71	331	262	250	250
query72	6429	4100	4212	4100
query73	754	345	345	345
query74	10114	8950	8995	8950
query75	3382	2630	2720	2630
query76	1624	1079	1090	1079
query77	505	269	274	269
query78	10429	9701	9604	9604
query79	1114	610	615	610
query80	917	447	439	439
query81	515	245	237	237
query82	209	117	116	116
query83	168	151	144	144
query84	286	88	89	88
query85	952	299	307	299
query86	375	307	295	295
query87	4439	4298	4271	4271
query88	4249	2382	2366	2366
query89	411	296	288	288
query90	2148	186	193	186
query91	186	153	149	149
query92	65	49	52	49
query93	2271	556	554	554
query94	773	298	303	298
query95	351	258	258	258
query96	614	284	274	274
query97	3396	3163	3224	3163
query98	219	222	201	201
query99	1521	1309	1304	1304
Total cold run time: 311278 ms
Total hot run time: 198806 ms

@doris-robot

ClickBench: Total hot run time: 33.49 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 65e5d9df783e3f56ff518b90dcffdacadde9d2c5, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.03
query2	0.07	0.03	0.04
query3	0.23	0.06	0.07
query4	1.63	0.10	0.10
query5	0.53	0.50	0.51
query6	1.14	0.73	0.72
query7	0.02	0.02	0.02
query8	0.04	0.03	0.04
query9	0.57	0.49	0.51
query10	0.55	0.55	0.55
query11	0.14	0.10	0.10
query12	0.15	0.11	0.10
query13	0.61	0.60	0.60
query14	2.92	2.87	2.95
query15	0.90	0.83	0.85
query16	0.39	0.38	0.38
query17	1.05	1.04	1.03
query18	0.23	0.21	0.22
query19	1.94	1.82	1.93
query20	0.01	0.01	0.01
query21	15.35	0.60	0.60
query22	2.71	2.17	2.41
query23	16.99	1.13	0.80
query24	3.02	1.22	1.28
query25	0.28	0.18	0.05
query26	0.39	0.14	0.13
query27	0.04	0.04	0.06
query28	10.49	1.10	1.07
query29	12.55	3.25	3.23
query30	0.25	0.07	0.06
query31	2.86	0.40	0.39
query32	3.25	0.46	0.47
query33	2.95	3.00	3.03
query34	17.16	4.60	4.49
query35	4.59	4.55	4.50
query36	0.69	0.49	0.51
query37	0.10	0.06	0.06
query38	0.05	0.04	0.04
query39	0.04	0.03	0.02
query40	0.15	0.13	0.12
query41	0.09	0.03	0.02
query42	0.03	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 107.23 s
Total hot run time: 33.49 s

@yiguolei yiguolei merged commit 6bc2b1e into apache:branch-3.0 Jan 13, 2025
21 of 22 checks passed
4 participants