-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix](group commit) make group commit cancel in time #36249
Conversation
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website. |
run buildall |
clang-tidy review says "All clean, LGTM! 👍" |
TeamCity be ut coverage result: |
TPC-H: Total hot run time: 40354 ms
|
TPC-DS: Total hot run time: 173785 ms
|
ClickBench: Total hot run time: 30.38 s
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR approved by at least one committer and no changes requested. |
PR approved by anyone and no changes requested. |
## Proposed changes If group commit time interval is larger than the load timeout, and there is no new client load to reuse the internal group commit load, the group commit can not cancel in time because it stuck in wait: ``` #0 0x00007f33937a47aa in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00005651105dbd05 in __gthread_cond_timedwait(pthread_cond_t*, pthread_mutex_t*, timespec const*) () #2 0x000056511063f385 in std::__condvar::wait_until(std::mutex&, timespec&) () #3 0x000056511063dc2e in std::cv_status std::condition_variable::__wait_until_impl<std::chrono::duration<long, std::ratio<1l, 1000000000l> > >(std::unique_lock<std::mutex>&, std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > > const&) () #4 0x000056511063cedf in std::cv_status std::condition_variable::wait_until<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >(std::unique_lock<std::mutex>&, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > > const&) () #5 0x0000565110824f48 in std::cv_status std::condition_variable::wait_for<long, std::ratio<1l, 1000l> >(std::unique_lock<std::mutex>&, std::chrono::duration<long, std::ratio<1l, 1000l> > const&) () #6 0x0000565113b5612a in doris::LoadBlockQueue::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*, bool*) () #7 0x000056513f900941 in doris::pipeline::GroupCommitOperatorX::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) () #8 0x000056513c69c0b6 in doris::pipeline::ScanOperatorX<doris::pipeline::GroupCommitLocalState>::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) () #9 0x000056514009d5f1 in doris::pipeline::PipelineTask::execute(bool*) () #10 0x00005651400fb24a in doris::pipeline::TaskScheduler::_do_work(unsigned long) () ```
## Proposed changes If group commit time interval is larger than the load timeout, and there is no new client load to reuse the internal group commit load, the group commit can not cancel in time because it stuck in wait: ``` #0 0x00007f33937a47aa in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x00005651105dbd05 in __gthread_cond_timedwait(pthread_cond_t*, pthread_mutex_t*, timespec const*) () #2 0x000056511063f385 in std::__condvar::wait_until(std::mutex&, timespec&) () #3 0x000056511063dc2e in std::cv_status std::condition_variable::__wait_until_impl<std::chrono::duration<long, std::ratio<1l, 1000000000l> > >(std::unique_lock<std::mutex>&, std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > > const&) () #4 0x000056511063cedf in std::cv_status std::condition_variable::wait_until<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >(std::unique_lock<std::mutex>&, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > > const&) () #5 0x0000565110824f48 in std::cv_status std::condition_variable::wait_for<long, std::ratio<1l, 1000l> >(std::unique_lock<std::mutex>&, std::chrono::duration<long, std::ratio<1l, 1000l> > const&) () #6 0x0000565113b5612a in doris::LoadBlockQueue::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*, bool*) () #7 0x000056513f900941 in doris::pipeline::GroupCommitOperatorX::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) () #8 0x000056513c69c0b6 in doris::pipeline::ScanOperatorX<doris::pipeline::GroupCommitLocalState>::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) () #9 0x000056514009d5f1 in doris::pipeline::PipelineTask::execute(bool*) () #10 0x00005651400fb24a in doris::pipeline::TaskScheduler::_do_work(unsigned long) () ```
Proposed changes
If group commit time interval is larger than the load timeout, and there is no new client load to reuse the internal group commit load, the group commit can not cancel in time because it stuck in wait: