Description
FragmentMgr creates a new pthread for a plan fragment whenever the size of fragment_map exceeds config::fragment_pool_thread_num.
https://github.com/apache/incubator-doris/blob/0430714ca9b0d910ba9b290276da05adfc752f8f/be/src/runtime/fragment_mgr.cpp#L461
If too many plan fragments arrive, BE has no defense and creates too many threads. The threads in the thread pool may also starve: the fragment map can be large while most of its fragments are executed by standalone pthreads rather than by threads in the thread pool.
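To make the starvation argument concrete, here is a simplified model of the decision described above. It is not the actual code at the linked line; `Dispatch` and `choose_dispatch` are hypothetical names used only for illustration.

```cpp
#include <cstddef>

// Hypothetical model of the dispatch decision, not the real FragmentMgr code.
enum class Dispatch { PoolThread, NewThread };

// The choice depends only on how many fragments are registered in the map,
// not on how many pool workers are actually busy. Once the map is large,
// every new fragment gets its own thread even if the pool is idle.
Dispatch choose_dispatch(std::size_t fragment_map_size,
                         std::size_t fragment_pool_thread_num) {
    return fragment_map_size > fragment_pool_thread_num
                   ? Dispatch::NewThread
                   : Dispatch::PoolThread;
}
```

With a pool size of 8 and 100 registered fragments, this model always picks `NewThread`, regardless of whether any of those 100 fragments is running on a pool worker.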
Solution
See #2915, which I think provides a better thread pool for FragmentMgr. If func_num > max_thread_num + queue, the new plan fragment is aborted instead of getting its own thread.
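A minimal sketch of the rejection behavior I have in mind, assuming a fixed number of workers plus a bounded queue. `BoundedPool` and `try_submit` are hypothetical names; this is not the thread pool from #2915, only an illustration of "reject when saturated".

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded pool: at most max_threads running tasks plus
// max_queue waiting tasks. Anything beyond that is rejected.
class BoundedPool {
public:
    BoundedPool(std::size_t max_threads, std::size_t max_queue)
            : capacity_(max_threads + max_queue) {
        for (std::size_t i = 0; i < max_threads; ++i) {
            workers_.emplace_back([this] { run(); });
        }
    }

    ~BoundedPool() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& t : workers_) t.join();  // workers drain the queue first
    }

    // Returns false instead of spawning a new thread when saturated;
    // the caller is expected to abort the new plan fragment.
    bool try_submit(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mu_);
        if (stopping_ || in_flight_ >= capacity_) return false;
        ++in_flight_;  // counts queued and running tasks
        tasks_.push(std::move(task));
        cv_.notify_one();
        return true;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mu_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
            std::lock_guard<std::mutex> lock(mu_);
            --in_flight_;
        }
    }

    const std::size_t capacity_;
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::size_t in_flight_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

With `BoundedPool pool(1, 1)`, a first long-running task and a second queued task are accepted, and a third `try_submit` returns false until one of them finishes. The thread count stays fixed no matter how many fragments arrive.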
Follow-up
After replacing the thread pool, we need to consider the destructors of FragmentExecState's members.
https://github.com/apache/incubator-doris/blob/0430714ca9b0d910ba9b290276da05adfc752f8f/be/src/runtime/fragment_mgr.cpp#L433
If threadpool.submit() fails, the PlanFragmentExecutor will only have been prepared, and it is then closed with an OK status.
https://github.com/apache/incubator-doris/blob/0430714ca9b0d910ba9b290276da05adfc752f8f/be/src/runtime/plan_fragment_executor.cpp#L548-L553
This may hide some bugs. So far I have found that OlapTableSink can't close with an OK status when open()/send() was never called.
Other bugs may be hidden elsewhere. I will test this in our test and production environments, and if it works fine, I'll submit a pull request.
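One possible way to make a sink safe on the prepared-but-never-opened path is to track its lifecycle and skip the flush in close(). The sketch below uses hypothetical stand-ins (`Status`, `SketchSink`); it is not the Doris Status class or OlapTableSink, and the real fix may look different.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a status type.
struct Status {
    bool ok;
    std::string msg;
    static Status OK() { return {true, ""}; }
    static Status Error(std::string m) { return {false, std::move(m)}; }
};

// Hypothetical sink that tolerates close() without a prior open()/send().
class SketchSink {
public:
    Status prepare() {
        prepared_ = true;
        return Status::OK();
    }

    Status open() {
        if (!prepared_) return Status::Error("open() before prepare()");
        opened_ = true;  // a real sink would acquire channels here
        return Status::OK();
    }

    Status close(const Status& exec_status) {
        if (closed_) return exec_status;  // idempotent
        closed_ = true;
        if (opened_ && exec_status.ok) {
            flushed_ = true;  // only flush what was actually opened
        }
        // Never opened: nothing to flush, so just pass the status through.
        return exec_status;
    }

    bool flushed() const { return flushed_; }

private:
    bool prepared_ = false;
    bool opened_ = false;
    bool closed_ = false;
    bool flushed_ = false;
};
```

In this sketch, `prepare()` followed directly by `close(Status::OK())` returns OK without flushing, which is the path taken when submit fails. Alternatively, the caller could pass a non-OK status on that path so the failure is not hidden.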