[fix] (mem tracker) Fix some memory leaks, inaccurate statistics, core dump, deadlock bugs #10072
d9df8ce to 590dc6a
be/src/runtime/mem_tracker.cpp
Outdated
@@ -124,14 +124,17 @@ std::shared_ptr<MemTracker> MemTracker::create_tracker_impl(
     std::string reset_label;
     MemTracker* task_parent_tracker = reset_parent->parent_task_mem_tracker();
     if (task_parent_tracker) {
-        reset_label = fmt::format("{}:{}", label, split(task_parent_tracker->label(), ":")[1]);
+        reset_label = fmt::format("{}&{}", label, split(task_parent_tracker->label(), "&")[1]);
Why change to `&`?
Because `:` has many meanings as a separator in the task mem tracker label, while `&` is only used to split out the `queryId`, `loadID`, or `tabletID`.
The latest code replaces `&` with `#`, which seems a little nicer :)
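To illustrate the point about a dedicated separator, here is a minimal sketch, not the Doris code itself. `split_label` and `task_id_from_label` are hypothetical helpers, and the label `RuntimeState:instance#q-42` is made up for the example. It shows that splitting on a character that also appears elsewhere in the label gives a different second field than splitting on a reserved one.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `s` on every occurrence of `sep`.
std::vector<std::string> split_label(const std::string& s, char sep) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        const auto pos = s.find(sep, start);
        if (pos == std::string::npos) {
            parts.push_back(s.substr(start));
            return parts;
        }
        parts.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
}

// Hypothetical helper: return the field after the first separator,
// or an empty string when the separator does not occur.
std::string task_id_from_label(const std::string& label, char sep) {
    const auto parts = split_label(label, sep);
    return parts.size() > 1 ? parts[1] : std::string();
}
```

With the made-up label above, `task_id_from_label(label, '#')` yields the id, while `task_id_from_label(label, ':')` returns an unrelated fragment because `:` is already used inside the label.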
        }
    }
    for (auto tid : expired_tasks) {
        // This means that after all RuntimeState is destructed,
        // there are still task mem trackers that are get or register.
        // The only known case: after an load task ends all fragments on a BE,`tablet_writer_open` is still
        // called to create a channel, and the load task tracker will be re-registered in the channel open.
        // https://github.com/apache/incubator-doris/issues/9905
        if (_task_mem_trackers[tid].use_count() == 1) {
Could `_task_mem_trackers[tid]` be nullptr here? You added an `if (!it->second)` case earlier.
Yes, it's a bug. Fixed, thanks~
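For readers following the nullptr concern, here is a minimal sketch of one way to guard such a cleanup loop. It is not the fix that landed in the PR. `Tracker`, `TrackerMap`, and `remove_expired` are hypothetical stand-ins, and `std::unordered_map` is used in place of the phmap container. Note that an empty `std::shared_ptr` reports `use_count() == 0`, so a `== 1` test alone would skip it.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for MemTracker.
struct Tracker {
    std::string label;
};

using TrackerMap = std::unordered_map<std::string, std::shared_ptr<Tracker>>;

// Erase entries that are either null or referenced only by the map itself.
// Returns the ids that were removed.
std::vector<std::string> remove_expired(TrackerMap& trackers) {
    std::vector<std::string> expired;
    for (const auto& [tid, tracker] : trackers) {
        // Check for null explicitly: an empty shared_ptr has use_count() == 0.
        if (!tracker || tracker.use_count() == 1) {
            expired.push_back(tid);
        }
    }
    // Erase after the scan so the iteration above is not invalidated.
    for (const auto& tid : expired) {
        trackers.erase(tid);
    }
    return expired;
}
```

Collecting the ids first and erasing afterwards mirrors the two-pass shape of the quoted hunk, and iterating by reference avoids the extra default-constructed entries that `operator[]` can insert.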
This also needs a rebase to make p0 happy.
590dc6a to 6b3fede
Done.
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
Proposed changes
Issue Number: close #10006 #10071
Problem Summary:
1. Fix a memory leak: when a load task is canceled, the IndexChannel and NodeChannel mem trackers cannot be destructed in time.
2. Fix load tasks being frequently canceled due to OOM and the inaccurate LoadChannel mem tracker limit, and rename the mem limit variable in LoadChannel.
3. Fix a core dump: when a task mem tracker is logged out, the phmap erase fails, so the same tracker is logged out repeatedly.
4. Fix a deadlock: when add_child_tracker exceeds the mem limit, calling log_usage deadlocks on _child_trackers_lock.
5. Fix frequent log printing when the thread mem tracker limit is exceeded, which hurts readability and performance.
6. Optimize some details of the mem tracker display.
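The deadlock described in the problem summary follows a common pattern: a function holds a non-recursive lock and then calls a logging routine that takes the same lock. The sketch below shows one way to avoid it, assuming a plain `std::mutex`. `ParentTracker`, its members, and the byte-based limit are hypothetical and simplified, and this is not necessarily how the PR resolves the issue.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified parent tracker with a byte limit.
class ParentTracker {
public:
    explicit ParentTracker(long limit) : _limit(limit) {}

    // Registers a child. On limit overflow, returns false and, if `report`
    // is non-null, fills it with the current usage.
    bool add_child(const std::string& label, long bytes, std::string* report = nullptr) {
        {
            std::lock_guard<std::mutex> guard(_child_lock);
            if (_used + bytes <= _limit) {
                _children.emplace_back(label, bytes);
                _used += bytes;
                return true;
            }
        }  // The lock is released here, before log_usage() takes it again.
        if (report != nullptr) {
            *report = log_usage();
        }
        return false;
    }

    // Takes the child lock itself, so it must not be called while holding it.
    std::string log_usage() {
        std::lock_guard<std::mutex> guard(_child_lock);
        std::string out;
        for (const auto& child : _children) {
            out += child.first + "=" + std::to_string(child.second) + ";";
        }
        return out;
    }

private:
    std::mutex _child_lock;
    std::vector<std::pair<std::string, long>> _children;
    long _limit;
    long _used = 0;
};
```

Calling `log_usage()` inside the guarded scope would try to lock `_child_lock` twice on the same thread, which is undefined behavior for `std::mutex` and typically hangs. Narrowing the locked scope keeps the mutex non-recursive.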