monitor process memory consumption and alert for om[is]agent #2419
fixed #2385
if info.rss > 500 * 1024 * 1024:
    # only record large memory consumption to save space in prometheus
    cmd = info.cmd.split()[0]  # remove args
Here we strip the command details, so most commands will be bash or python. Counting them seems meaningless and wasteful. How about excluding such processes?
I removed the args because omiagent also has args, like /opt/omi/bin/omiagent 9 10 --destdir / --providerdir /opt/omi/lib --loglevel WARNING, and I think the args are useless.
Why not filter for omiagent here, instead of in the alert rules?
That would be a special case in the code.
Get process memory consumption from the RSS reported by ps, and report only processes that use more than 500M to save space in Prometheus. Also define an alert rule for the processes omiagent and omsagent. These two processes frequently cause OOM on Azure VMs, which the DRI should take care of.
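The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names ProcessInfo, parse_ps_output, large_processes and the sample data are hypothetical, and the input is assumed to look like the output of ps -e -o pid= -o rss= -o args=, where RSS is given in KiB.

```python
from typing import List, NamedTuple, Tuple

# Threshold taken from the diff: 500 MiB, expressed in bytes.
RSS_THRESHOLD_BYTES = 500 * 1024 * 1024


class ProcessInfo(NamedTuple):
    """Hypothetical record type; the PR's real type is not shown."""
    pid: int
    rss: int  # resident set size in bytes
    cmd: str  # full command line, including arguments


def parse_ps_output(text: str) -> List[ProcessInfo]:
    """Parse lines of the assumed form '<pid> <rss in KiB> <command line>'."""
    infos = []
    for line in text.splitlines():
        parts = line.split(None, 2)  # pid, rss, then the rest as one field
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        pid, rss_kib, cmd = parts
        infos.append(ProcessInfo(int(pid), int(rss_kib) * 1024, cmd))
    return infos


def large_processes(
    infos: List[ProcessInfo], threshold: int = RSS_THRESHOLD_BYTES
) -> List[Tuple[int, str, int]]:
    """Keep only processes above the threshold, with arguments stripped."""
    result = []
    for info in infos:
        if info.rss > threshold:
            cmd = info.cmd.split()[0]  # remove args, as in the diff
            result.append((info.pid, cmd, info.rss))
    return result


# Made-up sample data for illustration only.
SAMPLE = """\
  101 614400 /opt/omi/bin/omiagent 9 10 --destdir / --providerdir /opt/omi/lib --loglevel WARNING
  202 10240 /bin/bash /usr/local/bin/job.sh
  303 716800 python /srv/app.py
"""

if __name__ == "__main__":
    for pid, cmd, rss in large_processes(parse_ps_output(SAMPLE)):
        print(pid, cmd, rss)
```

With this sample, the omiagent entry keeps its full binary path while the third entry is reduced to just python, which is the trade-off raised in the review: stripping args keeps label values short, but script interpreters become indistinguishable from one another.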