earlyoom will terminate itself #205
Comments
Hmm, how does this happen? Is earlyoom really the process using the most memory on this machine?
Probably. There's not much else running that isn't on the ignore list. Config: EARLYOOM_ARGS="-r 60 -s 15,5 -m 10,5 -p -n --avoid '(^|/)(init|systemd.*|Xorg|sshd)$'" More of the log:
Could you provide the output with the
I'm looking at the log, and something is going wrong earlier. Killing ipfs should have freed 656 megabytes, but the amount of available memory and swap did not change!
It seems that the ipfs process has become invisible to earlyoom. earlyoom is trying to kill other processes, although it seems that the ipfs process has not yet terminated. Maybe ipfs became a zombie and has not yet freed its memory, but earlyoom considers it already dead.
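Not earlyoom code, just a quick way to test that hypothesis: the third field of /proc/&lt;pid&gt;/stat is the process state, so reading it after sending the signal shows whether the victim is still running, already a zombie ('Z'), or gone.

```c
#include <stdio.h>

/* Return the state character ('R', 'S', 'Z', ...) for pid, or '?' on error. */
char proc_state(int pid)
{
    char path[64];
    char comm[64];
    char state = '?';
    snprintf(path, sizeof(path), "/proc/%d/stat", pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    /* /proc/<pid>/stat looks like: "1234 (ipfs) Z 1 ...". Skip the pid,
     * read the comm between the parentheses, then the state character. */
    if (fscanf(f, "%*d (%63[^)]) %c", comm, &state) != 2)
        state = '?';
    fclose(f);
    return state;
}
```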
See #128 (comment)
I guess we should wait until we see available memory rise
Unlikely that I can reproduce the problem. I've since worked around it by adding more swap and running gc/compaction once. Now the usage doesn't go quite as high any more.
I tried hard to reproduce this behavior, on my PC, on a VM, and also a Raspberry Pi 4. But I always get the
Do you want to give me the source for membomb? I can try to reproduce this on something more like the original environment (an EC2 t3.micro running ami-0fcf6df96bd1453cc with a swap file on a 50 GB gp2 EBS volume).
I reported this because I thought it would probably be trivial to add a check against getpid().
membomb is in
Would be nice if you can reproduce this, and then I'll maybe have to learn how to use AWS :) The getpid check we have is only for a specific case (hidepid), so it did not trigger here. My reasoning for not preventing terminating ourselves is: what if there's a memory leak in earlyoom itself?
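For illustration only (the function names below are assumptions, not the actual earlyoom source): the check being discussed would amount to skipping our own PID while scanning /proc for a victim. As noted above, the trade-off is that a leaking earlyoom could then never select itself.

```c
#include <dirent.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Walk /proc and hand every candidate PID to a callback, skipping ourselves. */
void scan_procs(void (*consider)(pid_t pid))
{
    DIR *d = opendir("/proc");
    if (!d)
        return;
    pid_t self = getpid();
    struct dirent *ent;
    while ((ent = readdir(d)) != NULL) {
        pid_t pid = (pid_t)atoi(ent->d_name); /* non-numeric entries become 0 */
        if (pid <= 0 || pid == self)
            continue;                         /* skip non-PID dirs and ourselves */
        consider(pid);
    }
    closedir(d);
}
```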
Set in
It doesn't seem to be reproducible with either membomb or
Is it possible that this happened because
Hm, yes, sounds plausible.
@rfjakob Do you have any ideas on how to fix it?
One could check whether the memory situation is still dire after picking a process. But still, what earlyoom does is simply not possible to do entirely race-free, right? (With the larger PID space on newer kernels, it's certainly less troubling, though.)
Yes, that's what I was thinking about as well. Because if we look at the stats, kill_largest_process takes about 400x longer than parse_meminfo (which checks how free memory is doing). That 400x figure is roughly the number of processes you have running, so it will be higher with lots of processes, because earlyoom has to look at each of them.
I think the kernel oom killer is race-free because it freezes all processes (or freezes all allocations?)
Split kill_largest_process into find_largest_process + kill_process. #205
The run time of find_largest_process is proportional to the number of processes, and takes 2.5ms on my box with a running Gnome desktop (try "make bench"). This is long enough that the situation may have changed in the meantime, so we double-check if we still need to kill anything. The run time of parse_meminfo is only 6us on my box and independent of the number of processes (try "make bench"). Fixes #205
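A minimal sketch of that flow, assuming simplified struct layouts and signatures (the real earlyoom code differs): run the slow scan to pick a victim, then re-read /proc/meminfo and only send the signal if memory and swap are still below the limits.

```c
#include <signal.h>

/* Simplified stand-ins for the real earlyoom types and helpers. */
typedef struct { double mem_avail_percent; double swap_free_percent; } meminfo_t;
typedef struct { int pid; } procinfo_t;

meminfo_t parse_meminfo(void);                 /* fast: ~6us, reads /proc/meminfo    */
procinfo_t find_largest_process(void);         /* slow: ~2.5ms, scans every /proc/<pid> */
void kill_process(procinfo_t victim, int sig); /* sends the signal and logs the result */

static int low_memory(meminfo_t m, double mem_limit, double swap_limit)
{
    return m.mem_avail_percent <= mem_limit && m.swap_free_percent <= swap_limit;
}

void poll_once(double mem_limit, double swap_limit)
{
    if (!low_memory(parse_meminfo(), mem_limit, swap_limit))
        return;
    procinfo_t victim = find_largest_process();
    /* The scan above may have taken long enough for the situation to change
     * (e.g. the previous victim finally exited), so check meminfo again. */
    if (low_memory(parse_meminfo(), mem_limit, swap_limit))
        kill_process(victim, SIGTERM);
}
```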
Maybe this should be prevented by default? (Apologies if it is already, running earlyoom 1.6-1 from archlinux.)