Would it be useful to use rate of swapin as a threshold? #34

fdr · 2017-10-19T01:17:25Z

Hello,

In addition to the absolute amount of memory swapped out, would it make sense to use the quantity of swap-in, or the amount of time spent waiting on swap-in (if available), as a trigger for killing?

My use case is servers that have overcommit off, which means they are very conservative about how much memory can be committed, and the extent of concurrency across multiple processes winds up penalized. When one considers that simple Python, Go, or even C programs can allocate hundreds of megabytes of virtual memory that they never use (cat does this for me, in fact), this is a meaningful problem.

Thus, the amount of swap I may wish to allocate may be somewhat large.

However, these servers have soft-real-time performance requirements. It's not desirable to actually let memory be swapped in. It would be better to start killing things if swap-in pressure gets beyond the residual.

Thoughts?


rfjakob · 2017-10-25T19:02:44Z

Interesting, I have never run Linux with overcommit off. Why do you disable it? To prevent random processes from being killed when running out of memory?

fdr · 2017-10-25T20:11:01Z

Yeah, that's right. Postgres will roll back transactions and deliver an error message to the user. But that requires allocations to fail, which on Linux virtually requires overcommit to be off.

rfjakob · 2017-10-25T20:45:38Z

Ok I see. But if you run earlyoom, won't that put you back in the situation where processes are randomly killed?

fdr · 2017-10-25T21:00:17Z

On Wed, Oct 25, 2017 at 1:46 PM rfjakob wrote:

> Ok I see. But if you run earlyoom, won't that put you back in the situation where processes are randomly killed?

The error message will be bad (the protocol will report "terminated by administrator command" or something like that), but it won't crash the entire server. Linux's OOM killer is much more severe than a SIGTERM: it causes every connection to crash and crash recovery to start.

rfjakob · 2017-10-26T14:19:03Z

I just had to check myself: earlyoom also sends SIGKILL, so I'm afraid it would be about the same as the kernel OOM killer ( https://github.com/rfjakob/earlyoom/blob/master/main.c#L208 ). But it could be modified to send SIGTERM without too much trouble.

Swap statistics are available through /proc/vmstat. We'd have to do our own averaging but that should work.

$ cat /proc/vmstat | grep swp
pswpin 15886
pswpout 99679

Have you monitored if the swap rate is a good indicator for things going south on your machines? Like letting

$ vmstat 1

run while imposing heavy load?
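The "do our own averaging" idea above could be sketched roughly as follows. This is only an illustration, not earlyoom code: the helper names (`parse_vmstat`, `swapin_rate`, `Ewma`, `monitor`), the choice of an exponentially weighted moving average, and the `alpha` smoothing factor are all assumptions made for the example. The only thing taken from the thread is that `pswpin` in /proc/vmstat is a cumulative counter of pages swapped in.

```python
import time


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swapin_rate(prev, curr, interval_s):
    """Pages swapped in per second between two counter snapshots."""
    # pswpin may be absent on kernels built without swap support.
    return (curr.get("pswpin", 0) - prev.get("pswpin", 0)) / interval_s


class Ewma:
    """Exponentially weighted moving average (alpha is a hypothetical knob)."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.value = None

    def update(self, sample):
        if self.value is None:
            self.value = sample
        else:
            self.value = self.alpha * sample + (1 - self.alpha) * self.value
        return self.value


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the live counters."""
    with open(path) as f:
        return parse_vmstat(f.read())


def monitor(threshold_pages_per_s, interval_s=1.0, path="/proc/vmstat"):
    """Yield (smoothed_rate, over_threshold) once per interval, forever."""
    avg = Ewma()
    prev = read_vmstat(path)
    while True:
        time.sleep(interval_s)
        curr = read_vmstat(path)
        rate = avg.update(swapin_rate(prev, curr, interval_s))
        yield rate, rate > threshold_pages_per_s
        prev = curr
```

A caller would iterate over `monitor(...)` and act when the second element is true. What threshold is sensible is exactly the open question in this thread, so none is suggested here.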

fdr · 2017-10-26T20:53:43Z

On Thu, Oct 26, 2017 at 7:19 AM rfjakob wrote:

> Just had to check myself, earlyoom also sends SIGKILL, I'm afraid it would be about the same as the kernel oom killer ( https://github.com/rfjakob/earlyoom/blob/master/main.c#L208 ). But could be modified to send SIGTERM without too much trouble.

That's interesting. I'm surprised SIGTERM isn't the default. One of the big advantages earlyoom has is the system can still find memory to allow processes to do clean-up.
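To make the SIGTERM idea concrete, here is a minimal sketch of "ask politely first, then escalate". It is not how earlyoom behaves (per the comment above, it sends SIGKILL directly); the function name `terminate_gracefully` and the `grace_s` and `poll_s` parameters are hypothetical and exist only for this example.

```python
import os
import signal
import time


def terminate_gracefully(pid, grace_s=5.0, poll_s=0.1):
    """Send SIGTERM, wait up to grace_s seconds, then fall back to SIGKILL.

    Returns the last signal that was delivered, or None if the process
    did not exist in the first place.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return None

    deadline = time.monotonic() + grace_s
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 only checks that the pid exists
        except ProcessLookupError:
            return signal.SIGTERM  # exited within the grace period
        time.sleep(poll_s)

    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return signal.SIGTERM  # exited just before escalation
    return signal.SIGKILL
```

One caveat with this approach: a process that has exited but has not yet been reaped by its parent (a zombie) still passes the `os.kill(pid, 0)` check, so the sketch may escalate to SIGKILL unnecessarily in that case. The escalation is harmless, but the return value would then be misleading.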

> Swap statistics are available through /proc/vmstat. We'd have to do our own averaging but that should work.
>
> $ cat /proc/vmstat | grep swp
> pswpin 15886
> pswpout 99679
>
> Have you monitored if the swap rate is a good indicator for things going south on your machines? Like letting
>
> $ vmstat 1
>
> run while imposing heavy load?

Not yet. On EC2 for real-time workloads, I am going to guess many people run with swap disabled (indeed, the new "r4" line, popular for databases, doesn't even have local disk to use for swap). Thus, I haven't been gathering my own numbers for pathological swapping.

rfjakob · 2018-07-07T22:01:27Z

I'll close this for now.

rfjakob added the question label Jan 28, 2018

rfjakob closed this as completed Jul 7, 2018

rfjakob mentioned this issue Jul 7, 2018

What about SIGTERM? #67

Closed

yangfl mentioned this issue Nov 10, 2018

Add PSI support (threshold for swapin/swapout) #100

Closed

hakavlad mentioned this issue Feb 1, 2019

Add basic desktop linux configuration facebookincubator/oomd#40

Closed
