Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move "OOM Kill disable" warning to the daemon #38382

Merged
merged 3 commits into from
Dec 20, 2018

Conversation

thaJeztah
Copy link
Member

Relates to docker/cli#1571

Disabling the oom-killer for a container without setting a memory limit
is dangerous, as it can result in the container consuming unlimited memory,
without the kernel being able to kill it. A check for this situation is curently
done in the CLI, but other consumers of the API won't receive this warning.

This patch adds a check for this situation to the daemon, so that all consumers
of the API will receive this warning.

This patch will have one side-effect; docker cli's that also perform this check
client-side will print the warning twice; this can be addressed by disabling
the cli-side check for newer API versions, but will generate a bit of extra
noise when using an older CLI.

With this patch applied (and a cli that does not take the new warning into account);

docker create --oom-kill-disable busybox
WARNING: OOM killer is disabled for the container, but no memory limit is set, this can result in the system running out of resources.
669933b9b237fa27da699483b5cf15355a9027050825146587a0e5be0d848adf

docker run --rm --oom-kill-disable busybox
WARNING: Disabling the OOM killer on containers without setting a '-m/--memory' limit may be dangerous.
WARNING: OOM killer is disabled for the container, but no memory limit is set, this can result in the system running out of resources.

- How to verify it

make BIND_DIR=. TESTDIRS='github.com/docker/docker/daemon/' TESTFLAGS='-test.run ^TestVerifyContainerResources$' test-unit

- Description for the changelog

+ Return warnings when creating a container with OOM-kill-disabled, but no memory-constraints set


if resources.OomKillDisable != nil && *resources.OomKillDisable && resources.Memory == 0 {
warnings = append(warnings, "OOM killer is disabled for the container, but no memory limit is set, this can result in the system running out of resources.")
logrus.Warn("OOM killer is disabled for the container, but no memory limit is set, this can result in the system running out of resources.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I see that the code prints every warning it adds. Maybe it makes sense to drop all the logrus.Warn lines and add something like this instead?

--- a/daemon/daemon_unix.go
+++ b/daemon/daemon_unix.go
@@ -358,6 +358,14 @@ func verifyContainerResources(resources *containertypes.Resources, sysInfo *sysi
        warnings := []string{}
        fixMemorySwappiness(resources)
 
+       defer func() {
+               if len(warnings) > 0 {
+                       for _, w := range warnings {
+                               logrus.Warn(w)
+                       }
+               }
+       }
+

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, good one; perhaps instead of a defer, we can do this in the calling function; in case of an error, I think it's ok as well to skip logging the warnings, and only log the error (although we could do both of course), i.e.;

warnings, err := verifyContainerResources(....)
if err != nil {
    return err
}
for _, w := range warnings {
    logrus.Warn(w)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can do this in the calling function

Yes better, put it to verifyContainerSettings() in daemon/container.go as it calls a few functions that prints some warnings. Someplace here:

if hostConfig.NetworkMode.IsHost() && len(hostConfig.PortBindings) > 0 {

That way we can remove a lot of logrus.Warn() calls in both daemon_unix.go and daemon_windows.go.

Also maybe add container id field:

logrus.WithField("container", container.ID).Warn(w)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, missed your last comment; didn't make this change yet:

logrus.WithField("container", container.ID).Warn(w)

Disabling the oom-killer for a container without setting a memory limit
is dangerous, as it can result in the container consuming unlimited memory,
without the kernel being able to kill it. A check for this situation is curently
done in the CLI, but other consumers of the API won't receive this warning.

This patch adds a check for this situation to the daemon, so that all consumers
of the API will receive this warning.

This patch will have one side-effect; docker cli's that also perform this check
client-side will print the warning twice; this can be addressed by disabling
the cli-side check for newer API versions, but will generate a bit of extra
noise when using an older CLI.

With this patch applied (and a cli that does not take the new warning into account);

```
docker create --oom-kill-disable busybox
WARNING: OOM killer is disabled for the container, but no memory limit is set, this can result in the system running out of resources.
669933b9b237fa27da699483b5cf15355a9027050825146587a0e5be0d848adf

docker run --rm --oom-kill-disable busybox
WARNING: Disabling the OOM killer on containers without setting a '-m/--memory' limit may be dangerous.
WARNING: OOM killer is disabled for the container, but no memory limit is set, this can result in the system running out of resources.
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Windows does not have host-mode networking, so on Windows, this
check was a no-op

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah thaJeztah force-pushed the move_create_warnings_to_daemon branch from 7906414 to e884b93 Compare December 19, 2018 00:30
@codecov
Copy link

codecov bot commented Dec 19, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@419972a). Click here to learn what that means.
The diff coverage is 80%.

@@            Coverage Diff            @@
##             master   #38382   +/-   ##
=========================================
  Coverage          ?   36.55%           
=========================================
  Files             ?      608           
  Lines             ?    44958           
  Branches          ?        0           
=========================================
  Hits              ?    16433           
  Misses            ?    26247           
  Partials          ?     2278

@thaJeztah
Copy link
Member Author

@kolyshkin updated, PTAL. Also did a follow-up to do some additional refactoring/cleaning up; #38393

sysInfo := sysinfo.New(true)

w, err := verifyContainerResources(&hostConfig.Resources, sysInfo, update)

// no matter err is nil or not, w could have data in itself.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK this comment should not be removed. What it says is "even if err != nil, we still want those warnings". Without such comment, just by looking at the code I want to move the append line to after the err check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, I thought it would be apparent from the function as a whole, but you're right; better to add it back

@kolyshkin
Copy link
Contributor

LGTM except for the removed comment. Ideally it would be awesome to change the patch order, switching patches 1 and 3 (cleanups first, new functionality next) as otherwise you are removing the line you just added. Not super important though...

@thaJeztah
Copy link
Member Author

Ideally it would be awesome to change the patch order, switching patches 1 and 3 (cleanups first, new functionality next)

I'm always on the fence on that one, as there's benefits to both; refactor first makes the implementation slightly cleaner, however, I've been in situations where a change had to be backported and could not, because there was a big refactor before it, so I guess that's why I did it the other way round.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@kolyshkin
Copy link
Contributor

a change had to be backported and could not, because there was a big refactor before it, so I guess that's why I did it the other way round.

Makes total sense to me, thanks for the explanation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants