Docker: memory_hard_limit conflict with MemorySwap #8153

Closed
njenwei opened this issue Jun 11, 2020 · 7 comments · Fixed by #8159

Comments

@njenwei
Contributor

njenwei commented Jun 11, 2020

Nomad version

Nomad v0.11.3 (8918fc804a0c6758b6e3e9960e4eb2e605e38552)

Operating system and Environment details

RHEL 3.10.0-1127.el7.x86_64
Docker version 19.03.8, build afacb8b

Issue

Unable to create containers when memory_hard_limit > memory.

Error: failed to create container: API error (400): Minimum memoryswap limit should be larger than memory limit, see usage

I am not familiar with Go, but could this be due to an incorrect hostConfig.MemorySwap value being set when the memory_hard_limit option is used, as shown in

hostConfig.MemorySwap = task.Resources.LinuxResources.MemoryLimitBytes // MemorySwap is memory + swap.

I feel like it should be

hostConfig.MemorySwap = driverConfig.MemoryHardLimit

instead of

hostConfig.MemorySwap = task.Resources.LinuxResources.MemoryLimitBytes

Reproduction steps

Run job file

Job file (if appropriate)

job "hello" {
  datacenters = ["dc1"]
  group "echo" {
    count = 1
    task "server" {
      driver = "docker"
      config {
        image = "hashicorp/http-echo:latest"
        args = [
          "-listen", ":8080",
          "-text", "Hello and welcome to ${NOMAD_IP_http} running on port 8080",
        ]
        memory_hard_limit = 4000
      }

      resources {
        memory = 1000
        network {
          port "http" {
            static = 8080
          }
        }
      }
    }
  }
}

This was also raised in the original PR: #8087 (comment)

@shoenig
Member

shoenig commented Jun 11, 2020

Thanks for reporting this, @njenwei !

When I try your reproduction example on my dev machine it works fine. I'm thinking this may be an incompatibility with older kernels - for reference 3.10 came out in 2013.
I'm running with

$ uname -r
5.4.0-33-generic

@njenwei
Contributor Author

njenwei commented Jun 11, 2020

@shoenig Thanks for your quick response!

Unfortunately, I am able to recreate this on macOS as well:

Nomad v0.11.3 (8918fc804a0c6758b6e3e9960e4eb2e605e38552)
Docker version 19.03.8, build afacb8b
ProductName:    Mac OS X
ProductVersion: 10.15.4
BuildVersion:   19E287

See output when running in dev mode:

./nomad agent -dev
==> No configuration files loaded
==> Starting Nomad agent...
==> Nomad agent configuration:

       Advertise Addrs: HTTP: 127.0.0.1:4646; RPC: 127.0.0.1:4647; Serf: 127.0.0.1:4648
            Bind Addrs: HTTP: 127.0.0.1:4646; RPC: 127.0.0.1:4647; Serf: 127.0.0.1:4648
                Client: true
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: true
               Version: 0.11.3

==> Nomad agent started! Log data will stream in below:

    2020-06-11T22:22:47.222+0100 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=
    2020-06-11T22:22:47.222+0100 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=
    2020-06-11T22:22:47.222+0100 [INFO]  agent: detected plugin: name=exec type=driver plugin_version=0.1.0
    2020-06-11T22:22:47.222+0100 [INFO]  agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
    2020-06-11T22:22:47.222+0100 [INFO]  agent: detected plugin: name=java type=driver plugin_version=0.1.0
    2020-06-11T22:22:47.222+0100 [INFO]  agent: detected plugin: name=docker type=driver plugin_version=0.1.0
    2020-06-11T22:22:47.222+0100 [INFO]  agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
    2020-06-11T22:22:47.224+0100 [INFO]  nomad.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:127.0.0.1:4647 Address:127.0.0.1:4647}]"
    2020-06-11T22:22:47.224+0100 [INFO]  nomad.raft: entering follower state: follower="Node at 127.0.0.1:4647 [Follower]" leader=

# truncated
# job ran

    2020-06-11T22:23:12.320+0100 [DEBUG] client.driver_mgr.docker: docker pull succeeded: driver=docker image_ref=hashicorp/http-echo:latest
    2020-06-11T22:23:12.335+0100 [DEBUG] client.driver_mgr.docker: image reference count incremented: driver=docker image_name=hashicorp/http-echo:latest image_id=sha256:a6838e9a6ff6ab3624720a7bd36152dda540ce3987714398003e14780e61478a references=1
    2020-06-11T22:23:12.335+0100 [DEBUG] client.driver_mgr.docker: configured resources: driver=docker task_name=server memory=4194304000 memory_reservation=1048576000 cpu_shares=100 cpu_quota=0 cpu_period=0
    2020-06-11T22:23:12.335+0100 [DEBUG] client.driver_mgr.docker: binding directories: driver=docker task_name=server binds="[]string{"/private/var/folders/z1/qj_4f20j4l3_107q531pfxwc0000gn/T/NomadClient815241885/e1ee8214-b9ca-804d-e4da-b0267308ca1b/alloc:/alloc", "/private/var/folders/z1/qj_4f20j4l3_107q531pfxwc0000gn/T/NomadClient815241885/e1ee8214-b9ca-804d-e4da-b0267308ca1b/server/local:/local", "/private/var/folders/z1/qj_4f20j4l3_107q531pfxwc0000gn/T/NomadClient815241885/e1ee8214-b9ca-804d-e4da-b0267308ca1b/server/secrets:/secrets"}"
    2020-06-11T22:23:12.335+0100 [DEBUG] client.driver_mgr.docker: networking mode not specified; using default: driver=docker task_name=server
    2020-06-11T22:23:12.335+0100 [DEBUG] client.driver_mgr.docker: allocated static port: driver=docker task_name=server ip=127.0.0.1 port=8080
    2020-06-11T22:23:12.335+0100 [DEBUG] client.driver_mgr.docker: exposed port: driver=docker task_name=server port=8080
    2020-06-11T22:23:12.335+0100 [DEBUG] client.driver_mgr.docker: applied labels on the container: driver=docker task_name=server labels=map[com.hashicorp.nomad.alloc_id:e1ee8214-b9ca-804d-e4da-b0267308ca1b]
    2020-06-11T22:23:12.336+0100 [DEBUG] client.driver_mgr.docker: setting container name: driver=docker task_name=server container_name=server-e1ee8214-b9ca-804d-e4da-b0267308ca1b
    2020-06-11T22:23:12.353+0100 [DEBUG] client.driver_mgr.docker: failed to create container: driver=docker container_name=server-e1ee8214-b9ca-804d-e4da-b0267308ca1b image_name=hashicorp/http-echo:latest image_id=hashicorp/http-echo:latest attempt=1 error="API error (400): Minimum memoryswap limit should be larger than memory limit, see usage"
    2020-06-11T22:23:12.353+0100 [ERROR] client.driver_mgr.docker: failed to create container: driver=docker error="API error (400): Minimum memoryswap limit should be larger than memory limit, see usage"
    2020-06-11T22:23:12.377+0100 [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=e1ee8214-b9ca-804d-e4da-b0267308ca1b task=server error="failed to create container: API error (400): Minimum memoryswap limit should be larger than memory limit, see usage"
    2020-06-11T22:23:12.377+0100 [INFO]  client.alloc_runner.task_runner: not restarting task: alloc_id=e1ee8214-b9ca-804d-e4da-b0267308ca1b task=server reason="Error was unrecoverable"

Are you able to check your memory configuration for the docker container that nomad is launching on your PC? Something like docker inspect $CONTAINER_ID | grep Memory
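For comparing these values across machines, the same check can be scripted. The following is a minimal sketch in Go that decodes the JSON array printed by docker inspect and pulls out the memory fields under HostConfig. The type and function names (hostMemory, inspected, parseInspect) are hypothetical, and the sample input is hand-written for illustration rather than captured from a real container.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hostMemory holds the memory-related HostConfig fields shown by `docker inspect`.
// The struct name is made up for this example.
type hostMemory struct {
	Memory            int64 `json:"Memory"`
	MemoryReservation int64 `json:"MemoryReservation"`
	MemorySwap        int64 `json:"MemorySwap"`
}

// inspected keeps only the part of each inspected container that we care about.
type inspected struct {
	HostConfig hostMemory `json:"HostConfig"`
}

// parseInspect decodes the JSON array that `docker inspect` writes to stdout.
func parseInspect(data []byte) ([]inspected, error) {
	var out []inspected
	err := json.Unmarshal(data, &out)
	return out, err
}

func main() {
	// Hand-written sample input; in practice this would be the output of
	// `docker inspect $CONTAINER_ID`.
	sample := []byte(`[{"HostConfig":{"Memory":4194304000,"MemoryReservation":1048576000,"MemorySwap":4194304000}}]`)

	containers, err := parseInspect(sample)
	if err != nil {
		panic(err)
	}
	for _, c := range containers {
		h := c.HostConfig
		// A positive MemorySwap below Memory is the combination that the
		// error message in this issue complains about.
		suspicious := h.MemorySwap > 0 && h.MemorySwap < h.Memory
		fmt.Printf("memory=%d reservation=%d swap=%d swapBelowMemory=%t\n",
			h.Memory, h.MemoryReservation, h.MemorySwap, suspicious)
	}
}
```

Piping real docker inspect output into parseInspect instead of the sample would give the same fields that the grep prints.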

@shoenig
Member

shoenig commented Jun 12, 2020

I think your original observation is correct, @njenwei . After reading through the memory swap docs again with a magnifying glass, I believe we are currently setting the incorrect value for --memory-swap.

The assignment

hostConfig.MemorySwap = task.Resources.LinuxResources.MemoryLimitBytes

does not take into account the case where memory_hard_limit is set, so --memory-swap ends up being set to a value lower than --memory (but still positive). The docs don't cover what happens in that exact case, so I'm thinking this works on newer kernels due to some newfound leniency. Because of Nomad's no-swap policy, the correct value should always be the computed hard limit (i.e. the value ultimately passed to --memory). Given that, the assignment would just become

-               hostConfig.MemorySwap = task.Resources.LinuxResources.MemoryLimitBytes // MemorySwap is memory + swap.
+               // --memory-swap should always be same as --memory (hard limit) so as to disable swap.
+               //
+               // More information in https://docs.docker.com/config/containers/resource_constraints/#--memory-swap-details
+               hostConfig.MemorySwap = memory

Testing this out locally produces identical results before and after the patch:

$ cat before.txt 
            "Memory": 4194304000,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 1048576000,
            "MemorySwap": -1,
            "MemorySwappiness": 0,
$ cat after.txt 
            "Memory": 4194304000,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 1048576000,
            "MemorySwap": -1,
            "MemorySwappiness": 0,

Interesting that the MemorySwap value appears to be discarded (presumably after being used), but that's happening in the docker client.
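To make the intended relationship between the three values concrete, here is a minimal sketch in Go. The helper memoryLimits is hypothetical and is not Nomad's actual driver code; it assumes both inputs are in MiB, that 0 means memory_hard_limit is unset, and that a hard limit only applies when it exceeds the task's memory.

```go
package main

import "fmt"

// memoryLimits is a hypothetical helper, not Nomad's real implementation.
// It takes the task's `memory` and `memory_hard_limit` in MiB and returns
// the byte values for Docker's Memory, MemoryReservation and MemorySwap.
func memoryLimits(memoryMB, hardLimitMB int64) (memory, reservation, swap int64) {
	const mib = 1024 * 1024

	memory = memoryMB * mib
	if hardLimitMB > memoryMB {
		// With a hard limit set, the task's memory becomes the soft
		// limit (reservation) and the hard limit becomes --memory.
		reservation = memory
		memory = hardLimitMB * mib
	}

	// --memory-swap is memory plus swap, so making it equal to --memory
	// leaves no room for swap.
	swap = memory
	return memory, reservation, swap
}

func main() {
	// Values from the job file in this issue: memory = 1000, memory_hard_limit = 4000.
	memory, reservation, swap := memoryLimits(1000, 4000)
	fmt.Println("Memory:           ", memory)
	fmt.Println("MemoryReservation:", reservation)
	fmt.Println("MemorySwap:       ", swap)
}
```

The point of the sketch is that swap is derived from the final memory value rather than from the task's memory, so it can never end up below --memory.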

@njenwei do you mind testing out that patch and confirming whether it solves the issue? If it checks out, would you also like to open a PR? If not I can take care of it ~

shoenig self-assigned this Jun 12, 2020
njenwei added a commit to njenwei/nomad that referenced this issue Jun 12, 2020
Fixes an incorrect value being assigned to MemorySwap when `memory_hard_limit` flag is being used.

Issue raised in hashicorp#8153
@njenwei
Contributor Author

njenwei commented Jun 12, 2020

With the change, I am now getting expected values when inspecting the container on mac:

$ docker container inspect 6a | grep Memory
            "Memory": 4194304000,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 1048576000,
            "MemorySwap": 4194304000,
            "MemorySwappiness": 0,

Unfortunately I won't have the opportunity to test this out on a Linux box until Monday.

I'm not sure why your MemorySwap is being set to -1, it might be worth looking into what the go-dockerclient is outputting on your side.

@njenwei
Contributor Author

njenwei commented Jun 16, 2020

I can also confirm that the change works on my original OS:

RHEL 3.10.0-1127.el7.x86_64
Docker version 19.03.8, build afacb8b
$ sudo docker inspect 0e | grep Memory
            "Memory": 4194304000,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 1048576000,
            "MemorySwap": 4194304000,
            "MemorySwappiness": 0,

@shoenig are you happy to merge the changes?

@shoenig
Member

shoenig commented Jun 16, 2020

Out of curiosity I tested with/without the fix on Ubuntu 20.04, 16.04, CentOS 7 and RHEL 8 using ec2 AMIs.

Ubuntu 20.04

# with fix
ubuntu@ip-172-31-5-58:~$ docker inspect 9a9 | grep Mem
            "Memory": 536870912,
            "CpusetMems": "",
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 268435456,
            "MemorySwap": -1,
            "MemorySwappiness": 0,
            
# without fix
ubuntu@ip-172-31-5-58:~$ docker inspect 41 | grep Memory
            "Memory": 536870912,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 268435456,
            "MemorySwap": -1,
            "MemorySwappiness": 0,

Ubuntu 16.04

# with fix
$ docker inspect 2b | grep Memory
            "Memory": 536870912,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 268435456,
            "MemorySwap": -1,
            "MemorySwappiness": 0,

# without fix
$ docker inspect ee | grep Memory
            "Memory": 536870912,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 268435456,
            "MemorySwap": -1,
            "MemorySwappiness": 0,

CentOS 7

# with fix
$ docker inspect 1d2 | grep Memory
            "Memory": 536870912,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 268435456,
            "MemorySwap": 536870912,
            "MemorySwappiness": 0,
            
# without fix
    2020-06-16T15:06:58.157Z [DEBUG] client.driver_mgr.docker: failed to create container: driver=docker container_name=redis-4b5a0ff1-d796-c29e-aad3-bd5fa68e3e02 image_name=redis:3.2 image_id=redis:3.2 attempt=1 error="API error (400): Minimum memoryswap limit should be larger than memory limit, see usage"
    2020-06-16T15:06:58.157Z [ERROR] client.driver_mgr.docker: failed to create container: driver=docker error="API error (400): Minimum memoryswap limit should be larger than memory limit, see usage"
    2020-06-16T15:06:58.158Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=4b5a0ff1-d796-c29e-aad3-bd5fa68e3e02 task=redis error="failed to create container: API error (400): Minimum memoryswap limit should be larger than memory limit, see usage"

RHEL 8

# with fix
$ sudo docker inspect fe | grep Memory
            "Memory": 536870912,
            "KernelMemory": 0,
            "MemoryReservation": 268435456,
            "MemorySwap": 536870912,
            "MemorySwappiness": 0,

# without fix
    2020-06-16T15:30:44.244Z [DEBUG] client.driver_mgr.docker: failed to create container: driver=docker container_name=redis-fc949e06-19aa-1ffd-8692-382e8e3f8462 image_name=redis:3.2 image_id=redis:3.2 attempt=1 error="API error (400): Minimum memoryswap limit should be larger than memory limit, see usage"
    2020-06-16T15:30:44.244Z [ERROR] client.driver_mgr.docker: failed to create container: driver=docker error="API error (400): Minimum memoryswap limit should be larger than memory limit, see usage"
    2020-06-16T15:30:44.245Z [ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=fc949e06-19aa-1ffd-8692-382e8e3f8462 task=redis error="failed to create container: API error (400): Minimum memoryswap limit should be larger than memory limit, see usage"

@njenwei I do believe this fix makes Nomad's use of the feature correct with respect to Docker's documentation. Looking at the differences above, I realize this may just be the result of a difference in default swap settings between operating systems.

Ubuntu 20.04

$ cat /proc/meminfo | grep Swap
SwapCached:            0 kB
SwapTotal:       2097148 kB
SwapFree:        2097148 kB

RHEL 8

$ cat /proc/meminfo | grep Swap
SwapCached:            0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
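One possible reading of the results above can be written down as a small model in Go. This is a sketch of behavior inferred from the outputs in this thread, not code taken from Docker: the function checkSwap and the swapLimitSupported flag are hypothetical, and the assumption that some hosts drop the requested swap value (reporting -1) before validating it is only a guess that matches the Ubuntu output.

```go
package main

import (
	"errors"
	"fmt"
)

// checkSwap is a hypothetical model of the behavior observed in this thread.
// swapLimitSupported is an assumed flag describing whether the host can
// enforce a swap limit at all.
func checkSwap(memory, memorySwap int64, swapLimitSupported bool) (int64, error) {
	if memory > 0 && memorySwap != -1 && !swapLimitSupported {
		// Assumption: the requested value is discarded, which would
		// explain why some hosts report "MemorySwap": -1.
		memorySwap = -1
	}
	if memory > 0 && memorySwap > 0 && memorySwap < memory {
		// Same message as the API error quoted in this issue.
		return memorySwap, errors.New("Minimum memoryswap limit should be larger than memory limit, see usage")
	}
	return memorySwap, nil
}

func main() {
	const memory = 536870912 // 512 MiB hard limit, as in the test runs above

	cases := []struct {
		name      string
		swap      int64
		supported bool
	}{
		{"without fix, swap limit enforced", 268435456, true},
		{"without fix, swap limit not enforced", 268435456, false},
		{"with fix, swap limit enforced", memory, true},
		{"with fix, swap limit not enforced", memory, false},
	}
	for _, c := range cases {
		got, err := checkSwap(memory, c.swap, c.supported)
		fmt.Printf("%-38s swap=%d err=%v\n", c.name, got, err)
	}
}
```

Under this model, only the combination of the old assignment and an enforced swap limit fails, which is consistent with the CentOS 7 and RHEL 8 runs failing without the fix while the Ubuntu runs succeed either way.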

@github-actions

github-actions bot commented Nov 6, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 6, 2022