This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Kube node drainer service fails to start. #40

Closed
camilb opened this issue Nov 7, 2016 · 16 comments

camilb (Contributor) commented Nov 7, 2016

I think the line Restart=on-failure from kube-node-drainer.service should be removed.

Getting these errors:

    Failed to restart kubelet.service: Unit kube-node-drainer.service is not loaded properly: Invalid argument.
    kube-node-drainer.service: Service has Restart= setting other than no, which isn't allowed for Type=oneshot services. Refusing.
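
For reference, a minimal sketch of a unit shape that systemd accepts (illustrative only, not the actual fix in #41): for Type=oneshot services systemd permits only Restart=no, which is the default, so the Restart=on-failure line simply has to be dropped.

    [Service]
    Type=oneshot
    RemainAfterExit=true
    # Restart=on-failure   # rejected: Type=oneshot units only allow Restart=no (the default)
    ExecStart=/bin/true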

camilb (Contributor, Author) commented Nov 7, 2016

#41

pieterlange (Contributor) commented:

Easy mistake to slip in.

I think I've spotted another one:

        ExecStop=/bin/sh -c '/usr/bin/docker run --rm -v /etc/kubernetes:/etc/kubernetes {{.HyperkubeImageRepo}}:{{.K8sVer}} \
          /hyperkube kubectl \
          --server=https://{{.ExternalDNSName}}:443 \
          --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
          drain $$(hostname) \
          --ignore-daemonsets \
          --force'

Unless I'm missing something here, the double $$ seems to be an error. Is this some escaping trick? I think we should just be able to run $(hostname), right?
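
Side note, purely illustrative and not taken from the repo: in systemd unit files "$$" is substituted with a single literal "$" before the command line is executed, so $$(hostname) is how the template passes $(hostname) through to /bin/sh intact. A minimal example:

    [Service]
    Type=oneshot
    # systemd turns "$$" into "$", so /bin/sh receives: echo "draining $(hostname)"
    ExecStart=/bin/sh -c 'echo "draining $$(hostname)"'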

camilb (Contributor, Author) commented Nov 7, 2016

@pieterlange I see that it's working with double $$ as well as with a single $.

camilb (Contributor, Author) commented Nov 7, 2016

Related to this, but it looks like it affects other services too: using ExecStartPre=/usr/bin/systemctl is-active kubelet.service will check whether the service is active and then exit (code=exited, status=3). Since this is a oneshot service, it will not be restarted. I also observed this on install-calico-system.service and install-kube-system.service. Using ExecStartPre=/usr/bin/systemctl is-active service.name generates a lot of errors before kubelet.service becomes active.

In this case I think it is better to use something like:

    After=multi-user.target

    [Install]
    WantedBy=node-drain.target

This way we make sure that all the services are running before we start this one.

Here is a proposal:

     [Unit]
     Description=drain this k8s node to make running pods time to gracefully shut down before stopping kubelet
     After=multi-user.target
     Wants=decrypt-tls-assets.service kubelet.service docker.service

     [Service]
     Type=oneshot
     RemainAfterExit=true
     ExecStart=/bin/sh -c '/usr/bin/docker run --rm -v /etc/kubernetes:/etc/kubernetes {{.HyperkubeImageRepo}}:{{.K8sVer}} \
       /hyperkube kubectl \
       --server=https://{{.ExternalDNSName}}:443 \
       --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
       uncordon $(hostname)'
     ExecStop=/bin/sh -c '/usr/bin/docker run --rm -v /etc/kubernetes:/etc/kubernetes {{.HyperkubeImageRepo}}:{{.K8sVer}} \
       /hyperkube kubectl \
       --server=https://{{.ExternalDNSName}}:443 \
       --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
       drain $(hostname) \
       --ignore-daemonsets \
       --force'

     [Install]
     WantedBy=node-drain.target

mumoshu (Contributor) commented Nov 8, 2016

@pieterlange I replied to you in #41 (comment) about $$.

mumoshu (Contributor) commented Nov 8, 2016

@camilb Thanks for your feedback and proposal!

I don't intend to just stick with the is-active method, but anyway it seems we may have 3 issues here:

  1. a lot of errors before starting kubelet, flanneld, etc.
  2. the node-drainer isn't coming up at all(!)
  3. un-cordoning nodes

For 1, I believe we can use RestartSec as used in https://github.com/coreos/coreos-baremetal/blob/master/examples/ignition/bootkube-controller.yaml#L66 to alleviate the issue. Should we start with RestartSec=10 anyway?
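
A rough sketch of that idea (the RestartSec=10 and Restart=on-failure lines reflect the suggestion above; the ExecStart path is only a placeholder, not the actual kube-aws unit):

    [Service]
    Restart=on-failure
    # Pause between retries so the is-active check doesn't flood the journal
    # with failures while kubelet.service is still starting up.
    RestartSec=10
    ExecStartPre=/usr/bin/systemctl is-active kubelet.service
    # Placeholder command standing in for the real install script.
    ExecStart=/opt/bin/install-kube-system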

For 2, part of the issue is resolved thanks to your PR #41. To tackle the other part, I've come to believe your proposal, which uses After=multi-user.target to control the order of service startup, is the only way to go!

For 3, though I'm rather looking forward to it, I'm not familiar with its use-case!

mumoshu added this to the v0.9.1-rc.2 milestone Nov 8, 2016
mumoshu (Contributor) commented Nov 8, 2016

pieterlange (Contributor) commented:

Uncordoning will be necessary when the node was restarted due to manual operator intervention instead of a rolling upgrade. Not sure how often we'd see that in practice (who reboots nodes in an ASG?) but it might cause some 'funny' side-effects.

camilb (Contributor, Author) commented Nov 8, 2016

@mumoshu

  1. I will do some testing today with RestartSec and also try to set an ordering for the services, to see which is more effective.
  2. I'm already using After=multi-user.target and it works fine.
  3. We could stick with ExecStart=/bin/true in this case. @pieterlange I actually had to reboot the nodes several times on a staging cluster due to the Docker 10.3 problems under heavy load. I know that you can lose the node in an ASG, but it's faster.

mumoshu (Contributor) commented Nov 8, 2016

@camilb Thanks for your cooperation here.

Just a quick reply to 1.: FYI, we've already been hit by the 51200-byte limit in the master branch.
If you are going to start testing on top of it, I'd like to encourage you to try #45 as the base branch for testing!

camilb (Contributor, Author) commented Nov 8, 2016

@mumoshu Thanks, I will try it. I hit the limit several times and was using kube-aws up --export with S3.

camilb (Contributor, Author) commented Nov 8, 2016

@mumoshu Finished testing.

  1. kube-aws up --s3-uri s3://my_bucket/kube-aws works fine

  2. kube-aws update --s3-uri s3://my_bucket/kube-aws always shows this error:

    Error: Error updating cluster: error updating cloudformation stack: ValidationError: Template format error: unsupported structure.
    status code: 400, request id: 0d04f84f-a607-11e6-b3c5-2548ad1d19cd
    

But it works fine with aws cloudformation update-stack.

  3. Regarding kube-node-drainer.service

docker.service is stopped at the same time as kube-node-drainer.service, so it worked less than 10% of the time. I tried several configurations like WantedBy= or RequiredBy=poweroff.target reboot.target halt.target etc... Then I switched to rkt, and after a few attempts I found a solution that works fine with rolling updates, manual shutdown, reboot, etc. I tested it several times and it didn't fail.

- name: kube-node-drainer.service
  enable: true
  command: start
  runtime: true
  content: |
    [Unit]
    Description=drain this k8s node to make running pods time to gracefully shut down before stopping kubelet
    After=multi-user.target

    [Service]
    Type=oneshot
    RemainAfterExit=true
    ExecStart=/bin/true
    TimeoutStopSec=30s
    ExecStop=/bin/sh -c '/usr/bin/rkt run \
    --volume=kube,kind=host,source=/etc/kubernetes,readOnly=true \
    --mount=volume=kube,target=/etc/kubernetes \
    --net=host \
    quay.io/coreos/hyperkube:v1.4.5_coreos.0 \
      --exec=/kubectl -- \
      --server=https://{{.ExternalDNSName}}:443 \
      --kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml \
      drain $(hostname) \
      --ignore-daemonsets \
      --force'

    [Install]
    WantedBy=multi-user.target

mumoshu (Contributor) commented Nov 9, 2016

@camilb

  • Glad to hear that kube-aws up --s3-uri worked for you!
  • For 2., it turned out that I had made the worst mistake I've ever made in the last commit 😈 It's already fixed and pushed, and I'm going to test it out soon. Thank you for the feedback 🙇
  • For 3., honestly I had never noticed that instability, which is something we'd like to fix ASAP! I'm going to look into your PR 👍

mumoshu (Contributor) commented Nov 9, 2016

FYI, I have just merged #48 which, in combination with #41, fixes what is stated in the title of this issue.

mumoshu (Contributor) commented Nov 9, 2016

Revisiting & thinking about @pieterlange's comment at #40 (comment) and @camilb's comment at #40 (comment).

In addition to what you've mentioned, would the uncordon feature + the node drainer allow us to automatically upgrade the CoreOS version, which implies automatic rebooting, hopefully without downtime?
If that is the case, as there are already several possible use-cases, I thought it would be nice to keep discussing it in a separate issue.

camilb (Contributor, Author) commented Nov 11, 2016

@mumoshu I will close this for now. It's still not perfect, but it works better now. The request to drain the node is properly sent and the pods are started on other nodes. The problem is that the containers are killed quickly on the drained node, and for pods that use bigger images or need a longer time to start/stop, there is not enough time for them to be started on other nodes. I'm looking for a good method to delay stopping some services on shutdown or reboot. I saw some examples from Red Hat and want to test them. I will open another issue with a proposal for improvements soon.
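
One possible direction, sketched here only as an illustration of the systemd mechanics rather than a tested fix: a unit ordered After= the services that must keep running is stopped before them at shutdown (stop order is the reverse of start order), so its ExecStop can hold them up while pods are rescheduled.

    [Unit]
    Description=hold shutdown briefly so drained pods have time to be rescheduled
    After=docker.service kubelet.service

    [Service]
    Type=oneshot
    RemainAfterExit=true
    ExecStart=/bin/true
    TimeoutStopSec=120s
    # This unit stops before docker.service and kubelet.service, so the wait runs
    # while they are still up; replace the plain sleep with a real readiness check.
    ExecStop=/bin/sleep 90

    [Install]
    WantedBy=multi-user.target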

camilb closed this as completed Nov 11, 2016