This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

[v0.14.x] Referencing the drainTimeout value in the NodeDrainer daemonset #1733

Conversation

HarryStericker
Contributor

In a previous PR (#1722) I attempted to stop the node drainer from being scheduled onto cordoned nodes once the drain timeout (300s/5m) had elapsed. This matters to me because otherwise nodeDrainer tries to redeploy itself onto the node after the 5 minutes have elapsed and goes into a restart loop, as kubelet is preventing things from being scheduled. When this happens, our alerts go off.

This proved more difficult than planned, and I cannot think of any way to do it other than placing a taint on all nodes that the daemonset does not tolerate. IMO this is not feasible and way too heavyweight.

We already have drainTimeout specified in cluster.yml, so it makes sense to use this value in the drain command as well. This way I can increase the timeout, allowing the node to drain and terminate within the allotted time.
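
To illustrate the idea, here is a minimal sketch of a drain command whose timeout is derived from a configured value instead of a hard-coded 300s. It is not the diff from this PR: the function name `build_drain_command` is hypothetical, the assumption that drainTimeout is expressed in minutes is inferred only from the "300s/5m" figure above, and the flags other than `--timeout` are included for illustration.

```python
import shlex
from typing import List


def build_drain_command(node_name: str, drain_timeout_minutes: int) -> List[str]:
    """Build a `kubectl drain` argv whose timeout follows the configured value.

    Assumption: drain_timeout_minutes mirrors the drainTimeout setting and is
    given in minutes; the real unit in cluster.yml may differ.
    """
    if drain_timeout_minutes <= 0:
        raise ValueError("drain timeout must be a positive number of minutes")

    timeout_seconds = drain_timeout_minutes * 60
    return [
        "kubectl",
        "drain",
        node_name,
        "--ignore-daemonsets",  # illustrative flag, not taken from the PR
        "--force",              # illustrative flag, not taken from the PR
        f"--timeout={timeout_seconds}s",
    ]


if __name__ == "__main__":
    # Hypothetical node name, used only to show the rendered command.
    print(shlex.join(build_drain_command("example-node", 5)))
```

With a single source for the timeout, raising drainTimeout lengthens the drain window as well, so the drain is less likely to give up before the node has finished terminating.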

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign davidmccormick
You can assign the PR to them by writing /assign @davidmccormick in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) and size/XS (denotes a PR that changes 0-9 lines, ignoring generated files) labels Sep 11, 2019
@codecov-io

Codecov Report

Merging #1733 into v0.14.x will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff            @@
##           v0.14.x    #1733   +/-   ##
========================================
  Coverage    25.28%   25.28%           
========================================
  Files           98       98           
  Lines         5087     5087           
========================================
  Hits          1286     1286           
  Misses        3659     3659           
  Partials       142      142

Powered by Codecov. Last update 3c850d9...ac566f0. Read the comment docs.

@dominicgunn dominicgunn added this to the v0.14.2 milestone Sep 12, 2019
@dominicgunn
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm (indicates that a PR is ready to be merged) label Sep 12, 2019
@HarryStericker HarryStericker changed the title from "[v0.14.x] Utilizing the parameterized drain timeout value" to "[v0.14.x] Referencing the drainTimeout value in the NodeDrainer daemonset" Sep 12, 2019
@davidmccormick davidmccormick merged commit 7ae7d36 into kubernetes-retired:v0.14.x Sep 18, 2019
@davidmccormick
Contributor

Many thanks for updating both the 0.13.x and 0.14.x branches!
