Skip to content

Conversation

@stevehipwell
Copy link
Contributor

@stevehipwell stevehipwell commented Nov 30, 2021

Description of changes:

This PR updates the Helm chart to use the Helm 3 v2 API and to follow the latest idiomatic pattern; this should lower the cognitive complexity of the chart and make it easier to keep maintained moving forwards.

A summary of the important changes are as follows:

  • Update API version to v2 and only support Helm 3
  • Limit chart to K8s v1.16 or later
  • Align chart to default chart template
  • Align Daemonset and Deployment templates
  • Remove support for deprecated beta node labels
  • Improve default security context
  • Remove un-needed configuration copied between Deployment/Daemonset
  • Require probes for deployment
  • Extract Daemonset specific configuration into new values (with fall back logic)
  • Use correct jobLabel logic in ServiceMonitor/PodMonitor
  • Update chart template lint tests to use idiomatic queue and IMDS installations

Fixes #518.

Breaking Changes:

The following changes are technically breaking but should have no impact on the use of NTH in Queue mode and require only minor changes for NTH in IMDS mode.

All Modes:

  • Only support Helm 3
  • Only support Kubernetes v1.16 and up
  • Remove use of beta K8s node labels
  • Namespace isn't set in templates
  • Removed gracePeriod

Queue Mode:

  • Can't set hostNetwork or dnsPolicy
  • Probes server is always enabled

IMDS Mode:

  • Ignore priorityClassName and use daemonsetPriorityClassName
  • Only one node selector, affinity and toleration key can be set (see README)
  • Removed linuxUpdateStrategy & windowsUpdateStrategy

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@stevehipwell
Copy link
Contributor Author

FYI @bwagner5

@stevehipwell
Copy link
Contributor Author

@bwagner5 I'm getting an interesting failure on the fast tests that doesn't look anything to do with this PR?

@stevehipwell stevehipwell force-pushed the helm-chart-refactor branch 4 times, most recently from 62542e1 to 7a50f1e Compare December 15, 2021 16:29
@bwagner5
Copy link
Contributor

Looks like a kubectl error having to do with a selector:

error: You must provide one or more resources by argument or filename.
Example resource specifications include:
   '-f rsrc.yaml'
   '--filename=rsrc.json'
   '<resource> <name>'
   '<resource>'

@stevehipwell
Copy link
Contributor Author

@bwagner5 I wonder if it's related to my change of NOTES.txt? I've been a bit busy but will try and update the README and then test the E2E scripts locally.

It'd be good to get your thoughts on the changes if there is anything you want clarifying or changing before I spend the effort on the README.

@bwagner5
Copy link
Contributor

Just a note that #529 was merged recently (between when you opened this PR) that adds customLabels to the helm chart.

Copy link
Contributor

@bwagner5 bwagner5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this is trending in the right direction! Thanks for doing this!

@stevehipwell
Copy link
Contributor Author

Just a note that #529 was merged recently (between when you opened this PR) that adds customLabels to the helm chart.

@bwagner5 I think these changes align pretty well with the ones I've made, I'll merge it in.

@github-actions
Copy link

This PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want this PR to never become stale, please ask a maintainer to apply the "stalebot-ignore" label.

@github-actions github-actions bot added the stale Issues / PRs with no activity label Dec 31, 2021
@stevehipwell
Copy link
Contributor Author

/not-stale

@github-actions github-actions bot removed the stale Issues / PRs with no activity label Jan 1, 2022
@bwagner5
Copy link
Contributor

bwagner5 commented Jan 4, 2022

Sorry for the delay! I've been off for the holidays.

@stevehipwell stevehipwell force-pushed the helm-chart-refactor branch 11 times, most recently from 978ff1d to 2b66407 Compare January 4, 2022 16:02
@stevehipwell
Copy link
Contributor Author

@bwagner5 I've got the tests fixed, it was a side effect of the old daemonset values being applied to a deployment making the tests work.

If you're happy with the work so far I'll finish off the README for final review?

@bwagner5
Copy link
Contributor

bwagner5 commented Jan 4, 2022

@bwagner5 I've got the tests fixed, it was a side effect of the old daemonset values being applied to a deployment making the tests work.

If you're happy with the work so far I'll finish off the README for final review?

Yep! LGTM!

@stevehipwell
Copy link
Contributor Author

@bwagner5 after making a few minor changes I've updated the PR description and the chart README ready for review.

I looked through the code and it looks like the ASG tagging is only used in the queue processor mode and the metadata config is only used by the IMDS mode, could you confirm this?

@stevehipwell stevehipwell marked this pull request as ready for review January 5, 2022 12:36
@stevehipwell stevehipwell requested a review from a team as a code owner January 5, 2022 12:36
@bwagner5
Copy link
Contributor

bwagner5 commented Jan 6, 2022

@bwagner5 after making a few minor changes I've updated the PR description and the chart README ready for review.

I looked through the code and it looks like the ASG tagging is only used in the queue processor mode and the metadata config is only used by the IMDS mode, could you confirm this?

Yes, that is correct

Copy link
Contributor

@bwagner5 bwagner5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the one comment on the chart version and then I think this is good to merge!

Signed-off-by: Steve Hipwell <steve.hipwell@gmail.com>
@stevehipwell
Copy link
Contributor Author

@bwagner5 I've reverted the chart version so once the tests go though I think we're ready.

Copy link
Contributor

@bwagner5 bwagner5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! Thanks!

Comment on lines +71 to +74
{{- define "aws-node-termination-handler.selectorLabelsDeployment" -}}
{{ include "aws-node-termination-handler.selectorLabels" . }}
app.kubernetes.io/component: deployment
{{- end -}}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be mindful when changing immutable fields, it requires additional work when updating the helm chart to ensure high availability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cainelli this work was intended to be a breaking change and these changes were fully intentional and made with the understanding of the implications. I think there might have been a some missed communication when this was released (see above) but technically all changes to a pre v1 SemVer package are breaking.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying @stevehipwell and sorry for the grumpiness.

@tamirhad
Copy link

@stevehipwell Hey,
why ENABLE_SPOT_INTERRUPTION_DRAINING always false when using deployment(sqs mode)?

@stevehipwell
Copy link
Contributor Author

@tamirhad because it's only relevant to the IMDS mode. I did notice earlier that in some of the AWS docs they incorrectly specify setting it for queue mode.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor Helm chart

4 participants