
A blockDeviceMapping change in EC2NodeClass does not trigger drift replace #5447

Closed
rtomadpg opened this issue Jan 10, 2024 · 3 comments · Fixed by #5454
Labels
bug Something isn't working

Comments


rtomadpg commented Jan 10, 2024

Description

Observed Behavior:

When changing spec.blockDeviceMappings[0].ebs.volumeSize, no drift replace is triggered. Also, the karpenter.k8s.aws/ec2nodeclass-hash annotation remains unchanged.

However, when I flip spec.detailedMonitoring, a drift replace is triggered and the karpenter.k8s.aws/ec2nodeclass-hash annotation gets updated.

Since the hash value does not get updated for block device mapping changes, I assume the bug is inside the hashing code.
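
A minimal sketch of one way this could happen (this is an assumption about the mechanism, not a confirmed root cause, and it presumes the hash is computed with a reflection-based library such as mitchellh/hashstructure): resource.Quantity, the type behind volumeSize, keeps its numeric value in unexported fields, which reflection-based hashing skips.

```go
package main

import (
	"fmt"

	"github.com/mitchellh/hashstructure/v2"
	"k8s.io/apimachinery/pkg/api/resource"
)

// Illustrative stand-in for the EBS block of the EC2NodeClass spec;
// the real types live in the Karpenter codebase.
type ebs struct {
	VolumeSize *resource.Quantity
}

func main() {
	small := resource.MustParse("50Gi")
	large := resource.MustParse("100Gi")

	a, _ := hashstructure.Hash(ebs{VolumeSize: &small}, hashstructure.FormatV2, nil)
	b, _ := hashstructure.Hash(ebs{VolumeSize: &large}, hashstructure.FormatV2, nil)

	// resource.Quantity stores its value in unexported fields, which the
	// hasher ignores, so the hashes can be equal despite different sizes.
	fmt.Println(a == b)
}
```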

Expected Behavior:

Changing spec.blockDeviceMappings[0].ebs.volumeSize triggers a drift replace.

Reproduction Steps (Please include YAML):

  • Create an EC2NodeClass resource and merge in this spec:
spec:
  blockDeviceMappings:
  - deviceName: /dev/xvda
    ebs:
      encrypted: true
      volumeSize: 50Gi
      volumeType: gp3
  detailedMonitoring: true
  • Make note of the karpenter.k8s.aws/ec2nodeclass-hash annotation on the EC2NodeClass resource
  • Now edit the EC2NodeClass resource and change the volumeSize.
  • Check the annotation again (a kubectl command for reading it is shown after these steps). For me the value is unchanged (which is unexpected).

To ensure Karpenter drift does work:

  • Edit the EC2NodeClass resource again and flip spec.detailedMonitoring
  • Check the annotation. You should see it changed.
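
To read the annotation in the steps above, something like this works (the EC2NodeClass name `default` is a placeholder for whatever your resource is called):

```sh
# Placeholder name "default"; substitute your EC2NodeClass.
kubectl get ec2nodeclass default \
  -o jsonpath='{.metadata.annotations.karpenter\.k8s\.aws/ec2nodeclass-hash}'
```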

Versions:

  • Chart Version: v0.33.0
  • Kubernetes Version (kubectl version): v1.27.7

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@rtomadpg rtomadpg added bug Something isn't working needs-triage Issues that need to be triaged labels Jan 10, 2024
@rtomadpg rtomadpg changed the title A blockDeviceMappings change in EC2NodeClas does not trigger drift replace A blockDeviceMapping change in EC2NodeClass does not trigger drift replace Jan 10, 2024
@jonathan-innis jonathan-innis removed the needs-triage Issues that need to be triaged label Jan 11, 2024
@jonathan-innis
Contributor

This looks like a regression related to #3330

@jonathan-innis
Contributor

So, given that updating the hash here to capture the volume size (since it wasn't being captured before) is going to be a breaking change, we are going to have to rely on some in-progress work that lets us version the hash used to evaluate drift, so that we don't drift all of the existing nodes on the cluster at once as soon as we introduce #5454.

That work is going to take a bit to go in, so until then I'd recommend either making another change to the EC2NodeClass that forces drift of all the NodeClaims associated with it (one way to do that with kubectl is sketched below), or changing a field in the NodePool (such as template.spec.annotations) that would also cause drift across the NodeClaims using the EC2NodeClass. If you go the NodePool route, you might have to make that change across multiple NodePools if more than one references the same EC2NodeClass.
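
A sketch of the first option (an illustration, not an official procedure): the report above confirms spec.detailedMonitoring is captured by the hash, so flipping it forces drift of the NodeClaims that use the class. It does genuinely toggle detailed monitoring, so revert it (or pick another hashed field) once the replacements are done. The name `default` is a placeholder:

```sh
# Placeholder EC2NodeClass name "default". This really changes the
# detailedMonitoring setting, so flip it back afterwards if needed.
kubectl patch ec2nodeclass default --type merge \
  -p '{"spec":{"detailedMonitoring":false}}'
```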

@jonathan-innis
Contributor

There's some discussion in this issue here around versioning the hash that we use for evaluating drift: kubernetes-sigs/karpenter#909
