Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improve pod disruption budget validation #98

Merged
merged 1 commit into from
Nov 13, 2024

Conversation

nemanja-kovacevic-thinkit
Copy link
Contributor

@nemanja-kovacevic-thinkit nemanja-kovacevic-thinkit commented Nov 13, 2024

https://techfromsage.atlassian.net/browse/PLT-1083

Problem: While upgrading EKS to Kubernetes 1.30 and 1.31 we ran into issues with Node Group upgrades as they couldn’t evict some Pods. We found the reason to be PodDisruptionBudget violations where PDB is configured with minAvailable: 1 while there is only 1 replica.

Fix: WebService construct validates that when PodDisruptionBudget is enabled, then Deployment’s min replicas is greater (but not equal) than PDB’s minAvailable.

@nemanja-kovacevic-thinkit nemanja-kovacevic-thinkit added the 90% 90% Code Review label Nov 13, 2024
const { minAvailable, maxUnavailable } =
props.podDisruptionBudget as PodDisruptionBudgetProps;
podDisruptionBudget as PodDisruptionBudgetProps;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to force the type because it's already set in the function argument

Suggested change
podDisruptionBudget as PodDisruptionBudgetProps;
podDisruptionBudget;

Comment on lines 740 to 749
[{ replicas: 2 }, { minAvailable: IntOrString.fromNumber(2) }],
[{ replicas: 2 }, { maxUnavailable: IntOrString.fromNumber(0) }],
[{ replicas: 3 }, { minAvailable: IntOrString.fromString("100%") }],
[{ replicas: 3 }, { maxUnavailable: IntOrString.fromString("0%") }],
[{ replicas: 10 }, { minAvailable: IntOrString.fromString("91%") }],
[{ replicas: 42 }, { maxUnavailable: IntOrString.fromString("0%") }],
[
{ minReplicas: 5, maxReplicas: 10 },
{ minAvailable: IntOrString.fromString("85%") },
],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these removed for a reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these were expected to fail before. Now, after validation condition is changed, these 4 test cases are failing because replicas == minAvailable. That's why they are removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are checking cases when validation does fail. So, for example, case

[{ replicas: 2 }, { minAvailable: IntOrString.fromNumber(2) }],

should still fail, because if we only have 2 pods, and PDB enforces that there must be 2, we would get eviction errors.

So I think the change to if (min > minReplicas) check is wrong, it should be >= (error if PDB's minAvailable is equal to minReplicas).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That reverts it to the initial condition.

@nemanja-kovacevic-thinkit nemanja-kovacevic-thinkit merged commit 9cc1a97 into main Nov 13, 2024
8 checks passed
@nemanja-kovacevic-thinkit nemanja-kovacevic-thinkit deleted the PLT-108-improve-pdb-validation branch November 13, 2024 15:31
nemanja-kovacevic-thinkit added a commit that referenced this pull request Nov 14, 2024
https://techfromsage.atlassian.net/browse/PLT-1083

Problem: PR
#98 is
merged. After merging we noticed that we are unable to get a green build
against main branch which means that there is no release. Problem caused
one e2e test which expectation is invalid. Green build on the branch
happened because tests are flaky.

Fix: Update e2e test expectation
@talisaspire
Copy link
Collaborator

🎉 This issue has been resolved in version 5.3.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
90% 90% Code Review released
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants