KEP-4369: Promote to beta (allow ~all ASCII characters in env vars) #4805
26ff8ac to 1090b25
/cc @BenTheElder
1090b25 to ee1d6c5
keps/sig-node/4369-allow-special-characters-environment-variable/README.md
###### How can a rollout or rollback fail? Can it impact already running workloads?

- When a feature gate is closed, already running workloads are not affected in any way, but update fields for workload will cause the workload to fail.
+ When a feature gate is disabled, already running workloads are not affected in any way, but update fields for workload will cause the workload to fail.
What fields would change in an update for this?
Any update to the workload's fields results in failure because the pods need to be recreated.
Maybe I missed this in the KEP but I don't understand how that is related.
If the feature gate is disabled and we then update any field of a workload that uses an environment variable accepted only under relaxed validation, the workload might fail to recreate its pods or ReplicaSets because they can no longer pass the API server's validation logic.
I see. So you are saying that if you have used the expanded environment variable support and then downgrade, validation could start failing.
Your phrase above seems to be more general than what you expanded here.
This is, unfortunately, true of any feature on pod - if you enable, use in Deployment, disable, then the ".. or already in use" logic doesn't matter much.
I used more friendly words to describe it :)
ee1d6c5 to 1c0d1f4
1c0d1f4 to 56045f6
Thanks! /lgtm
This really needs a sig-node approver
/cc @kubernetes/sig-node-leads Could anyone review and approve it?
We will discuss KEPs and assign approvers at today's meeting that starts in a few minutes
/assign @mrunalp
SergeyKanzhelev
left a comment
/lgtm
/approve
keps/sig-node/4369-allow-special-characters-environment-variable/kep.yaml
56045f6 to f161bdd
SergeyKanzhelev
left a comment
/lgtm
/cc @jpbetz Could you review it as the PRR reviewer? It has already been approved by sig-node.
jpbetz
left a comment
A few comments about ratcheting validation for downgrade/rollback. LGTM for PRR once those are addressed.
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?

- If close the feature gate, already running workloads will not be affected in any way,
+ If disable the feature gate, already running workloads will not be affected in any way,
Should we also update the "Downgrade" section above ("users need to reset their environment variables for special characters to normal characters.")? I was expecting it to say something like "After downgrade, environment variables containing special characters will continue to work as expected, but any writes to resources to add or change environment variables must set the environment variable names to only use normal characters."
Modified as suggested.
###### How can a rollout or rollback fail? Can it impact already running workloads?

- When a feature gate is closed, already running workloads are not affected in any way, but update fields for workload will cause the workload to fail.
+ When the feature gate is disabled, workloads that are already running will not be affected. However, if user update the workloads, they may fail to recreate pods or ReplicaSets due to failing the Apiserver's validation logic, which could cause the workloads to fail.
If an environment variable has special characters, and the cluster is rolled back, do updates that change other fields (but that do not modify the fields containing environment variables) fail? I am expecting updates like this to be allowed.
The trick here is usually to modify the validation rules to compare the old value with the new value for updates, and allow the less restrictive rule (use of special characters in this situation) so long as the field value is not changing. This way, only controllers that are actually changing the environment fields are at risk of being impacted by a rollback.
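The comparison of old and new values described above (often called ratcheting validation) might look roughly like the following. This is only an illustrative Python sketch, not the API server's actual Go code: the function name `validate_env_update` is hypothetical, the strict pattern is borrowed from the `jq` command quoted later in this PR, and the relaxed rule itself is not modeled.

```python
import re

# Strict pattern borrowed from the jq command quoted in this PR.
# Assumption: the API server's real strict rule may differ.
STRICT_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def validate_env_update(old_names, new_names, relaxed_enabled):
    """Return the env var names that an update should reject.

    Hypothetical helper: with the gate disabled, a name that violates the
    strict rule is still tolerated if it already existed on the old object.
    """
    if relaxed_enabled:
        # The relaxed rule is not modeled in this sketch.
        return []
    already_present = set(old_names)
    return [
        name
        for name in new_names
        if not STRICT_NAME.fullmatch(name) and name not in already_present
    ]


# Unchanged special-character name: the update is allowed.
print(validate_env_update(["my.var"], ["my.var"], relaxed_enabled=False))
# A newly added special-character name is rejected.
print(validate_env_update(["my.var"], ["my.var", "new-var"], relaxed_enabled=False))
```

With a check of this shape, only writers that actually add or rename environment variables would be affected by a rollback.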
fair.
Modified as suggested.
Apologies for only realizing this in hindsight: we can't achieve this. Modifying any controller will delete the old pods and create new ones, and we can't retrieve the spec of the old pods during pod creation validation. So, after disabling the feature gate, we won't be able to update other fields of the workload.
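To make the limitation above concrete, here is a small Python sketch of why a create request cannot benefit from an old-versus-new comparison. It is an assumption-laden illustration, not Kubernetes code: `validate_env_create` and `replace_pod` are hypothetical names, and the strict pattern is the one from the `jq` command quoted in this PR.

```python
import re

# Strict pattern borrowed from the jq command quoted in this PR (an assumption).
STRICT_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def validate_env_create(names, relaxed_enabled):
    """Hypothetical create-time check: there is no old object to compare with."""
    if relaxed_enabled:
        return []
    return [name for name in names if not STRICT_NAME.fullmatch(name)]


def replace_pod(template_env_names, relaxed_enabled):
    """Model a controller replacing a pod: the new pod is a fresh create."""
    rejected = validate_env_create(template_env_names, relaxed_enabled)
    if rejected:
        return "create rejected: " + ", ".join(rejected)
    return "pod created"


# While the gate is enabled the template is accepted.
print(replace_pod(["my.var"], relaxed_enabled=True))
# After the gate is disabled, the replacement pod fails validation even though
# the environment variable itself was never edited.
print(replace_pod(["my.var"], relaxed_enabled=False))
```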
Ah, I follow. That limitation makes sense to me.
Thank you for confirming! @HirazawaUi I think what is implemented should be OK then.
```
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.containers[].env[]?.name | test("^[a-zA-Z_][a-zA-Z0-9_]*$") | not) | [.metadata.namespace, .metadata.name, .spec.containers[].env[]?.name] | @tsv'
```
Thanks for including a command!
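For anyone who prefers to post-process the JSON offline, the same check as the command above could be sketched in Python. This is a hedged illustration rather than part of the KEP: the function name `find_nonstandard_env_names` is hypothetical, it reuses the pattern from the quoted command, and like that command it only inspects `.spec.containers`. Unlike the command, it reports only the offending names, one row per name.

```python
import re

# Same pattern as in the quoted jq command.
STRICT_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def find_nonstandard_env_names(pod_list):
    """Return (namespace, pod, env name) for names outside the strict pattern.

    `pod_list` is the parsed output of `kubectl get pods --all-namespaces -o json`.
    """
    hits = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        for container in pod.get("spec", {}).get("containers", []):
            # `env` may be missing or null on a container.
            for env in container.get("env") or []:
                name = env.get("name", "")
                if not STRICT_NAME.fullmatch(name):
                    hits.append((meta.get("namespace", ""), meta.get("name", ""), name))
    return hits


# Small inline sample standing in for real kubectl output.
sample = {
    "items": [
        {
            "metadata": {"namespace": "default", "name": "web"},
            "spec": {
                "containers": [
                    {"name": "app", "env": [{"name": "OK_NAME"}, {"name": "my.var"}]},
                    {"name": "sidecar"},
                ]
            },
        }
    ]
}
for row in find_nonstandard_env_names(sample):
    print("\t".join(row))
```

To run it against a cluster, the output of the `kubectl get pods` command can be saved to a file and loaded with `json.load` before calling the function.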
f161bdd to a658078
ping @jpbetz
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: HirazawaUi, jpbetz, SergeyKanzhelev, thockin
@SergeyKanzhelev Could you lgtm it again?
/lgtm