No more duplicates of puller pods #854
Conversation
This deletion policy requires a newer Helm version.
Having a single name is fine now, thanks to the before-hook-creation deletion policy: it ensures we don't run into the "resource already exists" error, because Helm will now simply delete and recreate the Kubernetes resource.
  labels:
    {{- include "jupyterhub.labels" . | nindent 4 }}
    hub.jupyter.org/deletable: "true"
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
-   "helm.sh/hook-delete-policy": hook-succeeded,hook-failed
+   "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
What happens on hook-failed?
Great point! The hook-failed part had no impact before, and won't for quite some time. This was a bug (unexpected behavior), but Helm did not consider it very important to fix. Even if it were fixed, I'd prefer to have it removed: if an upgrade fails, we can inspect the status better if the hook resources are left around.
Currently, hook-failed only functions on Job resources. Further, it only works on the last Job resource that fails, not on previously executed Jobs. PRs to fix this exist, but they are stale/rotten at this point.
So that is why I'm removing it: it is dead code that causes confusion for anyone thinking about it.
I edited the reply above; it was not really readable before. I will stop writing from mobile without reviewing my text, haha. Sorry @minrk!
Let's merge this! I have tested it over the course of about 20+ helm upgrade commands, and it has worked like a charm. If my helm upgrade failed for some reason and then succeeded after a rerun, no image-puller daemonsets or pods would be left behind, thanks to this PR! Issues like that can be, and have been, a headache. I'm hoping for a lot of trust in me getting this right, even though we don't have all the testing infrastructure in place yet. What do you say @minrk?
Awesome, thanks @consideRatio!
Thanks to the new before-hook-creation deletion policy, we can avoid the "object already exists" error: Helm will simply delete and recreate the resource if it finds one with the same name. This means we can use a single name for the puller daemonset, and we will never orphan a puller daemonset after a failed upgrade followed by a successful upgrade.
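As a rough sketch, the resulting hook annotations on the puller daemonset would look something like the following. The resource name and labels here are illustrative only, not taken verbatim from the chart:

```yaml
# Hypothetical pre-puller DaemonSet metadata illustrating the hook
# annotations discussed above (names/labels are illustrative).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: hook-image-puller
  annotations:
    # Run as a hook before install and before upgrade.
    "helm.sh/hook": pre-install,pre-upgrade
    # before-hook-creation: delete any leftover resource with this name
    # before creating a new one, so a failed upgrade cannot orphan it.
    # hook-succeeded: clean the resource up once the hook completes.
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
```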