Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No more duplicates of puller pods #854

Merged
merged 3 commits into from
Aug 18, 2018

Conversation

consideRatio
Copy link
Member

@consideRatio consideRatio commented Aug 18, 2018

Thanks to the new before-hook-creation deletion policy, we can avoid the object already exist error as Helm would simply delete and recreate the resource instead if it found a resource with the same name. This means that we can use a single name for the puller daemonset and won't ever orphan a puller daemonsets after a failed upgrade followed by a successful upgrade.

This deletion policy requires the newer Helm version.
Having a single name is fine now thanks to the before-hook-creation 
deletion policy that will ensure that we don't run into the error that 
the resource already exists because helm will now simply recreate the 
k8s resource.
labels:
{{- include "jupyterhub.labels" . | nindent 4 }}
hub.jupyter.org/deletable: "true"
annotations:
"helm.sh/hook": pre-install,pre-upgrade
"helm.sh/hook-delete-policy": hook-succeeded,hook-failed
"helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens on hook-failed?

Copy link
Member Author

@consideRatio consideRatio Aug 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point! The hook-failed part did not have any impact before or won't have for quite some time. This was a bug (unexpected behavior) but Helm did not consider it very important to fix. If it would be fixed though, I'd prefer to have it removed as if the upgrade fails we can inspect the status better if we would like to.

Currently, the hook-failed will only function on Job resources. Further, it will only work on the last Job resource that fails, not the previously executed Jobs. PR's exist but they are stale/rotten at this point.

So that is why I'm removing it as it is dead code causing confusion for anyone thinking about it.

Copy link
Member Author

@consideRatio consideRatio Aug 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I edited the reply above, it was not really readable before - I will stop writing from mobile without reviewing my text haha sorry @minrk!

@consideRatio
Copy link
Member Author

Let's merge this! It is tested by me over the course of about 20+ helm upgrade command or so and it has worked like a charm. If my helm upgrade would fail for some reason and then after a rerun succeed there would not remain any image-pullers daemonset or pods left thanks to this PR! It can and have been a headache with such issues.

I'm hoping for a lot of trust in me getting this right even though we have not gotten all testing infrastructure in place. What do you say @minrk ?

@minrk
Copy link
Member

minrk commented Aug 18, 2018

Awesome, thanks @consideRatio!

@minrk minrk merged commit 9731556 into jupyterhub:master Aug 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants