
Helm upgrade error "... with the name ... not found" and how to handle it #620

Merged 5 commits into jupyterhub:master on Apr 2, 2018

Conversation

consideRatio
Member

@consideRatio consideRatio commented Apr 1, 2018

Update

This PR was supposed to fix #619, but became documentation on how to handle the errors encountered on helm upgrades. These are supposedly fixed in Helm 2.8.2 assuming a clean slate of helm release revisions.

Error: UPGRADE FAILED: <object type> with the name <object name> not found

Symptoms

You attempt a helm upgrade, but something like this hits you...

# Errors experienced by me
Error: UPGRADE FAILED: no ServiceAccount with the name "pod-culler" found
Error: UPGRADE FAILED: no Deployment with the name "hub" found
Error: UPGRADE FAILED: no ServiceAccount with the name "autohttps" found

# Error experienced by Yuvi
Error: UPGRADE FAILED: no DaemonSet with the name "continuous-image-puller" found

Diagnosis

# Do you have multiple release revisions considered deployed?
kubectl get configmap --namespace kube-system --selector STATUS=DEPLOYED
# Do you have multiple release revisions pending upgrade?
kubectl get configmap --namespace kube-system --selector STATUS=PENDING_UPGRADE
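
The per-status checks can also be rolled into one loop. This is a convenience sketch; it assumes Helm v2's convention of storing releases as configmaps in kube-system labeled with OWNER=TILLER and STATUS:

```shell
# List Helm v2 release configmaps grouped by status.
# Assumes Tiller stores releases as configmaps in kube-system,
# labeled OWNER=TILLER and STATUS=<state> (Helm v2 behavior).
for status in DEPLOYED PENDING_UPGRADE FAILED DELETED SUPERSEDED; do
  echo "== $status =="
  kubectl get configmap --namespace kube-system \
    --selector "OWNER=TILLER,STATUS=$status" --no-headers 2>/dev/null
done
```

More than one configmap under DEPLOYED or PENDING_UPGRADE is the symptom to look for.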

Cure

# Fix 1 (recommended) - no consequences as far as I understand it
kubectl get configmap --namespace kube-system --selector STATUS=DEPLOYED
kubectl delete configmap --namespace kube-system --selector STATUS=PENDING_UPGRADE
kubectl delete configmap --namespace kube-system <all configmaps except the latest deployed release revision>
# - optional cleanup...
kubectl delete configmap --namespace kube-system --selector STATUS=FAILED
kubectl delete configmap --namespace kube-system --selector STATUS=DELETED
kubectl delete configmap --namespace kube-system --selector STATUS=SUPERSEDED
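
One way to identify "all configmaps except the latest deployed release revision" is to version-sort the configmap names, which in Helm v2 follow a `<release>.v<revision>` naming convention. A sketch, where `myrelease` and its three revisions are hypothetical and `head -n -1` assumes GNU coreutils:

```shell
# Hypothetical input: Helm v2 names release configmaps "<release>.v<revision>".
configmaps="myrelease.v1 myrelease.v2 myrelease.v3"
# Version-sort by the revision field and keep everything except the newest.
to_delete=$(echo "$configmaps" | tr ' ' '\n' | sort -t. -k2 -V | head -n -1)
echo "$to_delete"
# then, after double-checking the list:
#   kubectl delete configmap --namespace kube-system $to_delete
```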

# Fix 2 - invasive: you will lose the hub database of users etc. if you do this
helm delete <release name> --purge

Related

Helm 2.7 introduced the --history-max feature flag to helm init - this way you can limit Helm's memory of release revisions.
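
For example, a sketch of the invocation, where 10 is an arbitrary cap and --upgrade re-applies to an already-installed Tiller:

```shell
# Cap Tiller's stored release history (flag added to helm init in Helm 2.7).
# history_max=10 is an arbitrary example value.
history_max=10
cmd="helm init --upgrade --history-max ${history_max}"
echo "$cmd"   # run this against your cluster
```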

Actual PR content

This ended up being quite irrelevant I think...

  • I moved the ServiceAccount definition to the top of the rbac.yaml files (hub, autohttps, pod-culler). The ServiceAccounts are standalone objects required by the objects below them. I figured it could matter in certain situations, but I doubt that now.
  • I added a conditional in the pod-culler rbac.yaml to not create the ServiceAccount, roles, etc. when the culler isn't used.

Very unimportant stuff also in PR

  • Adjusted some indentation to make it in-file consistent.

@minrk
Member

minrk commented Apr 1, 2018

Interesting! Fix seems fine to me. Feel free to remove the WIP tag when you feel it's ready. I wonder how this didn't show up before. What helm & kubernetes versions are you running with?

@consideRatio
Member Author

consideRatio commented Apr 1, 2018

@minrk I'm very confused about all the issues that pop up. I'm thinking this can become an issue under certain circumstances, but I don't know which... So I set out to reset and start over in some way. While cleaning up my cluster I found out that Helm considered multiple release revisions to be deployed. I don't know if that is reasonable (perhaps due to a rolling upgrade that got stuck?). Anyhow, I think that might have caused issues, so now I'm cleaning it all up.

kubectl get configmap --namespace kube-system --selector STATUS=FAILED
kubectl get configmap --namespace kube-system --selector STATUS=PENDING_UPGRADE
kubectl get configmap --namespace kube-system --selector STATUS=SUPERSEDED
kubectl get configmap --namespace kube-system --selector STATUS=DEPLOYED
kubectl get configmap --namespace kube-system --selector STATUS=DELETED

UPDATE 1

This resolved my issues... Hmmm...

@consideRatio consideRatio changed the title [WIP] RBAC issues - ServiceAccount with the name ... not found RBAC.yaml exposing a Helm bug? (ServiceAccount with the name ... not found) Apr 1, 2018
@consideRatio consideRatio requested review from yuvipanda and minrk April 1, 2018 19:33
@consideRatio consideRatio changed the title RBAC.yaml exposing a Helm bug? (ServiceAccount with the name ... not found) Helm upgrade error "... with the name ... not found" and how to handle it Apr 1, 2018
@minrk minrk merged commit 544e355 into jupyterhub:master Apr 2, 2018
@minrk
Member

minrk commented Apr 2, 2018

Great!

- kind: ServiceAccount
  name: autohttps
  namespace: {{ .Release.Namespace }}
- kind: ServiceAccount
Member

Regarding these indentation changes: there seem to be two standards: one where - is aligned with the parent element and another where - is indented like a dict. The latter seems more common and more readable (it's easier to see the levels this way), but this is converting to the former. Is there a reason to dedent this here (e.g. motivated by a helm linter)?

Member Author

#622 - The kubernetes/chart linting ignores the difference, allowing both. The repositories I've looked at have been mixing both. I've seen situations where both indentation variants have been the most readable. I'm happy to decide on any variant and go with it.

parent1:
- key1: value1
- key2: value2

parent2:
  - key1: value1
  - key2: value2

I've never seen really ugly examples of the first kind of indentation, but I have seen some of the second kind. That said, it's often not an issue.

Here is an example of a nested situation that is a big challenge no matter which indentation scheme is used, but I figure the former kind would be easier, as the key names would always start two spaces in:

networkPolicy:
  enabled: false
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0

I'm setting up some linting in #624 - we can configure it however we please; I'm looking for a directive to follow!

@dragos-cojocari

Thanks for the tip, but in our case 2.8.2 solved the problem automatically. This is what we did:

  1. updated to 2.8.2
  2. ran a helm upgrade - FAILED
  3. did a rollback to the previously deployed revision (one of the 7 revisions marked as DEPLOYED)

Once the rollback finished, all the previously DEPLOYED revisions were marked as SUPERSEDED, so no manual intervention was required.
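
The rollback step looks roughly like this (a sketch; the release name and revision number are placeholders - pick a revision from `helm history <release>`):

```shell
# Placeholders: substitute your own release name and a previously
# DEPLOYED revision number.
release="myrelease"
revision=7
echo "helm rollback ${release} ${revision}"   # run this against your cluster
```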

@manics manics mentioned this pull request Aug 15, 2018
Development

Successfully merging this pull request may close these issues.

pod-culler upgrade - ServiceAccount required before initialized?
3 participants