[release-0.12] Add cluster upgrade failure revert docs (#4106)
* Add cluster upgrade failure revert docs

* Update docs/content/en/docs/tasks/troubleshoot/troubleshooting.md

Co-authored-by: Chris Negus <striker57@gmail.com>

Co-authored-by: Terry Howe <tlhowe@amazon.com>
Co-authored-by: Terry Howe <terrylhowe@gmail.com>
4 people authored Nov 16, 2022
1 parent 3e3ad1e commit 536dd14
Showing 1 changed file with 38 additions and 0 deletions: docs/content/en/docs/tasks/troubleshoot/troubleshooting.md
If that doesn't work, you can manually delete the old cluster:
```shell
kind delete cluster --name cluster-name
```

### Cluster upgrade fails with management cluster on bootstrap cluster

If a cluster upgrade of a management (or self-managed) cluster fails or is halted midway, you may be left in a
state where the management resources (CAPI) are still on the KinD bootstrap cluster on the admin machine. In that case, you must
manually move the management resources from the KinD cluster back to the management cluster.

First create a backup:
```shell
CLUSTER_NAME=squid
KINDKUBE=${CLUSTER_NAME}/generated/${CLUSTER_NAME}.kind.kubeconfig
MGMTKUBE=${CLUSTER_NAME}/${CLUSTER_NAME}-eks-a-cluster.kubeconfig
DIRECTORY=backup
# Substitute the version below with whichever version you are using
CONTAINER=public.ecr.aws/eks-anywhere/cli-tools:v0.12.0-eks-a-19

rm -rf ${DIRECTORY}
mkdir ${DIRECTORY}

docker run -i --network host -w $(pwd) -v /var/run/docker.sock:/var/run/docker.sock -v $(pwd):/$(pwd) --entrypoint clusterctl ${CONTAINER} backup \
--namespace eksa-system \
--kubeconfig $KINDKUBE \
--directory ${DIRECTORY}

# After the backup, move the management cluster resources back
docker run -i --network host -w $(pwd) -v /var/run/docker.sock:/var/run/docker.sock -v $(pwd):/$(pwd) --entrypoint clusterctl ${CONTAINER} move \
--to-kubeconfig $MGMTKUBE \
--namespace eksa-system \
--kubeconfig $KINDKUBE
```
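After the move completes, a quick spot-check (this verification step is not part of the original docs; it assumes the `MGMTKUBE` and `KINDKUBE` variables set above) can confirm the CAPI objects now live on the management cluster:

```shell
# The management cluster should now own the CAPI Cluster object:
kubectl --kubeconfig ${MGMTKUBE} get clusters.cluster.x-k8s.io -n eksa-system

# The bootstrap (KinD) cluster should report no such resources:
kubectl --kubeconfig ${KINDKUBE} get clusters.cluster.x-k8s.io -n eksa-system
```

If the first command does not list your cluster, do not delete the bootstrap cluster yet; re-run the `clusterctl move` step.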

Before you delete your bootstrap KinD cluster, verify there are no important custom resources left on it:
```shell
# For each EKS Anywhere CRD, list any remaining objects of that kind
# ("rest" just absorbs the remaining columns of the `kubectl get crds` output)
kubectl get crds | grep eks | while read crd rest
do
  echo $crd
  kubectl get $crd -A
done
```
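Once that check comes back clean, the bootstrap KinD cluster can be removed. A minimal sketch (the exact cluster name depends on your setup, so list the candidates first):

```shell
# Show the KinD clusters present on the admin machine
kind get clusters

# Substitute the bootstrap cluster name printed above
kind delete cluster --name cluster-name
```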
## Bare Metal troubleshooting

### Creating new workload cluster hangs or fails
