OS upgrade with osImage is never triggered #1983

Closed
ldevulder opened this issue Feb 28, 2024 · 5 comments
Labels
kind/bug Something isn't working

Comments

@ldevulder
Contributor

elemental-toolkit version: 1.1.1

elemental-operator version: 1.5.0 (Dev)

CPU architecture, OS, and Version: x86, Elemental OS v2.0.2 (based on SLE Micro 5.5)

Describe the bug

Configuring an OS upgrade with osImage does nothing.

To Reproduce

Install Rancher Manager v2.8.2 and elemental-operator Dev (1.5.0), deploy a simple one-node cluster with Elemental OS Stable (2.0.2), then update it to Dev (2.1.0) with osImage. Nothing happens.

osImage used:

apiVersion: elemental.cattle.io/v1beta1
kind: ManagedOSImage
metadata:
  name: osimage
  namespace: fleet-default
spec:
  clusterTargets:
    - clusterName: cluster-k3s
  osImage: registry.opensuse.org/isv/rancher/elemental/dev/containers/suse/sle-micro/sle-micro-iso-5.5:latest

Expected behavior

Logs

elemental-operator.log

I found these messages related to osimage:

I0228 08:43:53.927453  1 managedosimage_controller.go:122] controller/managedosimage "msg"="Reconciling managed OS image object" "name"="osimage" "namespace"="fleet-default" "reconciler group"="elemental.cattle.io" "reconciler kind"="ManagedOSImage"
E0228 08:43:53.940020  1 controller.go:317] controller/managedosimage "msg"="Reconciler error" "error"="error reconciling managed OS image object: failed to update managed OS image status: Bundle.fleet.cattle.io \"mos-osimage\" not found" "name"="osimage" "namespace"="fleet-default" "reconciler group"="elemental.cattle.io" "reconciler kind"="ManagedOSImage"
I0228 08:43:53.940069  1 managedosimage_controller.go:122] controller/managedosimage "msg"="Reconciling managed OS image object" "name"="osimage" "namespace"="fleet-default" "reconciler group"="elemental.cattle.io" "reconciler kind"="ManagedOSImage"
E0228 08:43:53.955371  1 controller.go:317] controller/managedosimage "msg"="Reconciler error" "error"="[failed to patch managed OS image object: Operation cannot be fulfilled on managedosimages.elemental.cattle.io \"osimage\": the object has been modified; please apply your changes to the latest version and try again, failed to patch status for managed OS image object: Operation cannot be fulfilled on managedosimages.elemental.cattle.io \"osimage\": the object has been modified; please apply your changes to the latest version and try again]" "name"="osimage" "namespace"="fleet-default" "reconciler group"="elemental.cattle.io" "reconciler kind"="ManagedOSImage"
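
For anyone trying to pull the same messages out of their own operator log, a minimal sketch follows. The function name `osimage_reconcile_errors` is made up for illustration, and it assumes the log uses the same `"msg"="..."` / `"name"="..."` key-value layout as the lines above.

```shell
# Sketch only: osimage_reconcile_errors is an illustrative helper, not an Elemental tool.
# It assumes the key="value" log layout shown in the excerpt above.
osimage_reconcile_errors() {
  # $1: path to the operator log
  # $2: ManagedOSImage name to filter on (defaults to "osimage")
  grep -F '"msg"="Reconciler error"' "$1" | grep -F "\"name\"=\"${2:-osimage}\""
}

# Example: osimage_reconcile_errors elemental-operator.log
```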

Additional context

N/A

@ldevulder ldevulder added the kind/bug Something isn't working label Feb 28, 2024
@ldevulder ldevulder moved this to 🗳️ To Do in Elemental Feb 28, 2024
@anmazzotti
Contributor

anmazzotti commented Feb 28, 2024

I'm testing locally with the same setup and it works fine for me.
If you have access to the system I'd check if:

  1. The bundle was created on the management cluster: kubectl -n fleet-default get bundle mos-osimage
  2. The system-upgrade-controller is actually running on the "to-be-upgraded" node: kubectl -n cattle-system get pods

Also "does nothing" is the intended outcome here, at least from a user point of view.
We do not reboot the machine after upgrade, so nothing happens indeed, but if you reboot the machine manually you should notice that it generated a new grub entry for the previous system and it will boot into the upgraded system by default.

Scratch the last part, that is just incorrect. The node is supposed to reboot after upgrade.

@frelon
Contributor

frelon commented Feb 28, 2024

I got the same reconcile errors as you when testing, @ldevulder. Maybe we should move this issue to the operator, as it looks like a race condition or locking problem during reconcile?

@frelon
Contributor

frelon commented Feb 28, 2024

Also the osImage should probably be registry.opensuse.org/isv/rancher/elemental/dev/containers/suse/sle-micro/sle-micro-container-5.5:latest instead of the iso, but that makes no difference until the upgrade is actually run on a node.
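
As a rough sketch of that change, the manifest from the report can be rendered with the image reference swapped. The helper name `render_managed_os_image` is made up for illustration; every other field is copied unchanged from the original report.

```shell
# Sketch only: render_managed_os_image is an illustrative helper, not an Elemental tool.
# It prints the ManagedOSImage from the report with a caller-supplied image.
render_managed_os_image() {
  # $1: OS container image reference to upgrade to
  cat <<EOF
apiVersion: elemental.cattle.io/v1beta1
kind: ManagedOSImage
metadata:
  name: osimage
  namespace: fleet-default
spec:
  clusterTargets:
    - clusterName: cluster-k3s
  osImage: $1
EOF
}

# Example (applies to whatever cluster the current context points at):
# render_managed_os_image registry.opensuse.org/isv/rancher/elemental/dev/containers/suse/sle-micro/sle-micro-container-5.5:latest | kubectl apply -f -
```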

@ldevulder
Contributor Author

ldevulder commented Feb 28, 2024

> Also "does nothing" is the intended outcome here, at least from a user point of view.
> We do not reboot the machine after upgrade, so nothing happens indeed,

@anmazzotti In that case something has changed: previously the reboot was automatic, and it still is when using managedOSVersionName instead of osImage. If that behaviour has changed, it would be good to let us know ;-) In my opinion it should also be consistent between osImage and managedOSVersionName. Anyway, I checked the logs on the node and clearly nothing was done.

@frelon Yes, it looks more like an operator issue to me. David just asked to open it in the toolkit for now until further investigation has been done. I will also retry manually with the sle-micro-container image to be sure; maybe I used the wrong URI in my last change.

@ldevulder
Contributor Author

@frelon after testing with sle-micro-container it works... so it was a chair/keyboard interface bug for this one... I will fix it with a new PR.

3 participants