Surprising behavior trying to manually change kargs while an update is already queued #963

Open
betermieux opened this issue Sep 16, 2021 · 8 comments

Comments

@betermieux

Describe the bug
I want to switch to cgroups v2 by removing systemd.unified_cgroup_hierarchy from the kernel arguments. Although the rpm-ostree command executes successfully, the removal of the kernel argument is not propagated, and in grub I can see systemd.unified_cgroup_hierarchy=0 again.
Any ideas where to look for details?

Reproduction steps

  1. sudo rpm-ostree kargs --delete=systemd.unified_cgroup_hierarchy --reboot

System details

  • VMWare instance
  • Fedora CoreOS 34.20210821.1.1
@lucab
Contributor

lucab commented Sep 16, 2021

Thanks for the report.
Is this an old installed system? 34.20210821.1.1 should already come without that kernel argument.

Can you please post the output of:

  • cat /sysroot/.coreos-aleph-version.json
  • cat /proc/cmdline
  • sudo journalctl -u rpm-ostreed.service

@betermieux
Author

Well yes, it is a node I installed in July 2020, auto-updating on the next stream (the stream shouldn't matter, because I also have nodes on stable that show the same behaviour). I have also included the log output of rpm-ostreed after executing rpm-ostree kargs.

# cat /sysroot/.coreos-aleph-version.json
{
	"build": "32.20200629.3.0",
	"ref": "fedora/x86_64/coreos/stable",
	"ostree-commit": "6df95bdb2fe2d36e091d4d18e3844fa84ce4b80ea3bd0947db5d7a286ff41890",
	"imgid": "fedora-coreos-32.20200629.3.0-qemu.x86_64.qcow2"
}

# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt1)/ostree/fedora-coreos-daacce5351564f0d1ffe9898a69968b7e62b73500667f39a3ec4042b84ce6bd6/vmlinuz-5.13.12-200.fc34.x86_64 mitigations=auto,nosmt systemd.unified_cgroup_hierarchy=0 console=tty0 console=ttyS0,115200n8 ostree=/ostree/boot.0/fedora-coreos/daacce5351564f0d1ffe9898a69968b7e62b73500667f39a3ec4042b84ce6bd6/0 ignition.platform.id=vmware

# sudo journalctl -u rpm-ostreed.service
-- Journal begins at Thu 2021-09-16 07:14:02 UTC, ends at Thu 2021-09-16 12:12:11 UTC. --
-- No entries --

# sudo rpm-ostree kargs --delete=systemd.unified_cgroup_hierarchy
Staging deployment... done
Kernel arguments updated.
Run "systemctl reboot" to start a reboot

# sudo journalctl -u rpm-ostreed.service
-- Journal begins at Thu 2021-09-16 07:14:02 UTC, ends at Thu 2021-09-16 12:14:07 UTC. --
Sep 16 12:13:54 docker1 systemd[1]: Starting rpm-ostree System Management Daemon...
Sep 16 12:13:54 docker1 rpm-ostree[324341]: Reading config file '/etc/rpm-ostreed.conf'
Sep 16 12:13:56 docker1 rpm-ostree[324341]: In idle state; will auto-exit in 60 seconds
Sep 16 12:13:56 docker1 systemd[1]: Started rpm-ostree System Management Daemon.
Sep 16 12:13:56 docker1 rpm-ostree[324341]: client(id:cli dbus:1.865 unit:session-4.scope uid:0) added; new total=1
Sep 16 12:13:57 docker1 rpm-ostree[324341]: Locked sysroot
Sep 16 12:13:57 docker1 rpm-ostree[324341]: Initiated txn KernelArgs for client(id:cli dbus:1.865 unit:session-4.scope uid:0): /org/projectatomic/rpmostree1/fedora_coreos
Sep 16 12:13:57 docker1 rpm-ostree[324341]: Process [pid: 324339 uid: 0 unit: session-4.scope] connected to transaction progress
Sep 16 12:13:58 docker1 rpm-ostree[324341]: note: Deploying commit 7ea8d089d135c027ff98e911dd43fc1886d223e6694f2cc21637645ee42eca37 which contains content in /var/lib that will be ignored.
Sep 16 12:13:59 docker1 rpm-ostree[324341]: Created new deployment /ostree/deploy/fedora-coreos/deploy/7ea8d089d135c027ff98e911dd43fc1886d223e6694f2cc21637645ee42eca37.11
Sep 16 12:13:59 docker1 rpm-ostree[324341]: sanitycheck(/usr/bin/true) successful
Sep 16 12:14:00 docker1 rpm-ostree[324341]: Txn KernelArgs on /org/projectatomic/rpmostree1/fedora_coreos successful
Sep 16 12:14:03 docker1 rpm-ostree[324341]: Unlocked sysroot
Sep 16 12:14:03 docker1 rpm-ostree[324341]: Process [pid: 324339 uid: 0 unit: session-4.scope] disconnected from transaction progress
Sep 16 12:14:04 docker1 rpm-ostree[324341]: client(id:cli dbus:1.865 unit:session-4.scope uid:0) vanished; remaining=0
Sep 16 12:14:04 docker1 rpm-ostree[324341]: In idle state; will auto-exit in 63 seconds

After rebooting, systemd.unified_cgroup_hierarchy=0 is still present in the grub menu and in /proc/cmdline.
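
To check for the argument after a reboot without reading the whole command line by eye, a small helper along these lines could be used. It is only a sketch: the function names `parse_cmdline` and `has_karg` are made up for illustration, and it splits on whitespace, so quoted values containing spaces are not handled.

```python
from typing import Dict, List, Optional


def parse_cmdline(cmdline: str) -> Dict[str, List[Optional[str]]]:
    """Map each kernel argument name to every value it appears with.

    Bare flags (no "=") are recorded with a value of None. Quoted values
    containing spaces are not handled in this sketch.
    """
    args: Dict[str, List[Optional[str]]] = {}
    for token in cmdline.split():
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = token.partition("=")
        args.setdefault(key, []).append(value if sep else None)
    return args


def has_karg(cmdline: str, name: str) -> bool:
    """Return True if `name` is present, with or without a value."""
    return name in parse_cmdline(cmdline)
```

On a live node this would be called as `has_karg(open("/proc/cmdline").read(), "systemd.unified_cgroup_hierarchy")`.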

@betermieux
Author

betermieux commented Sep 16, 2021

Maybe I found the culprit: I have specified a periodic update window later this week. The new kernel arguments are probably added to the staged version that is not yet in use. Is there any chance to force an upgrade? A regular systemctl reboot just returns to 34.20210821.1.1.

# cat /etc/zincati/config.d/55-updates-strategy.toml
[updates]
strategy = "periodic"
[[updates.periodic.window]]
days = [ "Fri", "Sat" ]
start_time = "20:00"
length_minutes = 720
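
To see why the staged update was still waiting on Thursday, the window above can be modelled roughly as follows. This is a sketch, not Zincati's implementation: the function `in_window` is hypothetical, and it assumes that the window times are interpreted in UTC and that a window starting on a listed day may run past midnight into the next day.

```python
from datetime import datetime, timedelta, timezone

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def in_window(now, days, start_time, length_minutes):
    """Return True if `now` falls inside a window starting on one of `days`."""
    hour, minute = (int(part) for part in start_time.split(":"))
    length = timedelta(minutes=length_minutes)
    # A window may have opened on an earlier day and still be running, so
    # look back over as many days as the window length can span.
    lookback_days = length_minutes // (24 * 60) + 1
    for offset in range(lookback_days + 1):
        day = now - timedelta(days=offset)
        start = day.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if DAYS[start.weekday()] in days and start <= now < start + length:
            return True
    return False


# Thursday 2021-09-16 12:14 UTC, the time of the kargs transaction in the log.
thursday = datetime(2021, 9, 16, 12, 14, tzinfo=timezone.utc)
print(in_window(thursday, ["Fri", "Sat"], "20:00", 720))
```

Under these assumptions the next opportunity for Zincati to finalize the update would start on Friday at 20:00 and last until Saturday at 08:00.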

# rpm-ostree status
State: idle
AutomaticUpdatesDriver: Zincati
  DriverState: active; update staged: 34.20210904.1.0; reboot pending due to update strategy
Deployments:
  fedora:fedora/x86_64/coreos/next
                   Version: 34.20210904.1.0 (2021-09-07T00:13:14Z)
                    Commit: 7ea8d089d135c027ff98e911dd43fc1886d223e6694f2cc21637645ee42eca37
              GPGSignature: Valid signature by 8C5BA6990BDB26E19F2A1A801161AE6945719A39
                      Diff: 34 upgraded, 1 removed

* fedora:fedora/x86_64/coreos/next
                   Version: 34.20210821.1.1 (2021-08-24T03:31:02Z)
                    Commit: 55e40560b1a4008f8d4fd70eb73d65cc834ca03a985fb380e76c9ae2c2c459fa
              GPGSignature: Valid signature by 8C5BA6990BDB26E19F2A1A801161AE6945719A39

  fedora:fedora/x86_64/coreos/next
                   Version: 34.20210808.1.0 (2021-08-09T22:57:42Z)
                    Commit: 966e80e2789383d8403b46d77c449e1c858e7116e60a6c49da9f84e1c9a95f4d
              GPGSignature: Valid signature by 8C5BA6990BDB26E19F2A1A801161AE6945719A39

# rpm-ostree upgrade --bypass-driver -r
2 metadata, 0 content objects fetched; 788 B transferred in 1 seconds; 0 bytes content written
No upgrade available.
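
The same information can be checked from a script before touching kargs. The sketch below assumes the output of `rpm-ostree status --json` has already been parsed into a dictionary, and that each entry under `deployments` carries `staged`, `booted` and `version` keys; those key names are an assumption here and should be verified against the installed rpm-ostree version. The function name `pending_version` is made up for illustration.

```python
from typing import Any, Dict, Optional


def pending_version(status: Dict[str, Any]) -> Optional[str]:
    """Return the version of a staged, not-yet-booted deployment, or None."""
    for deployment in status.get("deployments", []):
        # Key names are assumed; see the note above the code.
        if deployment.get("staged") and not deployment.get("booted"):
            return deployment.get("version")
    return None


# Hand-written sample mirroring the status output shown above.
sample_status = {
    "deployments": [
        {"version": "34.20210904.1.0", "staged": True, "booted": False},
        {"version": "34.20210821.1.1", "staged": False, "booted": True},
        {"version": "34.20210808.1.0", "staged": False, "booted": False},
    ]
}
print(pending_version(sample_status))
```

If this returns a version, a kargs change made at that point would presumably end up on that pending deployment rather than on the booted one.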

@lucab
Contributor

lucab commented Sep 16, 2021

Not directly yet, but:

# echo 'updates.strategy = "immediate"' > /run/zincati/config.d/99-finalize-once-out-of-maintenance-window.toml
# systemctl restart zincati.service

Will perform an immediate update, once.

Though I'm still not clear about what happens to the new/updated kargs. Possibly they get applied to the staged update, which is however discarded by the manual reboot?

@lucab
Contributor

lucab commented Sep 16, 2021

If the above guess is correct, a more reliable way to tweak kargs and immediately apply changes would be:

# systemctl stop zincati.service
# rpm-ostree cleanup -p
# rpm-ostree kargs --delete=systemd.unified_cgroup_hierarchy --reboot

Although if that is the case, we should look for a way to improve the UX of this.
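
For anyone who needs to repeat these three steps on several nodes, they could be wrapped as below. This is a minimal sketch: the function `remove_karg_now` is hypothetical, it only uses the commands quoted above, and it defaults to a dry run that prints the commands instead of executing them. With `dry_run=False` it must run as root and will reboot the machine.

```python
import subprocess
from typing import List


def remove_karg_now(karg: str, dry_run: bool = True) -> List[List[str]]:
    """Stop Zincati, drop the pending deployment, then delete `karg` and reboot."""
    commands = [
        ["systemctl", "stop", "zincati.service"],
        ["rpm-ostree", "cleanup", "-p"],
        ["rpm-ostree", "kargs", f"--delete={karg}", "--reboot"],
    ]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True aborts the sequence if any step fails, so kargs is
            # never changed while a pending deployment is still present.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `remove_karg_now("systemd.unified_cgroup_hierarchy")` prints the three commands in order without running them.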

@betermieux
Author

OK, now it works. rpm-ostree kargs changed the kernel arguments of the staged update (so it had probably been working all along).
After letting zincati upgrade the node with the immediate strategy, the new kernel arguments are used. I wouldn't fix anything, but maybe you should output a warning if rpm-ostree kargs changes the kernel arguments of a deployment that is not yet active.

@lucab
Contributor

lucab commented Sep 16, 2021

I suspect we can improve the UX of this in two directions:

  • if rpm-ostree kargs is invoked without --reboot and there is an update driver registered, do not suggest systemctl reboot to the user (but point to the driver instead)
  • if rpm-ostree kargs is invoked with --reboot but there is an update driver registered, require --bypass-driver too
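
Read as a decision table, the two suggestions could look like the sketch below. It is purely illustrative: the function name and the returned strings are made up, this is not rpm-ostree code, and a `--bypass-driver` option for `rpm-ostree kargs` is part of the proposal rather than an existing flag.

```python
from typing import Optional


def kargs_followup(reboot: bool, driver: Optional[str], bypass_driver: bool = False) -> str:
    """Describe what the CLI would do after staging new kernel arguments."""
    if driver is None:
        # No update driver registered: behaviour stays as it is today.
        return "reboot now" if reboot else 'suggest "systemctl reboot"'
    if not reboot:
        # First suggestion: point to the driver instead of systemctl reboot.
        return f"point to {driver} for finalization"
    if not bypass_driver:
        # Second suggestion: --reboot alone is refused while a driver is active.
        return "refuse: --reboot requires --bypass-driver"
    return "reboot now"
```

For the situation in this issue, `kargs_followup(reboot=False, driver="Zincati")` yields the hint to go through Zincati rather than the `systemctl reboot` suggestion that was printed.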

@lucab lucab reopened this Sep 16, 2021
@lucab lucab changed the title from "Changed kernel arguments not propagated to next boot" to "Surprising behavior trying to manually change kargs while an update is already queued" Sep 16, 2021
@betermieux
Author

I agree with your first case, but I had tested rpm-ostree kargs --reboot, which reboots the system immediately. To my knowledge, --bypass-driver is only used for rpm-ostree upgrade, not for rpm-ostree kargs. You will always have to point to the update driver if an update is pending.
