Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"systemctl status" Regression when running libvirt VM #22089

Closed
Toolybird opened this issue Jan 12, 2022 · 7 comments · Fixed by #22095
Closed

"systemctl status" Regression when running libvirt VM #22089

Toolybird opened this issue Jan 12, 2022 · 7 comments · Fixed by #22095
Labels
pid1 regression ⚠️ A bug in something that used to work correctly and broke through some recent commit
Milestone

Comments

@Toolybird
Copy link

systemd version the issue has been seen with

systemd 250 (250.2-2-arch)

Used distribution

Arch

Linux kernel version used (uname -a)

Linux ryzen 5.15.13-arch1-1 #1 SMP PREEMPT Wed, 05 Jan 2022 16:20:59 +0000 x86_64 GNU/Linux

CPU architecture issue was seen on

x86_64

Expected behaviour you didn't see

normal output from "systemctl status"

Unexpected behaviour you saw

systemctl status
Failed to dump process list for 'ryzen', ignoring: Operation not supported

Steps to reproduce the problem

fire up any libvirt VM
run systemctl status

Not sure if this is a libvirt or a systemd issue. It only started as of systemd-2.50.

Obvious suspect:
#21172

Arch Forum report:
https://bbs.archlinux.org/viewtopic.php?id=272962

Thanks

@mrc0mmand mrc0mmand added the regression ⚠️ A bug in something that used to work correctly and broke through some recent commit label Jan 12, 2022
@mrc0mmand
Copy link
Member

Can reproduce this locally with systemd 250-3 as well.

[root@arch ~]# virsh list
 Id   Name           State
------------------------------
 1    root_default   running
[root@arch ~]# SYSTEMD_LOG_LEVEL=debug systemctl status --no-pager
Bus n/a: changing state UNSET → OPENING
sd-bus: starting bus by connecting to /run/systemd/private...
Bus n/a: changing state OPENING → AUTHENTICATING
Bus n/a: changing state AUTHENTICATING → RUNNING
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.DBus.Properties member=GetAll cookie=1 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Got message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 signature=a{sv} error-name=n/a error-message=n/a
● arch.localdomain
    State: running
     Jobs: 0 queued
   Failed: 0 units
    Since: Wed 2022-01-12 10:28:41 UTC; 1h 0min ago
   CGroup: /
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=GetUnitProcesses cookie=2 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Got message type=error sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=2 reply_cookie=2 signature=s error-name=org.freedesktop.DBus.Error.NotSupported error-message=Operation not supported
Failed to dump process list for 'arch.localdomain', ignoring: Operation not supported
Bus n/a: changing state RUNNING → CLOSED

@mrc0mmand
Copy link
Member

Reproducer without systemctl:

[root@arch ~]# SYSTEMD_LOG_LEVEL=debug busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager GetUnitProcesses "s" -- "-.slice"
Bus n/a: changing state UNSET → OPENING
sd-bus: starting bus by connecting to /run/dbus/system_bus_socket...
Bus n/a: changing state OPENING → AUTHENTICATING
Bus n/a: changing state AUTHENTICATING → HELLO
Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=Hello cookie=1 reply_cookie=0 signature=n/a error-name=n/a error-message=n/a
Got message type=method_return sender=org.freedesktop.DBus destination=:1.120 path=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 signature=s error-name=n/a error-message=n/a
Bus n/a: changing state HELLO → RUNNING
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=GetUnitProcesses cookie=2 reply_cookie=0 signature=s error-name=n/a error-message=n/a
Got message type=error sender=:1.1 destination=:1.120 path=n/a interface=n/a member=n/a cookie=8414 reply_cookie=2 signature=s error-name=org.freedesktop.DBus.Error.NotSupported error-message=Operation not supported
Call failed: Operation not supported
Bus n/a: changing state RUNNING → CLOSED

@mrc0mmand mrc0mmand added the pid1 label Jan 12, 2022
@mrc0mmand
Copy link
Member

mrc0mmand commented Jan 12, 2022

So, the problematic commit appears to be 038cae0. And the root cause is that GetUnitProcesses is not a valid operation for the root slice (-.slice) when a libvirt VM is running:

# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager GetUnitProcesses "s" --  "-.slice"
Call failed: Operation not supported
# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager GetUnitProcesses "s" --  "system.slice"
a(sus) 19 "/system.slice/systemd-networkd.service" 251  ...
# vagrant halt
# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager GetUnitProcesses "s" --  "-.slice"
a(sus) 29 "/user.slice/user-1000.slice/user@1000.service/init.scope"

/cc @poettering @keszybz

@mrc0mmand
Copy link
Member

Ah, it's not -.slice, but machine.slice:

# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager GetUnitProcesses "s" --  "machine.slice"
Call failed: Operation not supported

@mrc0mmand
Copy link
Member

A bit more, possibly interesting stuff:

# systemd-cgls -a  -u machine.slice
Unit machine.slice (/machine.slice):
└─machine-qemu\x2d3\x2drootdefault.scope … (#2876)
  → trusted.delegate: 1
  → trusted.invocation_id: d436daf25abd4597979168389893713b
  └─libvirt (#2928)
    ├─1507 /usr/bin/qemu-system-x86_64 -name guest=root_default,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-3-root_default/master-key>
    ├─vcpu1 (#3033)
    ├─vcpu0 (#3005)
    └─emulator (#2977)
# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager GetUnitProcesses "s" --  "machine-qemu\x2d3\x2drootdefault.scope"
Call failed: Operation not supported
# systemctl cat "machine-qemu\x2d3\x2drootdefault.scope"
# /run/systemd/transient/machine-qemu\x2d3\x2drootdefault.scope
# This is a transient unit file, created programmatically via the systemd API. Do not edit.
[Scope]
Slice=machine.slice

[Unit]
Description=Virtual Machine qemu-3-rootdefault

[Scope]
Delegate=yes

[Unit]
CollectMode=inactive-or-failed

[Scope]
TasksMax=16384
Slice=machine.slice

[Unit]
After=libvirtd.service
Before=virt-guest-shutdown.target

@mrc0mmand
Copy link
Member

And to be even more specific:

openat(AT_FDCWD, "/sys/fs/cgroup/machine.slice/machine-qemu\\x2d3\\x2drootdefault.scope/cgroup.procs", O_RDONLY|O_CLOEXEC) = 46
newfstatat(46, "", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_EMPTY_PATH) = 0
read(46, "", 4096)                      = 0
openat(AT_FDCWD, "/sys/fs/cgroup/machine.slice/machine-qemu\\x2d3\\x2drootdefault.scope", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 51
newfstatat(51, "", {st_mode=S_IFDIR|0755, st_size=0, ...}, AT_EMPTY_PATH) = 0
getdents64(51, 0x565141e6fa80 /* 54 entries */, 32768) = 1960
openat(AT_FDCWD, "/sys/fs/cgroup/machine.slice/machine-qemu\\x2d3\\x2drootdefault.scope/libvirt/cgroup.procs", O_RDONLY|O_CLOEXEC) = 54
newfstatat(54, "", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_EMPTY_PATH) = 0
read(54, "1507\n", 4096)                = 5
openat(AT_FDCWD, "/proc/1507/stat", O_RDONLY|O_CLOEXEC) = 57
newfstatat(57, "", {st_mode=S_IFREG|0444, st_size=0, ...}, AT_EMPTY_PATH) = 0
read(57, "1507 (qemu-system-x86) S 1 1506 "..., 1024) = 338
ioctl(57, TCGETS, 0x7fff81c78510)       = -1 ENOTTY (Inappropriate ioctl for device)
read(57, "", 1024)                      = 0
close(57)                               = 0
openat(AT_FDCWD, "/proc/1507/cmdline", O_RDONLY|O_NOCTTY|O_CLOEXEC) = 57
newfstatat(57, "", {st_mode=S_IFREG|0444, st_size=0, ...}, AT_EMPTY_PATH) = 0
read(57, "/usr/bin/qemu-system-x86_64\0-nam"..., 4104) = 2671
close(57)                               = 0
read(54, "", 4096)                      = 0
openat(AT_FDCWD, "/sys/fs/cgroup/machine.slice/machine-qemu\\x2d3\\x2drootdefault.scope/libvirt", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 57
newfstatat(57, "", {st_mode=S_IFDIR|0755, st_size=0, ...}, AT_EMPTY_PATH) = 0
getdents64(57, 0x565141e8b9b0 /* 53 entries */, 32768) = 1928
openat(AT_FDCWD, "/sys/fs/cgroup/machine.slice/machine-qemu\\x2d3\\x2drootdefault.scope/libvirt/vcpu1/cgroup.procs", O_RDONLY|O_CLOEXEC) = 58
newfstatat(58, "", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_EMPTY_PATH) = 0
read(58, 0x565141e8a430, 4096)          = -1 EOPNOTSUPP (Operation not supported)

@yuwata yuwata added this to the v251 milestone Jan 12, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Jan 12, 2022
The function append_cgroup() is called recursively, and some subcgroup
may not be read.

Fixes systemd#22089.
@yuwata
Copy link
Member

yuwata commented Jan 12, 2022

Hopefully fixed by #22095.

yuwata added a commit to yuwata/systemd that referenced this issue Jan 12, 2022
The function is called in recursion, and cgroup.procs in some subcgroups
may not be read.

Fixes systemd#22089.
keszybz pushed a commit that referenced this issue Jan 18, 2022
The function is called in recursion, and cgroup.procs in some subcgroups
may not be read.

Fixes #22089.
codepeon pushed a commit to codepeon/systemd that referenced this issue Jan 25, 2022
The function is called in recursion, and cgroup.procs in some subcgroups
may not be read.

Fixes systemd#22089.

(cherry picked from commit 1fb5040)
Werkov pushed a commit to Werkov/systemd that referenced this issue Nov 1, 2023
The function is called in recursion, and cgroup.procs in some subcgroups
may not be read.

Fixes systemd#22089.

(cherry picked from commit 1fb5040)
(cherry picked from commit 1b003bb)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pid1 regression ⚠️ A bug in something that used to work correctly and broke through some recent commit
Development

Successfully merging a pull request may close this issue.

3 participants