Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add VM support #216

Merged
merged 6 commits into from
Mar 16, 2022
Merged

Add VM support #216

merged 6 commits into from
Mar 16, 2022

Conversation

cpaelzer
Copy link
Contributor

@cpaelzer cpaelzer commented Jul 9, 2021

Hi,
Needrestart is awesome in detecting long running services or user sessions that are still running with old binaries. But there is another common kind of long-running process - VM Guests.

On one hand this branch adds a bit of protection by greylisting common virtualization services which restart would break running guests.

On the other hand it adds detection and reporting of outdated VMs.
Here an example of a system running a KVM guest in libvirt:

# virsh list
 Id   Name          State
-----------------------------
 4    i-testguest   running

And on package (re)install with a depending library libspice-server1 we see the detection and reporting in action:

root@i-kvm:~# sudo apt install --reinstall libspice-server1 libvirt0
...
Setting up libspice-server1:amd64 (0.14.3-2ubuntu3) ...
Setting up libvirt0:amd64 (7.4.0-0ubuntu1~impishppa9) ...
Processing triggers for libc-bin (2.33-0ubuntu7) ...
Scanning processes...                                                                                                                                                                          
Scanning candidates...                                                                                                                                                                         

Restarting services...
Service restarts being deferred:
 systemctl restart libvirtd.service

No containers need to be restarted.

No user sessions are running outdated binaries.

VM guests running outdated binaries:
 VM 'i-testguest' with pid 30693

The same in verbose/debug mode:

root@i-kvm:~# needrestart -v -l
[main] eval /etc/needrestart/needrestart.conf
[main] eval /etc/needrestart/conf.d/virtlogd.conf
[main] needrestart v3.5
[main] running in root mode
[Core] Using UI 'NeedRestart::UI::stdio'...
[main] systemd detected
[main] container detected
[Core] #194 is a NeedRestart::Interp::Python
[Python] #194: source=/usr/bin/networkd-dispatcher
[Core] #241 is a NeedRestart::Interp::Python
[Python] #241: source=/usr/share/unattended-upgrades/unattended-upgrade-shutdown
[main] #24611 uses deleted /usr/lib/x86_64-linux-gnu/libvirt-qemu.so.0.7004.0
[main] #24611 is not a child
[main] #25779 uses deleted /usr/lib/x86_64-linux-gnu/libvirt.so.0.7004.0
[main] #25779 is not a child
[main] #30693 uses deleted /usr/lib/x86_64-linux-gnu/libspice-server.so.1.14.0
[main] #30693 is not a child
[main] #24611 exe => /usr/sbin/libvirtd
[main] #24611 is libvirtd.service
[main] #25779 exe => /usr/sbin/virtlogd
[main] #25779 is virtlogd.service
[main] #30693 exe => /usr/bin/qemu-system-x86_64
[main] #30693 detected as VM guest 'i-testguest' in group '/machine.slice/machine-qemu\x2d4\x2di\x2dtestguest.scope'
[main] virtlogd.service is blacklisted -> ignored
[main] inside container, skipping kernel checks
[main] inside container or vm, skipping microcode checks

Restarting services...
Services to be restarted:
Restart «libvirtd.service»? [yNas?] 
Service restarts being deferred:
 systemctl restart libvirtd.service

No containers need to be restarted.

No user sessions are running outdated binaries.

VM guests running outdated binaries:
 VM 'i-testguest' with pid 30693

And in easy mode we get:
# needrestart -l -m e => "This system runs outdated binaries and outdated VM guests - you should consider rebooting! "

Right now this only detects the "common" way of a libvirt/qemu setup for guests.
We could move it out of the session checks and try to parse ANY qemu processes if you prefer that.

Furthermore I had no idea what nagios would need/expect and no way to test it so this feature has no nagios support for now.

Please consider this overall as an RFC as I'd need your preference (do we want all qemu processes) and guidance (nagios).
Once discussions happened and this becomes more polished and tested we can un-tag the RFC label and hopefully consider it for inclusion.

@cpaelzer
Copy link
Contributor Author

… which doesn’t seem to provide much value.

Thanks for the feedback @wgrant ,
I'm unsure about what the right default would be - but in general a feature that suppresses output if "nothing needs updates" could help. Once it exists we could debate if that should be the default or opt-in.

There is a "quiet" flag -q already, but that isn't really helpful as-is IMHO.
From the normal output of

$ needrestart -f noninteractive
Scanning processes...
Scanning candidates...

Services to be restarted:

Service restarts being deferred:
 systemctl restart libvirtd.service

No containers need to be restarted.

No user sessions are running outdated binaries.

VM guests running outdated binaries:
 VM 'i-testguest' with pid 306

With -q we'd in this case get only:

$ needrestart -q -f noninteractive
 systemctl restart libvirtd.service

This is IMHO misleading at best, as it missed the header you don't know in the apt-output if it meas that it did a restart or suggests one (in this case it is the latter).
Therefore I'm not suggesting to use -q.

It already has quiet and verbose mode.
I'm not sure if a somewhat quiet mode would really be wanted by upstream.
For me (personally) the extra output never bothered me, in fact I appreciated to
know that all is fine. But I agree that we should have a way to opt-out.

This can be done already on various levels.
a) via config by modifying (or dpkg-diverting) /etc/apt/apt.conf.d/99needrestart.
b) you can also supress it per call sudo NEEDRESTART_SUSPEND=1 apt ...
c) Or the same as (b) but via setting the same in the profile/environment of your choice.

@cpaelzer
Copy link
Contributor Author

There’s also a long delay after the “No user sessions” line.

This one is more important IMHO - thanks @wgrant

At first I need to clarify that it seems that when calling needrestart
itself it doesn't consume that time.
Also if needrestart is suspended and not even called (see above), it still has
the very same delay.

I can only assume, but think that it is other things part of the apt
transaction and since needrestart had the last message one thinks it
is the one spending the time.

I had a look what is running at that time and the time consuming bit is
actually /usr/lib/update-notifier/apt-check --human-readable from
/etc/apt/apt.conf.d/99update-notifier.

This is odd, IIRC we had a bunch of activities to not slow down login
when similar checks ran at e.g. ssh connect. I wonder if over the years this
now became stalling apt - IMHO it could as well run in background without
and problem. I need to check if this was tried and revoked for some reason.

I'll file a bug in update-notifier at some point and subscribe you ...

@liske
Copy link
Owner

liske commented Sep 18, 2021

Thanks for the PR. Would it also work with Xen VMs?

I still need to take a closer look ;-)

@liske liske added this to the v3.6 milestone Sep 18, 2021
needrestart Outdated Show resolved Hide resolved
@cpaelzer
Copy link
Contributor Author

Thanks for the PR. Would it also work with Xen VMs?

This is based on how libvirt puts guests into namespaces.
I have no Xen system to test, but if the same applies to libvirt driven xen using qemu and guest= argument then it might even work. But tested and designed it is for KVM.

Copy link
Owner

@liske liske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm almost fine with this PR, just a minor output issue.

Thanks!

needrestart Outdated Show resolved Hide resolved
needrestart Outdated Show resolved Hide resolved
needrestart Show resolved Hide resolved
Restarting virtlogd / virtlockd is really bad as guests would be fenced
and restarted. This is intentionally skipped on the package upgrade [1]
and tried to be the last that dies by upstream [2][3].

We should at least not preselect those services to avoid these issues,
therefore add them to the default override config.

[1]: https://salsa.debian.org/libvirt-team/libvirt/-/blob/debian/master/debian/rules#L254
[2]: https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/logging/virtlogd.service.in;h=8ab547851772298ff570487f1ce2d3680c7e4d68;hb=HEAD#l13
[3]: https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/locking/virtlockd.service.in;h=4a6fab05ab309a04a8565a5c73cbcfb47dae8d5d;hb=HEAD#l13

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Needrestart is awesome in detecting long running services or user
sessions that are still running with old binaries. But there is
another common kind of long-running process - VM Guests.

Detecting the common qemu/KVM guests and reporting those to the
user will help them to realize that also those guests should be
considered to be restarted (or migrated onto an updated host)

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
As discussed in the PR this shal avoid misunderstandings that we might
scan inside the guest.

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
@liske liske merged commit 4e3c2b4 into liske:master Mar 16, 2022
@liske
Copy link
Owner

liske commented Mar 16, 2022

Thank you for your contribution!

@cpaelzer
Copy link
Contributor Author

Thank you for your contribution!

No problem, thanks for your patience with my low perl skill!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants