Update NTP makestep for qemu #1431

Closed
xordspar0 opened this issue Feb 27, 2023 · 2 comments

Comments

@xordspar0

In a6ed7b3a1ebe4a97febe3dbfab88222fc5c42f76, the NTP configuration for FCOS VM images running on cloud hosts was updated so that Chrony steps the system clock to NTP time immediately instead of adjusting it gradually over a long period. That change came out of a bug report about a problem that affects a variety of VM environments, not just cloud deployments.

At a minimum, the clock getting out of sync affects Podman's qemu VMs running on laptops that occasionally go to sleep, as detailed in this bug report: containers/podman#11541

Should we make the same change for all qemu images, something like this, in coreos-platform-chrony?

  platform=$(karg ignition.platform.id)
  case "${platform}" in
-     azure|azurestack|aws|gcp) ;;  # OK, this is a platform we know how to support
+     azure|azurestack|aws|gcp|qemu) ;;  # OK, this is a platform we know how to support
      *) exit 0 ;;
  esac
  
  ...
  
  (echo "# Generated by $self - do not edit directly"
   sed -e s,'^makestep,#makestep,' -e s,'^pool,#pool,' -e s,'^leapsectz,#leapsectz,' < /etc/chrony.conf
  cat <<EOF
  
  # Allow the system clock step on any clock update.
  # It will avoid the time resynchronization issue when VMs are resumed from suspend.
  # See https://bugzilla.redhat.com/show_bug.cgi?id=1780165 for more information.
  makestep 1.0 -1
  
  EOF
  ) > "${confpath}"
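
To make the effect of the snippet above concrete, here is a minimal sketch that applies the same sed expressions to a sample config. The function names `rewrite_chrony_conf` and `sample_conf` are hypothetical, and the sample input is only assumed to resemble a stock chrony.conf; the real generator reads /etc/chrony.conf and writes to `${confpath}`.

```shell
#!/bin/bash
# Hypothetical helper: same sed expressions as the generator snippet, but
# reading the config from stdin instead of /etc/chrony.conf.
rewrite_chrony_conf() {
    # Comment out any existing makestep, pool, and leapsectz directives.
    sed -e 's,^makestep,#makestep,' -e 's,^pool,#pool,' -e 's,^leapsectz,#leapsectz,'
    # Append the replacement: step when the offset exceeds 1.0 s, with no
    # limit (-1) on how many clock updates may be stepped.
    cat <<'EOF'

# Allow the system clock step on any clock update.
makestep 1.0 -1
EOF
}

# Sample input (an assumption, not copied from a real FCOS image).
sample_conf() {
    cat <<'EOF'
pool 2.fedora.pool.ntp.org iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
leapsectz right/UTC
EOF
}

sample_conf | rewrite_chrony_conf
```

In the output, the original `makestep 1.0 3` line survives only as a comment, and the appended `makestep 1.0 -1` is the single active makestep directive. Because the sketch reuses the snippet's sed expressions unchanged, it also comments out `pool`, which the elided part of the generator presumably compensates for on the cloud platforms.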

It's not clear to me why qemu's default RTC setting of host doesn't cover this issue, but the fact is that it doesn't (according to my experience, the experience of the people reporting the Podman bug, and others), and NTP seems to be the only reliable way to keep a FCOS VM's clock in sync.

I can't say with certainty that this is the right choice for all VMs, or even all qemu VMs, but it makes sense to me that it should be the default for VMs running on a laptop and possibly in other cases.

jlebon transferred this issue from coreos/fedora-coreos-config on Feb 27, 2023
@jlebon
Member

jlebon commented Feb 27, 2023

> I can't say with certainty that this is the right choice for all VMs, or even all qemu VMs, but it makes sense to me that it should be the default for VMs running on a laptop and possibly in other cases.

Right. The problem is that QEMU as a platform can be used in many different contexts, and there's no easy way to tell from within the guest which context applies (e.g. a developer's laptop vs. production). The RHBZ linked in the generator mentions that allowing steps all the time could cause compatibility and security issues (see https://bugzilla.redhat.com/show_bug.cgi?id=1780165#c6). The platforms where we currently enable this have cloud-managed endpoints that we've agreed to trust.
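
As a rough illustration of that point, the sketch below parses `ignition.platform.id` out of two made-up kernel command lines. `karg_from` is a hypothetical stand-in for the generator's `karg` helper (whose real implementation isn't shown in this thread), and `stepping_enabled` simply mirrors the proposed case statement.

```shell
#!/bin/bash
# Hypothetical stand-in for the generator's karg helper: print the value of
# kernel argument $2 as found in the command-line string $1 (last one wins).
karg_from() {
    local cmdline="$1"
    local name="$2"
    local arg value=""
    for arg in $cmdline; do   # intentionally unquoted: split on whitespace
        case "$arg" in
            "$name"=*) value=${arg#"$name"=} ;;
        esac
    done
    echo "$value"
}

# Mirror of the proposed case statement: succeeds if stepping would be enabled.
stepping_enabled() {
    case "$1" in
        azure|azurestack|aws|gcp|qemu) return 0 ;;
        *) return 1 ;;
    esac
}

# Two invented command lines: one for a laptop VM, one for a production host.
laptop='BOOT_IMAGE=/vmlinuz ignition.platform.id=qemu console=ttyS0'
server='BOOT_IMAGE=/vmlinuz ignition.platform.id=qemu console=tty0 ro'

for cmdline in "$laptop" "$server"; do
    platform=$(karg_from "$cmdline" ignition.platform.id)
    if stepping_enabled "$platform"; then
        echo "$platform: stepping enabled"
    fi
done
```

Both command lines yield the same platform ID, so a check keyed on that ID alone would treat the laptop VM and the production VM identically.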

> It's not clear to me why qemu's default RTC setting of host doesn't cover this issue, but the fact is that it doesn't (according to my experience, the experience of the people reporting the Podman bug, and others)

That's interesting. That user post got no replies on the QEMU list, but it sounds like there may be a bug there. Using ptp_kvm would be another way to fix this (related: coreos/fedora-coreos-config#2263), but may not be available on ARM. The easiest workaround for podman machines would probably be for podman to enable stepping at provisioning time like we do on those cloud platforms, assuming the related concerns are deemed acceptable.
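
For the ptp_kvm route, a guest would first need to know whether such a clock is present at all. The following is a minimal sketch, assuming PTP clocks appear under /sys/class/ptp with a `clock_name` file and that the ptp_kvm driver's name contains "KVM"; the function name `find_kvm_ptp` is hypothetical, and the directory is a parameter only so the logic can be exercised against a fake tree.

```shell
#!/bin/bash
# Print the device path of the first PTP clock whose clock_name mentions KVM.
# $1 is the sysfs PTP class directory (defaults to /sys/class/ptp).
find_kvm_ptp() {
    local class_dir="${1:-/sys/class/ptp}"
    local dev
    for dev in "$class_dir"/ptp*; do
        # Skips non-matching globs and entries without a readable name.
        [ -r "$dev/clock_name" ] || continue
        if grep -qi 'kvm' "$dev/clock_name"; then
            echo "/dev/${dev##*/}"
            return 0
        fi
    done
    return 1
}

if ptp_dev=$(find_kvm_ptp); then
    echo "KVM PTP clock found at $ptp_dev"
else
    echo "no KVM PTP clock found"
fi
```

Chrony can read a PTP hardware clock through its `refclock PHC` directive; the options to pair with it are left out here because they would be guesses rather than anything stated in this thread.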

@xordspar0
Author

xordspar0 commented Feb 27, 2023

> The linked RHBZ in the generator mentions that there could be compatibility and security issues with allowing steps all the time

Understood.

> That's interesting. That user post got no replies on the QEMU list, but it sounds like there may be a bug there.

Yes, this is an XKCD #979 moment. It would be really convenient if the hardware clock worked reliably; then customizing NTP settings for this use case wouldn't be necessary.

> Using ptp_kvm would be another way to fix this (related: coreos/fedora-coreos-config#2263)

Ha, it's funny that someone else proposed a very similar change to solve a different problem at around the same time.

> The easiest workaround for podman machines would probably be for podman to enable stepping at provisioning time like we do on those cloud platforms, assuming the related concerns are deemed acceptable.

I agree, I'll bring this up with Podman.
