overlay: Add 21dhcp-chrony to set default DHCP NTP settings #412

Merged
merged 3 commits into from
Aug 21, 2020

Conversation

rfairley
Contributor

@rfairley rfairley commented May 20, 2020

See commit message for details.

@cgwalters
Member

As I said in the BZ I believe this applies to not just all of Fedora/RHEL, but any distribution combining NetworkManager and chrony and using the internal dhcp client in NM. I think probably this script should at least land in chrony upstream git, and we could create a chrony-networkmanager subpackage for the script or so.

Another option is for this to be directly built into the NM sources - dynamically detect if chrony is running and if so talk to it.

@cgwalters
Member

To be clear I'm opposed to landing this here first, but it seems like it shouldn't be too hard to land in e.g. https://git.tuxfamily.org/chrony/chrony.git/tree/contrib

@rfairley rfairley changed the title overlay: Add 21dhcp-chrony to set default DHCP NTP settings [WIP] overlay: Add 21dhcp-chrony to set default DHCP NTP settings May 20, 2020
@rfairley rfairley changed the title [WIP] overlay: Add 21dhcp-chrony to set default DHCP NTP settings [WIP, hold] overlay: Add 21dhcp-chrony to set default DHCP NTP settings May 20, 2020
@miabbott
Member

@rfairley
Contributor Author

rfairley commented May 22, 2020

👍 - aiming to upstream the chrony-helper script per https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-dev/2020/05/msg00020.html (other option would be patching just the Fedora and RHEL RPMs with https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-dev/2020/05/msg00023.html, but it doesn't seem too out of reach to upstream chrony-helper too).

Another option for the short term is installing dhcp-client suggested in https://bugzilla.redhat.com/show_bug.cgi?id=1800901#c20, which I have tested in FCOS, and this does fix the NM-chrony integration.

Will focus on getting the upstream patch approved first, then we could either carry the patch in an overlay, or install dhcp-client temporarily, while the upstream patch makes it downstream to Fedora and RHEL.

@cgwalters
Member

I'm totally fine carrying this work in this repo while we wait for the upstream bits to land and percolate!
(Though a tricky detail with that is ensuring the right thing happens if we have a hook in this repo and the hook lands in one of the packages)

> Another option for the short term is installing dhcp-client suggested in https://bugzilla.redhat.com/show_bug.cgi?id=1800901#c20, which I have tested in FCOS, and this does fix the NM-chrony integration.

Yeah, the NM developers also have had hesitation about the internal dhcp client because it's code copied from systemd, see this subthread https://bugzilla.redhat.com/show_bug.cgi?id=1204226#c13

But...man at one point I also had occasion to read the dhclient code and...it's not the most beautiful. I'd take systemd code over that any day - and part of that I think is pushing for NM and systemd-networkd share more code. (Of course what we really want is a nice clean dhcp library/process in Rust 😉 )

@rfairley
Contributor Author

> (Though a tricky detail with that is ensuring the right thing happens if we have a hook in this repo and the hook lands in one of the packages)

I think this should be handled okay - from local testing, the rootfs override in COSA simply overwrote the packaged files - I think overlay.d goes through the same flow and will do the same? Will check this - if this is the case it'd be fine carrying the identical files as upstream. I wonder whether carrying patches against the packaged files instead would help with this kind of situation: a patch that no longer applies cleanly would give an error, alerting us during a build that there is a problem.

But agreed - we do want to make sure to remove this shortly after upstream patches land (maybe a note in the overlay.d README will be visible enough for now).

I wonder if some general automation/notification would be useful for detecting when patches have made it downstream too (I remember some mention of automation for notifying when bodhi updates for lockfile overrides land). Difficult part would be detecting where a patch is, would probably just require checking for new upstream changes (e.g. a bot that subscribes to the chrony dist-git repo and notifies if upstream updated).

> > Another option for the short term is installing dhcp-client suggested in https://bugzilla.redhat.com/show_bug.cgi?id=1800901#c20, which I have tested in FCOS, and this does fix the NM-chrony integration.
>
> Yeah, the NM developers also have had hesitation about the internal dhcp client because it's code copied from systemd, see this subthread https://bugzilla.redhat.com/show_bug.cgi?id=1204226#c13

Thank you for the extra context!

Just to clarify the option I mentioned - installing the dhcp-client RPM alone would bring in the 11-dhclient NM dispatch script - and that'd be enough to make the integration work. We'd have NM running dhclient's dispatcher script while still using the internal DHCP client - so no need to change the NM config at all if we do go the route of including dhcp-client. The upstream patch also has a check on whether dhclient is installed, and would not operate at the same time as the integration coming from 11-dhclient - so it'd be safe to carry the two at the same time.

> (Of course what we really want is a nice clean dhcp library/process in Rust 😉 )

Heh - I see the dhcp crate is available :)

Overall, will finish testing the upstream patch, and will update this PR with the same patch to have an updated copy here ready.

@rfairley rfairley force-pushed the rfairley-21dhcp-chrony branch from fe31be7 to 764efd6 Compare May 25, 2020 14:21
@rfairley
Contributor Author

Updated with latest changes which have been tested with FCOS - I have received feedback on the ML since proposing these changes, and will be revising.

@rfairley rfairley force-pushed the rfairley-21dhcp-chrony branch from 764efd6 to 92d6c37 Compare May 26, 2020 14:07
@rfairley
Contributor Author

rfairley commented May 26, 2020

Saw error: Hardlinking 93/f54744641341385d6e0a6108aae5ca3e65bff97df88a7090e00d973886b2ed.file to chrony-helper: File exists in CI - we will need to carry separate coreos--prefixed files for this rather than overwrite existing files.

Updated this PR with the latest changes proposed to upstream, with an additional edit to prevent running if the upstream changes land while this is overlaid.

The two files can be generated by:

  • checking out the commit: https://github.com/rfairley/chrony/tree/41ca820757fe1c7066f07130b8d5ebe3511499ac
  • running:
    #!/bin/bash
    
    # Substitute the build-time placeholders in each examples/*.in template.
    for infile in examples/*.in; do
        sed -e 's,@CHRONYC@,/usr/bin/chronyc,g' \
            -e 's,@CHRONY_CONF@,/etc/chrony.conf,g' \
            -e 's,@CHRONY_SERVICE@,chronyd.service,g' \
            -e 's,@CHRONY_HELPER_DIR@,/var/run/chrony-helper,g' \
            -e 's,@CHRONY_SERVER_DIR@,/var/lib/chrony/servers,g' \
            < "$infile" > "${infile%.in}"
    done
  • rename the files to coreos-chrony-helper and 20-coreos-chrony, and apply the manual edits that this PR makes (copy/paste the file contents from this PR)

Manually applying these files to /usr/libexec/chrony-helper and /usr/lib/NetworkManager/dispatcher.d/20-chrony, the following testing worked (pasting from my upstream comment):

So far, I have tested these changes by manually applying the files,
with the downstream patch for Fedora, in Fedora CoreOS (the F32-based
"next" stream) with and without dhclient present on the system, with
a DHCP server on the same network using the `ntp-servers` option. In
both cases NTP server config files are written to `/var/lib/dhclient`
or `/var/lib/chrony/servers`, and the NTP servers from the DHCP server
are listed with `chronyc sources`.

Will now check these changes using the coreos--prefixed paths to make sure this doesn't hit SELinux issues - and will write up some more detailed steps to verify in FCOS and RHCOS.

@rfairley
Contributor Author

The changes proposed for the downstream package (includes the proposed upstream patches sent to the chrony-dev mailing list): https://src.fedoraproject.org/rpms/chrony/pull-request/3

@rfairley rfairley force-pushed the rfairley-21dhcp-chrony branch 2 times, most recently from dd3ee16 to d71f5de Compare May 27, 2020 14:07
@rfairley rfairley changed the title [WIP, hold] overlay: Add 21dhcp-chrony to set default DHCP NTP settings overlay: Add 21dhcp-chrony to set default DHCP NTP settings May 27, 2020
@cgwalters
Member

This is debatable, but I think we should generally defer to the platform chrony config over DHCP.

I don't know if we get NTP via DHCP on any of those platforms, but I can't think of any downside of using the link-local NTP (or in Azure's case the even better virtual PHC).

IOW, let's add a:

if test -f /run/coreos-platform-chrony.conf; then
  exit 0
fi

in here. (And that won't race because the former runs as a generator, before we do DHCP)
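
For illustration, a minimal sketch of how that guard could sit in front of the rest of a dispatcher script. The function name `should_handle` and the overridable `platform_conf` variable are hypothetical (the override exists only so the logic can be exercised off a real host), and the sketch assumes `up` and `dhcp4-change` are the NetworkManager dispatcher actions of interest. A real script would `exit 0` as shown above rather than return a status.

```shell
#!/bin/bash
# Sketch only: decide whether a dispatcher run should touch chrony at all.
# platform_conf defaults to the path from the comment above; allowing it to
# be overridden is an assumption made for testing, not part of this PR.
platform_conf=${platform_conf:-/run/coreos-platform-chrony.conf}

# should_handle ACTION: succeeds when DHCP NTP servers should be applied.
should_handle() {
    # Platform-provided chrony config takes precedence over DHCP.
    if test -f "$platform_conf"; then
        return 1
    fi
    case "$1" in
        up|dhcp4-change) return 0 ;;
        *) return 1 ;;
    esac
}
```

Because the platform config is written by a generator before DHCP runs, checking for the file at dispatch time should be enough; no locking is attempted here.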

@rfairley rfairley changed the title overlay: Add 21dhcp-chrony to set default DHCP NTP settings [WIP] overlay: Add 21dhcp-chrony to set default DHCP NTP settings May 27, 2020
@rfairley rfairley force-pushed the rfairley-21dhcp-chrony branch 4 times, most recently from d8134d4 to 49bfeed Compare May 27, 2020 16:59
@rfairley
Contributor Author

> This is debatable, but I think we should generally defer to the platform chrony config over DHCP.
>
> I don't know if we get NTP via DHCP on any of those platforms, but I can't think of any downside of using the link-local NTP (or in Azure's case the even better virtual PHC).
>
> IOW, let's add a:
>
>     if test -f /run/coreos-platform-chrony.conf; then
>       exit 0
>     fi

Makes sense - added now!

With the upstream patches in their present state (https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-dev/2020/05/msg00027.html, https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-dev/2020/05/msg00028.html), I realize the upstream change should also address the need to override platform specific config. Otherwise, 20-chrony which is not aware of platform config would overwrite the platform config here, once the dhcp support is added. Fixing my upstream patch to make use of config fragments, added in https://git.tuxfamily.org/chrony/chrony.git/commit/?id=3470ab66f02c982d9ef5ebfe6ffff0d8833c2f83, and not override static chrony config would fix this. Will be fixing the upstream patch for this.

As long as the upstream changes to use dropins when handling DHCP-specified NTP servers are made, we should be safe carrying this without introducing a regression later on, where the check on /run/coreos-platform-chrony.conf is not present in the upstream 20-chrony.

@rfairley rfairley force-pushed the rfairley-21dhcp-chrony branch from 49bfeed to 8846697 Compare May 27, 2020 17:28
@rfairley rfairley changed the title [WIP] overlay: Add 21dhcp-chrony to set default DHCP NTP settings overlay: Add 21dhcp-chrony to set default DHCP NTP settings May 27, 2020
@rfairley
Contributor Author

rfairley commented May 27, 2020

This one is ready for review - lifted WIP.

These are the steps I used to test this (no need to run through them unless verifying); posting them here for the record, in case it helps.

High level: configure a custom DHCP server statically routed within a NAT network in libvirt, have other hosts obtain an IP address from the custom DHCP server, and check whether the hosts receive the NTP servers set on the DHCP server.

This will require:

  • Virtual Machine Manager (virt-manager RPM package) installed
  • qemu-kvm installed
  • Internet connection throughout

  1. First, download an OS which can run the DHCP server dhcpd. For this, I used Fedora 32 Workstation, downloading an ISO from: https://getfedora.org/en/workstation/download/

  2. While waiting for the OS image to download, configure a network in Virtual Machine Manager (VMM) (Edit > Connection Details). Create a new network, call it dhcptest (editing the default network also works here) - just select the right network when adding new VMs in the following steps. Configure the network with the following XML (a different subnet than 192.168.122.0 can be used if it conflicts with any existing config you may have):

    <network>
      <name>dhcptest</name>
      <uuid>f0217186-6814-4e55-b453-8c9a4cba21af</uuid>
      <forward mode="nat">
        <nat>
          <port start="1024" end="65535"/>
        </nat>
      </forward>
      <bridge name="virbr1" stp="on" delay="0"/>
      <mac address="52:54:00:68:28:12"/>
      <domain name="dhcptest"/>
      <ip address="192.168.122.1" netmask="255.255.255.0">
        <dhcp>
          <host mac="52:54:00:77:22:44" name="dhcpserver" ip="192.168.122.241"/>
        </dhcp>
      </ip>
    </network>

    This will create a NAT network on the virbr1 interface, with a static IP address for the DHCP server, which we will give the MAC address 52:54:00:77:22:44. Start the network by pressing the play button at the bottom. Note that no <range> tag is specified under <dhcp>, so we don't have two DHCP servers (the libvirt one, and the custom one we are setting up) trying to assign IP addresses on the network.

  3. Once the OS image is downloaded, install it as a new VM in VMM (File > New Virtual Machine). Select the ISO image option; select the .iso file; set the RAM and number of CPUs (I used 4096 MB and 2 cores); select storage (I used 20GB); and select the "Virtual network 'dhcptest': NAT" network. Also select "Customize configuration before install". Then click Finish.

    In the window that comes up, there should be a NIC: option on the left. Click it, then enter 52:54:00:77:22:44 as the MAC address, replacing the one that was present initially. Click Apply, then Begin Installation.

    Let the image boot - the network connection will fail at first. Click the top left menu in GNOME to change the network settings, and click "Wired settings". Click the gear icon under the Wired connection, then click the IPv4 tab. Click "Manual" for IPv4 Method. Enter for the Address: 192.168.122.241, Netmask: 255.255.255.0, Gateway: 192.168.122.1 (libvirt's default gateway). For DNS servers, deselect the Automatic toggle and enter 8.8.8.8,8.8.4.4 (Google's, but other DNS servers would work here too).

    Note the connection may still report failed after applying the changes but this should not block the install. The installer should persist these settings, and after reboot once the install is finished, the connection should be set up successfully.

    Then, go through the installer steps to install Fedora on the persistent disk (default settings for everything else should be fine). Once complete, reboot the VM.

  4. While waiting for the Fedora install, download the OS artifacts in which to test the chrony patches. We will use .qcow2 files. As an example, download the Fedora CoreOS testing release .qcow2 at: https://getfedora.org/en/coreos/download?tab=metal_virtualized&stream=testing. Following these same instructions with RHCOS and FCOS .qcow2 images should also work.

    Once downloaded, uncompress using unxz <artifact name> and rename the .qcow2 to fcos-testing.qcow2 for easier reference.

  5. Once the Fedora install is finished, reboot the VM, and go through the initial setup (create user account). Then, run sudo dnf install -y dhcp (network from the Fedora VM should be working by this point). Once finished, create a config file /etc/dhcp/dhcpd.conf with the following contents:

    default-lease-time 600;
    max-lease-time 7200;
    option subnet-mask 255.255.255.0;
    option broadcast-address 192.168.122.255;
    option routers 192.168.122.1;
    option domain-name-servers 8.8.8.8, 8.8.4.4;
    option domain-name "mydomain.org";
    option ntp-servers time.google.com;
    
    subnet 192.168.122.0 netmask 255.255.255.0 {
      range 192.168.122.10 192.168.122.100;
      range 192.168.122.150 192.168.122.200;
    }
    

    Clients that connect to this server will use the gateway set up by libvirt (routers option above). The time.google.com NTP servers are specified as a means of testing; we will check that hosts that connect end up with these NTP servers in their chrony sources.

    Once the DHCP config is created, run touch /var/lib/dhcpd/dhcpd.leases, then sudo systemctl start dhcpd.

  6. Going back to the testing artifact, we will need to first boot it outside of VMM with an Ignition config, which will let us log in to the image from VMM on the next boot. I did not see a way to pass in the Ignition config via userdata through VMM - so we will first boot the .qcow2 passing an Ignition config using qemu-kvm, and use VMM on the second boot. If testing Fedora CoreOS, create an Ignition config autologin.ign with the following contents:

    {
      "ignition": {
        "config": {
          "replace": {
            "source": null,
            "verification": {}
          }
        },
        "security": {
          "tls": {}
        },
        "timeouts": {},
        "version": "3.0.0"
      },
      "passwd": {},
      "storage": {},
      "systemd": {
        "units": [
          {
            "dropins": [
              {
                "contents": "[Service]\n# Override Execstart in main unit\nExecStart=\n# Add new Execstart with `-` prefix to ignore failure\nExecStart=-/usr/sbin/agetty --autologin core --noclear %I $TERM\nTTYVTDisallocate=no\n",
                "name": "autologin-core.conf"
              }
            ],
            "name": "getty@tty1.service"
          }
        ]
      }
    }

    If testing RHCOS, use the following (adding your SSH pubkey instead of ssh-rsa AAA ...):

    {"ignition":{"version":"2.2.0"},"passwd":{"users":[{"groups":["sudo"],"name":"core","sshAuthorizedKeys":["ssh-rsa AAA ..."]}]}}
    

    Now run qemu-kvm -m 2048M -accel kvm -smp cores=4 -fw_cfg name=opt/com.coreos/config,file=autologin.ign -drive file=fcos-testing.qcow2 -net nic,model=virtio -net user,hostfwd=tcp::2222-:22 -nographic. If testing FCOS, once booted, press Ctrl-A then X to exit QEMU, which will shut down the image. If testing RHCOS, SSH into the machine once booted, from another terminal window with: ssh -p 2222 core@127.0.0.1. Then run sudo passwd, and set a password for root you can remember. You can use this password to log in, once installed in VMM in step 7.

  7. In VMM, open File > New Virtual Machine. Select "Import existing disk image"; select the fcos-testing.qcow2 file, and choose any of the Fedora options for the OS; set the RAM and number of CPUs (defaults of 2048 MB and 2 cores are fine); and select the "Virtual network 'dhcptest': NAT" network. Then click Finish. Once booted, the machine should be assigned an IP address 192.168.122.10 from the DHCP server we set up, and internet (e.g. ping example.org) should work.

  8. Now in the FCOS VM, run git clone https://github.com/coreos/fedora-coreos-config. cd into the checked out repository, and run git fetch origin pull/412/head:dhcp-chrony-test; git checkout dhcp-chrony-test. Now, install the files:

    sudo su
    ostree admin unlock
    install -m 755 overlay.d/21dhcp-chrony/etc/NetworkManager/dispatcher.d/20-coreos-chrony \
        /etc/NetworkManager/dispatcher.d/20-coreos-chrony
    install -m 755 overlay.d/21dhcp-chrony/usr/libexec/coreos-chrony-helper /usr/libexec/coreos-chrony-helper

    Then, do sudo systemctl restart NetworkManager to restart the DHCP client. chronyc sources should now show the custom-configured time servers (time{1,2,3,4}.google.com for this example). Correspondingly, a /var/lib/chrony/servers/chrony.servers.eth0 file containing the IP addresses of the added timeservers, and the iburst parameter, should exist. See the attachment below.

(Attached image: ntp-config-result)
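
As a rough picture of what step 8 ends up producing: NetworkManager exports the DHCP option to dispatcher scripts as DHCP4_NTP_SERVERS (space-separated), and each server lands in a per-interface file with the iburst option. The sketch below assumes that file format (one `<address> iburst` line per server). The function name and the overridable `server_dir` are hypothetical, and the real helper does more than this, such as telling the running chronyd about the change.

```shell
#!/bin/bash
# Sketch: turn DHCP-provided NTP servers into a per-interface servers file.
# server_dir can be overridden here only so the sketch runs without root;
# that override is an assumption of this example.
server_dir=${server_dir:-/var/lib/chrony/servers}

# write_servers_file INTERFACE "SERVER1 SERVER2 ...":
# writes one "<server> iburst" line per server, replacing any old file.
write_servers_file() {
    file="$server_dir/chrony.servers.$1"
    mkdir -p "$server_dir"
    : > "$file"
    for server in $2; do
        echo "$server iburst" >> "$file"
    done
}

# In a dispatcher script the call would look like:
#   write_servers_file "$1" "$DHCP4_NTP_SERVERS"
```

An empty server list leaves an empty file behind, which is a simplification; cleaning up stale files is left out of the sketch.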

@ashcrow ashcrow requested review from cgwalters and dustymabe May 27, 2020 20:23
Member

@cgwalters cgwalters left a comment

A few optional changes; overall I'm OK to merge as is too!

echo " list-static-sources"
echo " set-static-sources < sources.list"
echo " is-running"
echo " command CHRONYC-COMMAND"
Member

Wow a lot going on here. But I guess we're just copying the bits that came from upstream.

Contributor Author

Yes - decided to just keep all of the bits from the chrony.helper script, which is the same between Fedora 32, 33 and RHEL8. The only part we really need is update-daemon. Should have included a diff earlier - this is the diff between the original and this PR:

diff --git a/helper-f33 b/overlay.d/21dhcp-chrony/usr/libexec/coreos-chrony-helper
old mode 100644
new mode 100755
index 95414af..46784a3
--- a/helper-f33
+++ b/overlay.d/21dhcp-chrony/usr/libexec/coreos-chrony-helper
@@ -1,4 +1,12 @@
 #!/bin/bash
+
+# This is a copy of /usr/libexec/chrony-helper carried in FCOS/RHCOS
+# temporarily while upstream changes and the package update to
+# chrony-helper percolate (see
+# https://src.fedoraproject.org/rpms/chrony/pull-request/3). It should
+# not by used by any script other than the 20-coreos-chrony NM
+# dispatcher script.
+
 # This script configures running chronyd to use NTP servers obtained from
 # DHCP and _ntp._udp DNS SRV records. Files with servers from DHCP are managed
 # externally (e.g. by a dhclient script). Files with servers from DNS SRV
@@ -9,10 +17,20 @@ chronyc=/usr/bin/chronyc
 chrony_conf=/etc/chrony.conf
 chrony_service=chronyd.service
 helper_dir=/var/run/chrony-helper
-added_servers_file=$helper_dir/added_servers
+server_dir=/var/lib/chrony/servers
+
+# If dhclient is installed, the 11-dhclient NetworkManager dispatcher script
+# will have the necessary integration to bring NTP servers from NM/DHCP into
+# chrony. The chrony dispatcher script (20-chrony) will not take action when
+# a dhclient installation is detected. Therefore, make sure we are using the
+# dhclient state directory `/var/lib/dhclient`.
+if [ -e /usr/sbin/dhclient ]; then
+    server_dir=/var/lib/dhclient
+fi
 
 network_sysconfig_file=/etc/sysconfig/network
-dhclient_servers_files="/var/lib/dhclient/chrony.servers.*"
+dhcp_servers_files="${server_dir}/chrony.servers.*"
+added_servers_file=$helper_dir/added_servers
 dnssrv_servers_files="$helper_dir/dnssrv@*"
 dnssrv_timer_prefix=chrony-dnssrv@
 
@@ -27,7 +45,7 @@ is_running() {
 }
 
 get_servers_files() {
-    [ "$PEERNTP" != "no" ] && echo "$dhclient_servers_files"
+    [ "$PEERNTP" != "no" ] && echo "$dhcp_servers_files"
     echo "$dnssrv_servers_files"
 }
 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part I added to the original

+if [ -e /usr/sbin/dhclient ]; then
+    server_dir=/var/lib/dhclient
+fi

is also dead code, as the NM dispatcher script will not run coreos-chrony-helper at all if dhclient is detected, but my aim was to keep it in line with the proposal in: https://src.fedoraproject.org/rpms/chrony/pull-request/3#_4__30, such that coreos-chrony-helper could replace chrony-helper and still work with dhclient if we needed. Will leave it in for now - we will be coming back to this for sure when landing changes from chrony upstream.

rm -f "$dhcp_server_file"

# Don't add NTP servers if PEERNTP=no specified; return early.
[ "$PEERNTP" = "no" ] && return
Member

Note this is dead code for FCOS, which doesn't support /etc/sysconfig/network-scripts (that's fine, we need it for RHCOS, and other Fedora editions need it too).

It also doesn't support NM keyfiles but that's something we can investigate later.

@rfairley rfairley force-pushed the rfairley-21dhcp-chrony branch from 8846697 to 95f6d3d Compare May 27, 2020 21:32
@rfairley rfairley changed the title [hold] overlay: Add 21dhcp-chrony to set default DHCP NTP settings overlay: Add 21dhcp-chrony to set default DHCP NTP settings Jun 10, 2020
@rfairley
Contributor Author

rfairley commented Jun 10, 2020

Realized having the generator write PEERNTP=no may be an alternative, to avoid conflict with the chrony DHCP dispatcher. The plan upstream is to respect PEERNTP=no in the dispatcher, so we'd avoid the situation with 20-chrony-dhcp landing from upstream without checking for platform config (/run/coreos-platform-chrony.conf).

Also confirmed that the upstream dispatcher name will be 20-chrony-dhcp - so the checks for this file in the dispatcher script here (https://github.com/coreos/fedora-coreos-config/pull/412/files#diff-5ee5713c5016d726075bdb06b8a7348eR16) would work.

Did some tests to check the /etc/sysconfig/network file was generated correctly in a COSA build (modifying the generator to skip the check on platform ID), and that the dispatcher script responds appropriately with/without PEERNTP=no. Will try to do a full end-to-end test in a cloud platform, but this is good to review now.
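
The stand-down check mentioned above could look roughly like this. It is a sketch, not the code at the linked line: the function name is hypothetical, and the two directories in the usage note are the usual NetworkManager dispatcher locations rather than something confirmed in this thread.

```shell
#!/bin/bash
# Sketch: detect whether the upstream dispatcher script has landed, so the
# overlaid 20-coreos-chrony script can step aside.

# upstream_dispatcher_present DIR...: succeeds if any DIR has 20-chrony-dhcp.
upstream_dispatcher_present() {
    for dir in "$@"; do
        if [ -e "$dir/20-chrony-dhcp" ]; then
            return 0
        fi
    done
    return 1
}
```

A dispatcher script might call it as `upstream_dispatcher_present /etc/NetworkManager/dispatcher.d /usr/lib/NetworkManager/dispatcher.d && exit 0` before doing any work.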

Member

@cgwalters cgwalters left a comment

I didn't validate the details here...a lot going on and I'd be much more confident if we had even a manual test for this. But overall seems sane.

# TODO: once https://bugzilla.redhat.com/show_bug.cgi?id=1828434 is
# resolved, this won't be required.
if [ ! -e /etc/sysconfig/network ] || ! grep -q "PEERNTP" /etc/sysconfig/network; then
cat <<EOF >> /etc/sysconfig/network
Member

Wait but...is this file actually read on FCOS today? That's part of the "initscripts networking" bits...which NetworkManager might read...let me see.
It does look like it reads /etc/sysconfig/network, but rg -i peerntp has zero hits in NM git.

Ahh but I see, we're sourcing that ourselves in the other script. OK.

Contributor Author

Yes - as far as I could tell, only the script added here, and the dhclient handler shipped in chrony, https://src.fedoraproject.org/rpms/chrony/blob/cafc2b7a759c623057666ae578e847c78b6e7811/f/chrony.dhclient#_7, will read that environment variable. If something else is installed and happens to read and parse PEERNTP=no and not propagate the time sources (e.g. dhclient or ntpd), it should be expected behavior that the cloud time source is used by default when on the cloud platform and DHCP sources are not propagated.
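
To make the two halves concrete, a rough sketch: one function appends PEERNTP=no only when the file has no PEERNTP setting yet (mirroring the grep check in the snippet above), and the other sources the file the way a dispatcher script might before deciding whether to propagate DHCP servers. The function names and the file-path parameter are assumptions for illustration, not the code in this PR.

```shell
#!/bin/bash
# Sketch: write PEERNTP=no once, then honour it when the file is read back.

# ensure_peerntp_no FILE: append PEERNTP=no unless FILE already mentions PEERNTP.
ensure_peerntp_no() {
    if [ ! -e "$1" ] || ! grep -q "PEERNTP" "$1"; then
        echo "PEERNTP=no" >> "$1"
    fi
}

# dhcp_ntp_allowed FILE: succeeds unless FILE sets PEERNTP=no.
dhcp_ntp_allowed() {
    # Source in a subshell so settings from FILE do not leak into the caller.
    (
        PEERNTP=
        [ -f "$1" ] && . "$1"
        [ "$PEERNTP" != "no" ]
    )
}
```

Calling `ensure_peerntp_no` twice leaves a single PEERNTP line, so an existing admin-set value (for example PEERNTP=yes) is never overridden.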

echo "$server"
done) | sort -u)

comm -23 <(echo -n "$added_servers") <(echo -n "$all_servers") |
Member

Could use a comment here, I have never seen comm used before.

Member

this was a first for me too :)
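
For anyone else meeting comm for the first time: given two sorted inputs, `comm -23 A B` prints only the lines unique to A (`-2` suppresses lines unique to B, `-3` suppresses lines common to both). A small sketch of that set difference, reading the snippet above as "servers added earlier that are no longer wanted". The function name and the example.com hostnames are made up, and temp files stand in for the process substitution used in the helper.

```shell
#!/bin/bash
# Sketch: print entries present in the first list but absent from the second.
# Both lists are newline-separated; comm needs sorted input, so sort first.
only_in_first() {
    first=$(mktemp)
    second=$(mktemp)
    printf '%s\n' "$1" | LC_ALL=C sort -u > "$first"
    printf '%s\n' "$2" | LC_ALL=C sort -u > "$second"
    # -2 drops lines unique to the second file, -3 drops lines found in both.
    LC_ALL=C comm -23 "$first" "$second"
    rm -f "$first" "$second"
}

added_servers='ntp1.example.com
ntp2.example.com
ntp3.example.com'
all_servers='ntp2.example.com'

# Prints ntp1.example.com and ntp3.example.com, one per line.
only_in_first "$added_servers" "$all_servers"
```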

update_daemon() {
local all_servers_with_args all_servers added_servers

if ! is_running; then
Member

Hmm, won't this potentially race if NM and chrony start up at the same time?

@rfairley
Contributor Author

rfairley commented Jul 22, 2020

The closest I got to a manual test is this - #412 (comment) which checks that DHCP -> chrony propagation of NTP sources works. Unfortunately it is quite long-winded and heavyweight, using multiple VMs - I never got a test using dummy interfaces (like the Debian chrony package has) working; see https://gist.github.com/rfairley/0a126d583636ab7d14113cb60397ab86.

There should also really be another test that runs on a cloud where a time source is provided (aws, azure, or gcp) and a DHCP server providing NTP sources is present on the local network, and check that the time source remains the one provided by the cloud (doesn't get overwritten by the DHCP-provided source). I did check this locally by adding a fake cloud platform to the list of checked values of ignition.platform.id which provide time sources in the generator, and if I passed ignition.platform.id=fake, the DHCP-provided time-sources did not get propagated, as we want (due to PEERNTP=no getting written by the generator).
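
The local check described above boils down to reading ignition.platform.id from the kernel command line and comparing it against a list of platforms assumed to supply a time source. A minimal sketch of that comparison: the function names are hypothetical, and the platform list (aws, azure, gcp, plus the fake test value) is taken from this comment rather than from the generator itself.

```shell
#!/bin/bash
# Sketch: extract ignition.platform.id from a kernel command line string and
# decide whether that platform is assumed to provide its own time source.

# platform_id CMDLINE: print the value of ignition.platform.id, if present.
platform_id() {
    for arg in $1; do
        case "$arg" in
            ignition.platform.id=*) echo "${arg#ignition.platform.id=}" ;;
        esac
    done
}

# provides_time_source CMDLINE: succeeds for platforms in the assumed list.
provides_time_source() {
    case "$(platform_id "$1")" in
        aws|azure|gcp|fake) return 0 ;;
        *) return 1 ;;
    esac
}

# On a real host the input would come from: cmdline=$(cat /proc/cmdline)
cmdline='BOOT_IMAGE=/vmlinuz ignition.platform.id=fake console=ttyS0'
if provides_time_source "$cmdline"; then
    echo "PEERNTP=no"
fi
```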

Automating both of the scenarios above is really what's left to tie off this work - then the bits upstream (the effect of https://git.tuxfamily.org/chrony/chrony.git/commit/?id=bf7f63eaed83a271ba1eeefdd21f187e3999962b and https://src.fedoraproject.org/rpms/chrony/pull-request/3) should trickle down and eventually replace the 21dhcp-chrony overlay (and the use of PEERNTP=no in /etc/sysconfig/network as a way of avoiding conflict with the platform time source would be replaced by the new sourcedir config directive added in chrony). The automation would be possible with libvirt, but it would be nice to get it working with the dhcpd server on a dummy interface and the NetworkManager internal DHCP client requesting on that dummy interface if possible (to avoid the heavy overhead).

@dustymabe
Member

Thanks @rfairley. @sohankunkerkar is picking up this work and moving it forward. If there are any needed changes to this PR he'll probably open a new PR with commits on top of what you've got here. He's digging in to the problem now.

sohankunkerkar added a commit to sohankunkerkar/coreos-assembler that referenced this pull request Aug 19, 2020
Writing this test to verify coreos/fedora-coreos-config#412
change to enable DHCP propagation of NTP servers. We don't need this test once the
upstream/downstream patch merges.
upstream patch: https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-dev/2020/05/msg00023.html
downstream patch: https://src.fedoraproject.org/rpms/chrony/pull-request/3
sohankunkerkar added a commit to sohankunkerkar/coreos-assembler that referenced this pull request Aug 19, 2020
Writing this test to verify coreos/fedora-coreos-config#412
to enable the DHCP propagation of NTP servers. We don't need this test once the
upstream/downstream patch merges.
upstream patch: https://listengine.tuxfamily.org/chrony.tuxfamily.org/chrony-dev/2020/05/msg00023.html
downstream patch: https://src.fedoraproject.org/rpms/chrony/pull-request/3
Member

@sohankunkerkar sohankunkerkar left a comment

LGTM
waiting for coreos/coreos-assembler#1667 to get in before we merge this PR.

sohankunkerkar added a commit to sohankunkerkar/coreos-assembler that referenced this pull request Aug 20, 2020
Add this test to verify that coreos/fedora-coreos-config#412
enables the DHCP propagation of NTP servers.
sohankunkerkar added a commit to sohankunkerkar/coreos-assembler that referenced this pull request Aug 20, 2020
Add this test to verify that coreos/fedora-coreos-config#412 enables
the DHCP propagation of NTP servers. This will also guard against regressions
in RHCOS or FCOS when the upstream changes come down
and obsolete the temporary work.
Member

@dustymabe dustymabe left a comment

LGTM - will give it another few hours for review and then will merge.

@dustymabe dustymabe merged commit e6ea408 into coreos:testing-devel Aug 21, 2020
dustymabe pushed a commit that referenced this pull request Aug 21, 2020
To make chrony use NTP settings from DHCP (#412),
we need to chmod the following files to clear their group and other write permissions. Git tracks only the executable bit
of a file's mode, so when the files are checked out locally they can end up with the group write bit set. When that
happens we get an error like: `Cannot execute '/etc/NetworkManager/dispatcher.d/20-coreos-chrony-dhcp': writable by group or other`
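
The permission fix described in this commit message can be illustrated with a small shell sketch. This is not the code from the commit: the function names `is_group_or_other_writable` and `strip_group_other_write` are hypothetical, the demo runs on a temporary file rather than the real dispatcher script, and it assumes GNU `stat` (for `-c '%a'`).

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical and not part of the PR.

# Succeeds (exit status 0) if the file is writable by group or other.
is_group_or_other_writable() {
    local mode
    mode=$(stat -c '%a' "$1")      # octal mode such as 775 (GNU stat)
    (( (8#$mode & 8#022) != 0 ))   # 022 = group-write and other-write bits
}

# Clear the group and other write bits; all other bits are left unchanged.
strip_group_other_write() {
    chmod go-w "$1"
}

# Demonstrate on a throwaway file that mimics a checkout where the
# group write bit ended up set (for example, with a umask of 002).
demo=$(mktemp)
chmod 0775 "$demo"
if is_group_or_other_writable "$demo"; then
    strip_group_other_write "$demo"
fi
stat -c '%a' "$demo"               # print the resulting octal mode
rm -f "$demo"
```

Using `chmod go-w` instead of setting an absolute mode keeps the executable bit that Git does track, and only removes the bits named in the error message.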
dustymabe pushed a commit that referenced this pull request Aug 21, 2020
…vers

Add this test to verify that #412 enables the DHCP propagation
of NTP servers. This will also guard against regressions
in RHCOS or FCOS when the upstream changes come down and
obsolete the temporary work.
6 participants