F33: transitioning existing systems to systemd-resolved on upgrade #646

Closed
dustymabe opened this issue Oct 7, 2020 · 23 comments · Fixed by coreos/fedora-coreos-config#700
Assignees
Labels
fallout/f33 jira for syncing to jira

Comments

@dustymabe
Member

With the change to systemd-resolved we need to do some sort of intervention to make the systemd-resolved change take effect on existing systems that are upgraded.

The scriptlets for the systemd rpm have something for this but "upgrade" logic for OSTree systems doesn't really work because the compose starts with a fresh world view every time:

$ rpm -q --scripts systemd
<snip>
# Create /etc/resolv.conf symlink.
# We would also create it using tmpfiles, but let's do this here
# too before NetworkManager gets a chance. (systemd-tmpfiles invocation above
# does not do this, because it's marked with ! and we don't specify --boot.)
# https://bugzilla.redhat.com/show_bug.cgi?id=1873856
if systemctl -q is-enabled systemd-resolved.service &>/dev/null; then
  ln -fsv ../run/systemd/resolve/stub-resolv.conf /etc/resolv.conf
fi
<snip>

For us on existing systems the resolv.conf file will already exist and contain some contents like:

[core@fedora ~]$ cat /etc/resolv.conf 
# Generated by NetworkManager
nameserver 192.168.1.1

I suggest we write some migration logic that detects the # Generated by NetworkManager header and, if it is present, runs ln -fsv ../run/systemd/resolve/stub-resolv.conf /etc/resolv.conf. According to the change document, as long as that symlink is set up, NetworkManager knows what to do to take advantage of systemd-resolved.

Any resolv.conf that had been hand edited and not managed by NetworkManager would be left alone.
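
A minimal sketch of the migration logic suggested above, not the unit that was eventually shipped. The function name `migrate_resolv_conf` and its path argument are hypothetical; the marker string and the symlink target are taken from this comment. Passing the path as an argument lets the logic be tried against a scratch file before pointing it at /etc/resolv.conf.

```shell
# Sketch: replace an NM-generated resolv.conf with the stub symlink.
# migrate_resolv_conf is a hypothetical helper name.
STUB_TARGET="../run/systemd/resolve/stub-resolv.conf"

migrate_resolv_conf() {
    local conf="$1"
    local first_line=""
    # An existing symlink was set up deliberately: leave it alone.
    if [ -L "$conf" ]; then
        echo "skip: already a symlink"
        return 0
    fi
    if [ ! -f "$conf" ]; then
        echo "skip: no regular file"
        return 0
    fi
    # Only files whose first line is the NetworkManager header are migrated.
    IFS= read -r first_line < "$conf" || true
    if [ "$first_line" = "# Generated by NetworkManager" ]; then
        ln -fs "$STUB_TARGET" "$conf"
        echo "migrated"
    else
        echo "skip: hand-edited"
    fi
}
```

Calling `migrate_resolv_conf /etc/resolv.conf` as root would apply it to the real file; a hand-edited file is reported and left untouched.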

@cgwalters
Member

Yeah, all upgrade logic like this that depends on per-user/per-system state needs to be a systemd unit.
RPM %post should only be about things like generating cache files (ldconfig) etc.

Related discussion: https://mail.gnome.org/archives/ostree-list/2020-February/msg00000.html

@dustymabe
Member Author

Discussed with Luca and Jonathan. We agreed we don't have to solve this migration problem for the first release of F33 into the next stream, since DNS will continue to work through NetworkManager controlled resolv.conf for now.

We also noted that this is a problem that will need to be solved for other OSTree distributions for upgrades as well.

@slankes

slankes commented Oct 11, 2020

Just a data point: this change broke an install of mine that runs unbound as a caching DNS server in a container. Because that container could no longer start, all the others that depend on working DNS ceased to work as well. I have fixed this for now by masking systemd-resolved.service.

@dustymabe
Member Author

First off, thank you for running next and helping find issues for yourself and other users/community members.

Just so I understand fully: the unbound caching DNS server in a container is now failing to start because it's trying to bind to the same ports that systemd-resolved is now using, and that's the conflict?

@dustymabe
Member Author

Discussed this briefly with @jlebon and @bgilbert. In the past we may have considered only shipping systemd-resolved enabled on newly installed systems and leaving upgraded systems alone. However, we would like to minimize "drift" from what the rest of Fedora is doing. The current proposal is:

  • For users running local resolvers (like @slankes), we'll put out a coreos-status post that details the problem and recommends masking systemd-resolved sometime between now and the time Fedora 33 hits testing/stable. They'll need to do that for fresh installs anyway, so some action on their part is required either way.
[MANUAL RUN]
ln -sf /dev/null /etc/systemd/system/systemd-resolved.service
[via FCCT]
storage:
  links:
  - path: /etc/systemd/system/systemd-resolved.service
    target: /dev/null
  • For all other users we'll auto migrate them by using a systemd service in a barrier release. This systemd service will run before NetworkManager and systemd-resolved. It will detect if systemd-resolved is enabled (i.e. it won't be if it's masked) and update resolv.conf to be a symlink to ../run/systemd/resolve/stub-resolv.conf if it's detected to have been managed by NM (detected via the # Generated by NetworkManager at the top of the file).
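
The "won't be enabled if it's masked" condition can be illustrated with a file-level check. This is only a sketch: `unit_is_masked` and its arguments are hypothetical names, and it mirrors the masking mechanism shown above (a symlink to /dev/null in the unit directory). On a real host, `systemctl is-enabled` remains the authoritative check, and "not masked" is not the same as "enabled".

```shell
# Sketch: a unit counts as masked when its file in the unit directory is a
# symlink to /dev/null, which is what `ln -sf /dev/null ...` creates.
# unit_is_masked is a hypothetical helper name.
unit_is_masked() {
    local unit_dir="$1"
    local unit="$2"
    local path="$unit_dir/$unit"
    [ -L "$path" ] && [ "$(readlink "$path")" = "/dev/null" ]
}

# Report the state on the current host; nothing is modified.
if unit_is_masked /etc/systemd/system systemd-resolved.service; then
    echo "systemd-resolved is masked: leave /etc/resolv.conf alone"
else
    echo "systemd-resolved is not masked: migration may be considered"
fi
```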

@slankes

slankes commented Oct 14, 2020

That sounds sensible - thanks for picking the issue up.

@travier
Member

travier commented Oct 14, 2020

With FCCT, you can also use:

systemd:
  units:
    - name: systemd-resolved.service
      mask: true

@dustymabe
Member Author

dustymabe commented Oct 14, 2020

> With FCCT, you can also use:
>
> systemd:
>   units:
>     - name: systemd-resolved.service
>       mask: true

The only problem I think people will run into there is that there is no systemd-resolved.service in our current stable and testing streams, so I think it will fail.

@dustymabe
Member Author

oh actually, I forgot we just added it (but disabled by default).

@jlebon
Member

jlebon commented Oct 14, 2020

> For all other users we'll auto migrate them by using a systemd service in a barrier release. This systemd service will run before NetworkManager and systemd-resolved. It will detect if systemd-resolved is enabled (i.e. it won't be if it's masked) and update resolv.conf to be a symlink to ../run/systemd/resolve/stub-resolv.conf if it's detected to have been managed by NM (detected via the # Generated by NetworkManager at the top of the file).

One thing we discussed in the community meeting was whether we should have the "conditional on NM-managed" bit. It technically deviates from the Fedora version of this.

It makes sense offhand, though... since we're not actively disabling systemd-resolved in that case, it basically means that anyone who's touched /etc/resolv.conf will have it running, even though it's likely they would have wanted it disabled.

If instead we just say "we only look at is-enabled, just like the rest of Fedora", that simplifies the message and forces sysadmins to really see what they want there. (Also, if you've tweaked /etc/resolv.conf just because you were testing something temporarily, you don't lose out on the auto-migration.)

Not sure, I think I'm good either way, but just wanted to flag this.

@basvdlei

basvdlei commented Oct 14, 2020

I ran into the following issue when trying out next release 33.20201006.1.0: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#known-issues

systemd-resolved configures a stub listener on the loopback interface, while the kubelet by default has pods with dnsPolicy: "Default" inherit /etc/resolv.conf. Since containers have their own loopback interface, they will not be able to connect to systemd-resolved.

As far as I can tell, this will probably break DNS resolution on most Kubernetes installations, since most clusters will have CoreDNS deployed with dnsPolicy: "Default".
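
A quick way to see whether a given resolv.conf would run into this is to check it for a loopback nameserver. The sketch below is illustrative only; `has_loopback_nameserver` is a hypothetical helper, and it only covers the 127.0.0.0/8 range and ::1.

```shell
# Sketch: succeed when the file lists a loopback nameserver, which a container
# with its own network namespace cannot reach.
# has_loopback_nameserver is a hypothetical helper name.
has_loopback_nameserver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+(127\.[0-9]+\.[0-9]+\.[0-9]+|::1)([[:space:]]|$)' "$1"
}

# Inspect the files discussed in this thread, if they exist on this host.
for f in /etc/resolv.conf /run/systemd/resolve/resolv.conf; do
    [ -r "$f" ] || continue
    if has_loopback_nameserver "$f"; then
        echo "$f: lists a loopback nameserver"
    else
        echo "$f: no loopback nameserver"
    fi
done
```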

@dustymabe
Member Author

Thanks for pointing that out @basvdlei - the doc you linked to suggests using --resolv-conf /run/systemd/resolve/resolv.conf. Does that work in your testing?

That file isn't the stub (which is at /run/systemd/resolve/stub-resolv.conf), so it won't have the 127.0.0.53 address in it.

cc @dghubble @vrutkovs - do OKD and Typhoon have any special casing here?

@vrutkovs
Member

We're running OKD 4.6 nightlies with systemd-resolved enabled and didn't hit any problems so far

@basvdlei

@dustymabe for historical reasons our kubelet container is running with /etc/resolv.conf volume mounts, which is why I hit this issue. Setting --resolv-conf /run/systemd/resolve/resolv.conf does work. But that's not compatible with the current Fedora CoreOS stable.

Both podman and moby now have (mostly undocumented) workarounds that use systemd-resolved's non-stub resolv.conf when provisioning the container's resolv.conf. A kubelet running in one of those runtimes should get the correct DNS server(s) by default, which would explain why OKD and Typhoon didn't run into this.

I can probably also remove the resolv.conf volume mount. This should work with all streams and allow the nodes to be updated as well. But I'll have to do some additional testing.

Hopefully I'm the only one crazy enough to have done this 😀

@dghubble
Member

dghubble commented Oct 15, 2020

FCOS 32

/etc/resolv.conf (nameserver upstream)

FCOS 33

/etc/resolv.conf --> /run/systemd/resolve/stub-resolv.conf
/run/systemd/resolve/resolv.conf (nameserver upstream)
/run/systemd/resolve/stub-resolv.conf (nameserver 127.0.0.53)

CL / Flatcar

/etc/resolv.conf --> /run/systemd/resolve/resolv.conf 
/run/systemd/resolve/resolv.conf (nameserver upstream)
/run/systemd/resolve/stub-resolv.conf (nameserver 127.0.0.53)

Kubelet uses the default /etc/resolv.conf, so initially I'd expect what @basvdlei mentioned. But the Typhoon Kubelet is run as an image by podman, and podman determines the /etc/resolv.conf the Kubelet sees. The effective /etc/resolv.conf has the upstream nameserver. I'm not sure if podman is intentionally handling this or it's a coincidence.

$ sudo podman exec -it 75e08b27e104 /bin/bash
$ cat /etc/resolv.conf
search region.compute.internal
nameserver 10.0.0.2

@dghubble
Member

Oh nice, thanks @basvdlei. So looks like podman is indeed intentionally seeing that /etc/resolv.conf is a symlink and using /run/systemd/resolve/resolv.conf (nameserver upstream). So I'd expect no change.
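
The behavior described here (treating a symlink to the stub file as a cue to read the upstream file instead) can be sketched as follows. This is not podman's actual code; `effective_resolv_conf` is a hypothetical helper, and matching on the link target's suffix is an assumption made so the sketch also works with the relative `../run/...` target used on FCOS.

```shell
# Sketch: pick the resolv.conf a container runtime could copy from.
# effective_resolv_conf is a hypothetical helper name.
effective_resolv_conf() {
    local conf="$1"
    local target=""
    if [ -L "$conf" ]; then
        target="$(readlink "$conf")"
        case "$target" in
            */systemd/resolve/stub-resolv.conf)
                # The stub only lists the local listener; use the upstream list.
                echo "/run/systemd/resolve/resolv.conf"
                return 0
                ;;
        esac
    fi
    # Regular files and other symlinks are used as-is.
    echo "$conf"
}

effective_resolv_conf /etc/resolv.conf
```

The printed path is also the kind of value one would pass to the kubelet's --resolv-conf option mentioned earlier in the thread.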

@basvdlei

@dghubble yeah, it's not really obvious/transparent. Thanks for looking and all your hard work on Typhoon!

@dustymabe just wanted to confirm I can work around this issue by relying on podman to provision the resolv.conf in my kubelet container. I've never tried running the kubelet directly on the host, but if anyone is doing that, they might still run into this when upgrading.

Thinking out loud, so feel free to ignore. With "everything running in containers" on FCOS, almost nothing will be able to use the stub listener. Container workloads get none of the benefits, while DNS queries might give different results depending on whether you're inside a container or not. Not using the stub config would be a safer upgrade path.

@dustymabe
Member Author

> @dustymabe for historical reasons our kubelet container is running with /etc/resolv.conf volume mounts, which is why I hit this issue. Setting --resolv-conf /run/systemd/resolve/resolv.conf does work. But that's not compatible with the current Fedora CoreOS stable.

Right. The systemd-resolved change is only in next for now because that's the only stream that has been rebased to F33.

dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Oct 19, 2020
This systemd unit migrates the /etc/resolv.conf file on systems to
point to ../run/systemd/resolve/stub-resolv.conf if users haven't
set up a custom resolv.conf. It will only run on Fedora 33 systems
and will only execute once (a single migration).

Fixes: coreos/fedora-coreos-tracker#646
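
The two guards named in this commit message (Fedora 33 only, run once) could look roughly like the sketch below. It is not the code from the referenced commit: `should_run_migration` and the stamp path are hypothetical, and reading VERSION_ID from an os-release style file is an assumption about how the release would be detected.

```shell
# Sketch: decide whether a one-time migration should run.
# should_run_migration and the stamp path are hypothetical.
should_run_migration() {
    local os_release="$1"
    local stamp="$2"
    local version_id=""
    # A stamp file means the migration already ran once.
    [ ! -e "$stamp" ] || return 1
    # Read VERSION_ID (quoted or unquoted) without sourcing the file.
    version_id="$(sed -n 's/^VERSION_ID="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$os_release")"
    [ "$version_id" = "33" ]
}

if should_run_migration /etc/os-release /var/lib/resolv-migration.stamp; then
    echo "would migrate, then create the stamp file"
else
    echo "nothing to do"
fi
```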
@dustymabe dustymabe self-assigned this Oct 20, 2020
@dustymabe dustymabe added the jira for syncing to jira label Oct 20, 2020
dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Oct 20, 2020
dustymabe added a commit to coreos/fedora-coreos-config that referenced this issue Oct 21, 2020
sinnykumari added a commit to sinnykumari/fedora-coreos-streams that referenced this issue Oct 21, 2020
sinnykumari added a commit to coreos/fedora-coreos-streams that referenced this issue Oct 21, 2020
dustymabe added a commit to dustymabe/fedora-coreos-streams that referenced this issue Nov 18, 2020
This is the first f33 release on the `testing` stream. Let's make
it a barrier as agreed upon in coreos/fedora-coreos-tracker#646.
dustymabe added a commit to coreos/fedora-coreos-streams that referenced this issue Nov 19, 2020
kelvinfan001 pushed a commit to kelvinfan001/fedora-coreos-config that referenced this issue Dec 14, 2020
dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this issue Dec 16, 2020
(cherry picked from commit 56b0ceb)
dustymabe added a commit to coreos/fedora-coreos-config that referenced this issue Dec 16, 2020
(cherry picked from commit 56b0ceb)
@dustymabe
Member Author

Note that due to complications we decided to not use systemd-resolved, but leave it enabled for now. This migration script will still run and systemd-resolved is still serving a function of populating entries in the file that is pointed to by /etc/resolv.conf, but it won't be the stub listener and glibc's resolver won't query systemd-resolved for DNS.

@icedream

icedream commented Dec 18, 2020

Some of the servers I maintain updated to the latest Fedora CoreOS stable release this morning, and all of them suddenly ran into DNS issues. The cause turned out to be systemd-resolved having been automatically enabled.

After the update, /etc/resolv.conf pointed to /run/systemd/resolve/stub-resolv.conf which does not exist (before the update /etc/resolv.conf was a file produced by NetworkManager).

systemd-resolved itself logged a rather confusing error: "Failed to symlink /run/systemd/resolve/stub-resolv.conf: Permission denied". Considering the last comment in this issue, I think this is intended? For now I fixed it by disabling systemd-resolved, deleting the symlink and restarting NetworkManager.

Just wanted to ask if I should actually go ahead and completely mask systemd-resolved instead if I am already on F33 or whether what I have done is enough to solve the issue.

EDIT: Even with the service being masked before upgrading to F33 it seems the resolv.conf file still gets replaced with a symlink to stub-resolv.conf. Not sure if this has any relevance or whether that's erroneous behavior.
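
The state described in this comment (a symlink whose target does not exist) can be told apart from the other cases with a small check. This is a diagnostic sketch only; `resolv_conf_state` is a hypothetical helper and it changes nothing on disk.

```shell
# Sketch: classify a resolv.conf path as symlink, dangling symlink,
# regular file, or missing. resolv_conf_state is a hypothetical helper name.
resolv_conf_state() {
    local conf="$1"
    if [ -L "$conf" ]; then
        # -e follows the link, so it fails when the target is missing.
        if [ -e "$conf" ]; then
            echo "symlink -> $(readlink "$conf")"
        else
            echo "dangling symlink -> $(readlink "$conf")"
        fi
    elif [ -f "$conf" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}

resolv_conf_state /etc/resolv.conf
```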

@dustymabe
Member Author

dustymabe commented Dec 18, 2020

hey @icedream - I'm almost certain you're hitting the SELinux issue mentioned in https://discussion.fedoraproject.org/t/fedora-coreos-rebasing-to-fedora-33-features-and-known-issues/25474. If that's the case, follow the steps to restore the SELinux policy from the base config and then apply your settings back on top. If possible, please leave systemd-resolved unmasked so that you can stay with the defaults provided by FCOS, which will lead to fewer problems in the future. Otherwise, what you've done to restore /etc/resolv.conf to be managed by NetworkManager should suffice. Either way, you'll want to bring your SELinux policy up to date.

@icedream

icedream commented Dec 18, 2020

It turns out I do have a changed SELinux policy due to me launching SSH without systemd socket activation on a separate port. That change occurs on every boot though due to a system service I set up, so I will follow this and see if it works, thank you very much for the info @dustymabe!

@icedream

icedream commented Jan 18, 2021

Unfortunately after a bit of researching it seems that semanage port is the only way to label a port as being usable by the SSH server, and from what I understand it always overwrites the policy files. For now I will have to manually restore the policy, which unfortunately also means I will have to do manual upgrades instead of automatic ones to avoid this error for future updates until this is solved in another way.

EDIT: Experimental thought, but one could immediately overwrite the policy files via rsync as soon as semanage port has done its job, but that feels fragile and I doubt it will go well.

Anyways, that will do as a workaround for me to get DNS with systemd-resolved back working.
