[0.70.8] masking systemd-tmpfiles-setup/systemd-tmpfiles-setup-dev breaks things elsewhere #9126

Closed
1 of 2 tasks
cerebrate opened this issue Nov 6, 2022 · 14 comments

@cerebrate

cerebrate commented Nov 6, 2022

Version

WSL version: 0.70.8.0

WSL Version

  • WSL 2
  • WSL 1

Kernel Version

5.15.74.2-20221105-1-microsoft-custom-WSL2+

Distro Version

Debian sid

Other Software

No response

Repro Steps

After installing WSL 0.70.8, WSL appears to inject a generator-based masking of systemd-tmpfiles-setup.service and systemd-tmpfiles-setup-dev.service, as can be seen from the contents of /run/systemd/generator.early:

❯ l /run/systemd/generator.early
total 0
lrwxrwxrwx 1 root root 9 Nov  6 11:57 systemd-tmpfiles-setup-dev.service -> /dev/null
lrwxrwxrwx 1 root root 9 Nov  6 11:57 systemd-tmpfiles-setup.service -> /dev/null
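
To check another system for the same generator-level masking, one can look for symlinks to /dev/null in the generator directory. This is only a minimal sketch, assuming a POSIX shell with readlink available; the function name list_masked_units is made up for illustration.

```shell
# list_masked_units DIR
# Print the name of every entry in DIR that is a symlink to /dev/null,
# which is how systemd represents a masked unit.
list_masked_units() {
    lmu_dir=$1
    for lmu_path in "$lmu_dir"/*; do
        # -L is false for the unexpanded glob when DIR is empty or missing.
        [ -L "$lmu_path" ] || continue
        if [ "$(readlink "$lmu_path")" = "/dev/null" ]; then
            basename "$lmu_path"
        fi
    done
}

# Example: list_masked_units /run/systemd/generator.early
```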

I presume this is intended to solve the issue listed in the release notes as "Ensure /tmp/.X11-unix is not cleared by systemd [GH 9038]".

While this does solve the issue of systemd-tmpfiles-setup.service removing the contents of /tmp/.X11-unix (although not the mount-order-based issue with .X11-unix), the problem is that it also breaks all the unrelated functionality of that unit, including functionality added by other packages (such as Ceph) after the fact.

That unrelated functionality includes a lot of distro self-repair functions beyond cleaning up trash, including setting up and properly permissioning various directories under /run and /var, and properly setting up initial conditions for polkit, quotas, systemd-journald, systemd-logind, systemd-networkd, systemd-machined, systemd-resolved, systemd-udevd, et al.

In short, while not fatal in the short term on a simple system, in the long term on a complex system, this is begging for hard-to-diagnose errors.

(I've had to clean up some temps from interrupted operations and fix directories not created or with incorrect permissions in /run myself, and while my usage may be atypical, it's not extraordinary.)

Expected Behavior

All the trash cleaned up, directories and links created, and permissions set as per the contents of /lib/tmpfiles.d, with the possible exception of /tmp/.X11-unix.

(I would also like to suggest a possible solution: since it is possible to inject generator-based files into /run/systemd/generator.early, WSL could instead inject a unit file into /run/systemd/generator that bind mounts /mnt/wslg/.X11-unix over /tmp/.X11-unix and is configured to run after systemd-tmpfiles-setup.service, presumably also removing the current component that mounts /tmp/.X11-unix. This should have the same ultimate effect where #9038 is concerned, and would also fix the mount race problem described in the comments to #8888.

I only have a user's-eye view of WSL, but this is the approach I use in the current version of bottle-imp, and it seems quite effective, although ironically broken by this fix in 0.70.8.)
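
The suggestion above could be sketched roughly as follows. This is an illustration only, not what WSL or bottle-imp actually ships: the helper name print_x11_bind_unit, the unit description, and the output file name in the example are hypothetical, and the source path is the /mnt/wslg/.X11-unix location mentioned above.

```shell
# print_x11_bind_unit SRC DST
# Print a oneshot service that bind mounts SRC over DST once
# systemd-tmpfiles-setup.service has finished.
print_x11_bind_unit() {
    cat <<EOF
[Unit]
Description=Bind mount $1 over $2 (illustrative)
After=systemd-tmpfiles-setup.service
ConditionPathExists=$1

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/mount --bind $1 $2
EOF
}

# Example (hypothetical file name):
# print_x11_bind_unit /mnt/wslg/.X11-unix /tmp/.X11-unix \
#     > /run/systemd/generator/x11-bind.service
```

A real generator would also need to hook the unit into a target (for example with a .wants/ symlink) so that it actually gets started.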

Actual Behavior

No actions configured in /lib/tmpfiles.d are executed.

Diagnostic Logs

No response

@odbayar

odbayar commented Nov 7, 2022

I think it broke php-fpm too.

/lib/tmpfiles.d$ cat php7.4-fpm.conf
#Type Path                  Mode UID      GID      Age Argument
    d /run/php              0755 www-data www-data -   -
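
For context, a d line like this asks systemd-tmpfiles to create the directory with the given mode and ownership. While the unit is masked, a rough manual equivalent can be derived from the line. The sketch below only prints the command rather than running it; tmpfiles_d_to_install is a made-up helper name, and it assumes a plain d line with explicit mode, user, and group and no glob characters.

```shell
# tmpfiles_d_to_install LINE
# Translate a tmpfiles.d "d" line into an equivalent install(1) command
# and print it. Nothing is created; the caller decides whether to run it.
tmpfiles_d_to_install() {
    # Unquoted on purpose so the shell splits the line into fields:
    # type, path, mode, user, group, age, argument.
    set -- $1
    if [ "$1" != "d" ] || [ $# -lt 5 ]; then
        echo "unsupported line" >&2
        return 1
    fi
    echo "install -d -m $3 -o $4 -g $5 $2"
}

# Example:
# tmpfiles_d_to_install "d /run/php 0755 www-data www-data - -"
```

Where systemd-tmpfiles itself is usable, running systemd-tmpfiles --create against that one configuration file should have the same effect.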

@OneBlue
Collaborator

OneBlue commented Nov 8, 2022

Thank you for the detailed issue @cerebrate. This change was indeed made to resolve #9038.

Under the hood, WSL's init starts systemd and waits up to a fixed amount of time for systemd to finish booting. If that timeout is reached, WSL gives the user access to a shell anyway (otherwise boot would just take too long).

Injecting a systemd unit via /run/systemd is a pretty good suggestion. Looking into it made me realize that systemd-tmpfiles's configuration can be modified via /run/tmpfiles.d/ (which takes precedence over the distro provided configuration, in /usr/lib/tmpfiles.d/).

By injecting something like this:

L /tmp/.X11-unix - - - -  /path/to/wsl's/.X11-unix

We can actually have the systemd unit create the symlink itself, which I like because as long as WSL's init has already created the symlink before, it will either be there because systemd-tmpfiles has run (where it might delete and re-create the symlink) or because it hasn't (and so the symlink is untouched).
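
As a sketch of that idea, the snippet below writes such a line into a tmpfiles.d directory. The target /mnt/wslg/.X11-unix is taken from the suggestion earlier in this thread; the helper name write_x11_symlink_conf and the wslg-x11.conf file name are hypothetical, and this is not WSL's actual implementation.

```shell
# write_x11_symlink_conf DIR TARGET
# Write a tmpfiles.d fragment into DIR that asks systemd-tmpfiles to
# create /tmp/.X11-unix as a symlink to TARGET.
write_x11_symlink_conf() {
    wxc_dir=$1
    wxc_target=$2
    mkdir -p "$wxc_dir" || return 1
    # Fields: type, path, mode, user, group, age, argument (link target).
    printf 'L /tmp/.X11-unix - - - - %s\n' "$wxc_target" \
        > "$wxc_dir/wslg-x11.conf"
}

# Example: write_x11_symlink_conf /run/tmpfiles.d /mnt/wslg/.X11-unix
```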

@cerebrate
Author

Thanks for the detailed response, and apologies for my lack of earlier response, having had to step away from things previously this week. Unfortunately, while this is kinda-sorta fixed for me in 1.0, there are still some caveats.

  1. According to mount, /tmp/.X11-unix is still being bind mounted (invisibly with systemd enabled, since it's mounted before /tmp), which seems to suggest there are now two different ways of providing /tmp/.X11-unix in play:
...
none on /mnt/wslg/versions.txt type overlay (rw,relatime,lowerdir=/systemvhd,upperdir=/system/rw/upper,workdir=/system/rw/work,xino=off)
none on /mnt/wslg/doc type overlay (rw,relatime,lowerdir=/systemvhd,upperdir=/system/rw/upper,workdir=/system/rw/work,xino=off)
none on /tmp/.X11-unix type tmpfs (rw,relatime,inode64)
drvfs on /mnt/c type 9p (rw,noatime,dirsync,aname=drvfs;path=C:\;uid=1000;gid=1000;uid=996;gid=996;metadata;umask=002;fmask=002;symlinkroot=/mnt/,mmap,access=client,msize=262144,trans=virtio)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
none on /run/user/1000 type tmpfs (rw,relatime,inode64)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=4040024k,nr_inodes=409600,inode64)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
...

Which seems redundant/confusing. Given my druthers, I'd rather keep the bind mount and drop the symlink, on the grounds that a read-only bind mount is less likely to be accidentally damaged; I updated my own code to do that after @benhillis's comment here:

#8888 (comment)

But I would move the bind mount to the point after systemd-tmpfiles-setup.service. That should be no more complicated a generator.

  2. At least on my Debian installation, systemd-tmpfiles-setup.service always fails with exit status 243/CREDENTIALS. This is because of #8996 (Systemd/Snapd: /dev/shm symlink breaks some snaps with "Private Shared Memory" enabled): WSL mounts a tmpfs for shared memory at /run/shm and puts a symlink to it at /dev/shm, rather than the other way around (which is how systemd would mount it itself if it were not running in container mode), and this causes systemd's credential handling to fail. The easy solution is to swap them over (or, what would be my preference, to not run systemd in container mode, because I have a lot of workarounds for various things that don't get done when it's run in container mode).

  3. As per #9158 (Upgrade from 0.70.4 to 1.0 kills /tmp/.X11/X0 iff systemd is enabled), I get a /usr/lib/tmpfiles.d/x11.conf:12: Duplicate line for path "/tmp/.X11-unix", ignoring. warning message if I run the executable behind the service manually. In my case, at least, that's harmless because it's ignoring the lower-priority distro file in favor of the injected file and so the symlink still gets created, but it's still a warning message people may worry about.
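
For the second caveat, a quick way to see which of the two shared memory paths is the symlink on a given system is sketched below. The helper name describe_shm_layout is made up, and it assumes readlink is available.

```shell
# describe_shm_layout DEV_SHM RUN_SHM
# Report which of the two paths, if either, is a symlink and where it points.
describe_shm_layout() {
    dsl_dev=$1
    dsl_run=$2
    if [ -L "$dsl_dev" ]; then
        echo "$dsl_dev is a symlink to $(readlink "$dsl_dev")"
    elif [ -L "$dsl_run" ]; then
        echo "$dsl_run is a symlink to $(readlink "$dsl_run")"
    else
        echo "neither path is a symlink"
    fi
}

# Example: describe_shm_layout /dev/shm /run/shm
```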

@OneBlue
Collaborator

OneBlue commented Nov 17, 2022

Thank you for the followup @cerebrate.

To give a bit more context on why we took the systemd-tmpfiles override instead of injecting a new systemd unit:

  • If systemd doesn't start within a given timeout, init will stop waiting and give control of the distro to the user
  • Unfortunately, if systemd-tmpfiles (or a dependent unit) isn't done when the user gets control, there will be a race where the user might invoke an X11 app before the bind mount / symlink is created.

That's why we went for the systemd-tmpfiles override. Since creating the symlink is atomic, there will either be a symlink or a bind mount at any given time, even if the systemd boot times out (but clearly it's not a perfect solution, as you pointed out).

Having a deeper look at systemd-tmpfiles's documentation, this might help:
Files in /run/tmpfiles.d override files with the same name in /usr/lib/tmpfiles.d.

WSL currently creates /run/tmpfiles.d/wsl.conf, but if that file was instead named /run/tmpfiles.d/x11.conf, it could completely override it (so an empty file should be enough since the bind mount is created by init before starting systemd).

Unfortunately, this does make the assumption that the X11 socket deletion is configured in /usr/lib/tmpfiles.d/x11.conf, which isn't guaranteed (and that's the main challenge of this issue: finding a generic solution that works with all distros), so maybe the best solution would be a combination of both an x11.conf override and a dependent unit to re-create the bind mount if needed, as you suggested.
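
The same-name override rule can be illustrated with a small lookup. This is a sketch that only considers the three system directories /etc/tmpfiles.d, /run/tmpfiles.d, and /usr/lib/tmpfiles.d, in that order of precedence; the helper name effective_tmpfiles_conf and its optional ROOT argument (used to point it at a scratch tree) are assumptions for illustration.

```shell
# effective_tmpfiles_conf NAME [ROOT]
# Print the path of the tmpfiles.d file called NAME that takes precedence,
# or return 1 if no directory contains it. ROOT is prepended to each
# directory so the lookup can be tried against a test tree.
effective_tmpfiles_conf() {
    etc_name=$1
    etc_root=${2:-}
    for etc_dir in /etc/tmpfiles.d /run/tmpfiles.d /usr/lib/tmpfiles.d; do
        if [ -e "$etc_root$etc_dir/$etc_name" ]; then
            echo "$etc_root$etc_dir/$etc_name"
            return 0
        fi
    done
    return 1
}

# Example: effective_tmpfiles_conf x11.conf
```

With an empty /run/tmpfiles.d/x11.conf in place, the lookup returns that file rather than the distro's /usr/lib/tmpfiles.d/x11.conf, which is the effect described above.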

@cerebrate
Author

Aha! From that perspective, that makes sense.

(I confess I did not know that there was a delay for systemd to start - hence my creation of #8886 - because evidently my hardware is old and/or loaded enough that I always get control of the distro before systemd, or even the system dbus, have properly started. That was my main motivation for putting delays for a full startup into bottle-imp. To illustrate:

imp: dbus is not available yet, please wait......................................
imp: systemd is starting up, please wait..........................................................................

that's my WSL startup, all of which code presumably starts after the in-built timeout, and where each . represents a one-second wait. So having been using systemd since, well, before there was systemd support, I've generally internalized the "wait for it or expect errors" paradigm.)
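
A wait loop of that kind can be sketched as below. This is not bottle-imp's actual code; wait_until is a hypothetical helper that polls an arbitrary command once per second and prints a dot per wait, like the output above.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Run COMMAND once per second until it succeeds or TIMEOUT_SECONDS have
# been spent waiting. Prints one dot to stderr per one-second wait.
# Returns 0 on success, 1 on timeout.
wait_until() {
    wu_left=$1
    shift
    until "$@" >/dev/null 2>&1; do
        [ "$wu_left" -gt 0 ] || return 1
        printf '.' >&2
        sleep 1
        wu_left=$((wu_left - 1))
    done
    return 0
}

# Example: wait_until 60 systemctl is-system-running
```

Note that systemctl is-system-running also reports failure for a degraded system, so a real script may want a different probe.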

Considering the X11 socket deletion, it looks like /usr/lib/tmpfiles.d/x11.conf is part of systemd itself and supplied by the systemd package or equivalent on all the distros I've checked, but your point is well taken.

@cerebrate
Author

There's a regression in 1.0.1 which no longer creates the symlink; the race condition is once again mounting /tmp over the existing bind mount.

@OneBlue
Collaborator

OneBlue commented Dec 2, 2022

> There's a regression in 1.0.1 which no longer creates the symlink; the race condition is once again mounting /tmp over the existing bind mount.

Indeed there was!

We just published WSL 1.0.3 (pre-release) which addresses the /tmp mount issue.

This release adds logic to inject a new unit (inspired by @cerebrate's suggestion) to recreate the bind mount if it's not accessible anymore.

To summarize, there are now three layers of protection for the X11 socket:

  1. /tmp/.X11-unix is mounted read-only
  2. WSL injects an x11.conf to override systemd-tmpfiles's default x11 configuration (and get rid of the associated warning if the socket file can't be deleted)
  3. WSL injects a new systemd unit: wslg-mount.service, which will recreate the bind mount (but only if it's not accessible)

Couple notes:

  • Although a mount unit would have been cleaner for wslg-mount.service, it's not possible here because systemd will not actually create the mount if another mountpoint on the same target exists (and in the case where /tmp is mounted over the original /tmp/.X11-unix mountpoint, that wouldn't work)

  • In the case where a /tmp mount is hiding the /tmp/.X11-unix mountpoint, another will be created, which will create another mount entry in /proc/mounts, which is a bit unsatisfying, but shouldn't be an issue

  • If in the future we discover more systemd units that manage to hide / delete our mountpoint, the official solution will be to make sure those units run before wslg-mount.service
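
The "only if it's not accessible" behaviour of the third layer might look something like the sketch below. How WSL actually decides accessibility is not stated in this thread; checking for an X0 entry in the target directory is an assumption, plan_x11_mount is a made-up name, and the function only prints the mount command it would run.

```shell
# plan_x11_mount SRC DST [SOCKET_NAME]
# If DST/SOCKET_NAME (default X0) is reachable, report that nothing needs
# doing; otherwise print the bind mount command that would restore it.
plan_x11_mount() {
    pxm_src=$1
    pxm_dst=$2
    pxm_sock=${3:-X0}
    if [ -e "$pxm_dst/$pxm_sock" ]; then
        echo "ok: $pxm_dst/$pxm_sock is reachable"
    else
        echo "mount --bind $pxm_src $pxm_dst"
    fi
}

# Example: plan_x11_mount /mnt/wslg/.X11-unix /tmp/.X11-unix
```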

@OneBlue OneBlue closed this as completed Dec 2, 2022
@AndASM

AndASM commented Dec 10, 2022

> We just published WSL 1.0.3 (pre-release) which addresses the /tmp mount issue.
>
> To summarize, there are now three layers of protection for the X11 socket:
>
> 1. `/tmp/.X11-unix` is mounted read-only

So that's why my secondary x server is breaking. Because someone decided to randomly make .X11-unix read-only. I wish WSL would provide services instead of trying to rewrite the base system.

@luc-vocab

Has anyone tried wsl 1.0.3 and does it fix WSLg when systemd is enabled?

@g2flyer

g2flyer commented Dec 11, 2022

> Has anyone tried wsl 1.0.3 and does it fix WSLg when systemd is enabled?

I've tried 1.0.3 and wslg does work for me with systemd (although see also my observation in #9158 (comment) on some permission issue with gdm auto-start)

@luc-vocab

looks like my wsl updated itself to 1.0.3 and i'm able to use WSLg with systemd enabled now, so great success as far as i'm concerned.

@AndASM

AndASM commented Dec 11, 2022

> Has anyone tried wsl 1.0.3 and does it fix WSLg when systemd is enabled?

WSLg worked fine with systemd enabled before 1.0.3. You need to run a distribution that is configured to be compatible with it. Such as the newer Ubuntu 22.10 release with the WSL packages installed.

You can look at projects like genie or distrod for the different workarounds needed to get systemd working with WSL. You don't need those full projects, just the configuration changes they make, such as the tmpfiles.d/x11.conf override and the other service units they disable or alter.

The problem is, WSL2 is trying to act as both the init process/service manager (like systemd) and a system container manager in the docker or LXD vein. But it isn't following standards. Instead it is trying to forcibly change configuration and system setup without knowing how the guest system is set up or what it requires. Whatever the true intent is, it feels like a clumsy version of embrace, extend... and some other word.

@OneBlue
Collaborator

OneBlue commented Dec 12, 2022

@AndASM: If you need to write in /tmp/.X11-unix, you can make the mount writable with: mount -o remount,rw /tmp/.X11-unix/

@AndASM

AndASM commented Dec 12, 2022

> @AndASM: If you need to write in /tmp/.X11-unix, you can make the mount writable with: mount -o remount,rw /tmp/.X11-unix/

Thanks @OneBlue! I appreciate it. Fortunately I already know, and have implemented that. (But I much prefer people repeat answers if they think it hasn't been, rather than not answer and leave someone stuck and lost. So genuinely, thank you!)

I created a separate issue (#9303) for the discussion of how making /tmp/.X11-unix read-only breaks a bunch of other software like X servers and remote access software. There are multiple comments with variations of this workaround including my own here. Mine is specifically about adding a line to a systemd service unit to automatically remount the filesystem before running said service.
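
The exact unit change is not shown in this thread, so the following is only a guess at the general shape of such a workaround: a drop-in that remounts the directory read-write before the service starts. The helper name print_remount_dropin and the my-xserver.service placeholder are hypothetical; the + prefix asks systemd to run that one command with full privileges.

```shell
# print_remount_dropin
# Print a service drop-in that remounts /tmp/.X11-unix read-write
# before the service's main command runs.
print_remount_dropin() {
    cat <<'EOF'
[Service]
ExecStartPre=+/bin/mount -o remount,rw /tmp/.X11-unix
EOF
}

# Example (my-xserver.service is a placeholder):
# mkdir -p /etc/systemd/system/my-xserver.service.d
# print_remount_dropin \
#     > /etc/systemd/system/my-xserver.service.d/remount-x11.conf
```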
