Upgrade from 0.70.4 to 1.0 kills /tmp/.X11/X0 iff systemd is enabled #9158

Closed
1 of 2 tasks
g2flyer opened this issue Nov 16, 2022 · 26 comments

@g2flyer

g2flyer commented Nov 16, 2022

Version

10.0.22000.1219

WSL Version

  • WSL 2
  • WSL 1

Kernel Version

5.15.74.2

Distro Version

Ubuntu 22.04

Other Software

No response

Repro Steps

Start the distro with systemd enabled in /etc/wsl.conf using wsl.exe from PowerShell, then run an X application, e.g. xterm.

Expected Behavior

The directory /tmp/.X11-unix should contain an X0 socket and X apps should work.

Actual Behavior

The DISPLAY variable is correctly set to :0, but the directory /tmp/.X11-unix is empty (whereas with systemd disabled it contains an X0 socket and X apps work), so commands such as xterm fail with xterm: Xt error: Can't open display: :0.
Also note that if I log into the system partition before starting the distro, I see /tmp/.X11-unix/X0 and /mnt/wslg/.X11-unix/X0, but the socket gets wiped out (presumably by systemd) once the distro starts ...

This looks closely related to issue #9126: it seems that while 0.70.8 did not have the issue with /tmp/.X11-unix, it had other systemd setup issues, and the original problem probably resurfaced in an attempt to fix the broader issue?
The behaviour is also similar to microsoft/wslg#880, but as that was reported for version 0.70.4.0, which worked for me, it must be a different issue (though maybe with the same root cause?).
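To compare the two locations mentioned above, a small helper along these lines could be used inside the distro. This is only a sketch: the function name x11_socket_status is made up for illustration, and it does nothing beyond testing for a socket file named X<display number>.

```shell
#!/bin/sh
# x11_socket_status DIR [DISPLAY_NUMBER]
# Report whether DIR contains an X11 socket for the given display (default 0).
# Hypothetical helper, not part of WSL or WSLg.
x11_socket_status() (
    dir=$1
    num=${2:-0}
    if [ -S "$dir/X$num" ]; then
        echo "present: $dir/X$num"
    elif [ -e "$dir" ]; then
        echo "missing: $dir exists but has no X$num socket"
    else
        echo "missing: $dir does not exist"
    fi
)

# Example calls (run inside the distro):
#   x11_socket_status /tmp/.X11-unix
#   x11_socket_status /mnt/wslg/.X11-unix
```

Running both example calls with and without systemd enabled shows whether the socket disappears from one location or from both.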

Diagnostic Logs

wsl-logs.zip

@OneBlue
Collaborator

OneBlue commented Nov 16, 2022

Thank you for reporting this @g2flyer. Can you share the output of journalctl when this happens? I wonder if there's a timing issue causing the removal of the socket.

Also, /logs

@ghost

ghost commented Nov 16, 2022

Hello! Could you please provide more logs to help us better diagnose your issue?

To collect WSL logs, download and execute collect-wsl-logs.ps1 in an administrative powershell prompt:

Invoke-WebRequest -UseBasicParsing "https://raw.githubusercontent.com/microsoft/WSL/master/diagnostics/collect-wsl-logs.ps1" -OutFile collect-wsl-logs.ps1
Set-ExecutionPolicy Bypass -Scope Process -Force
.\collect-wsl-logs.ps1

The script will output the path of the log file once done.

Once completed please upload the output files to this Github issue.

Click here for more info on logging

Thank you!

@g2flyer
Author

g2flyer commented Nov 16, 2022

WslLogs-2022-11-16_10-36-23.zip
(I think this should match the same time period/session as the log files above)

@ghost removed the needs-author-feedback label Nov 16, 2022

@g2flyer
Author

g2flyer commented Nov 16, 2022

Oh, I didn't see that the log collection request came from the bot and that the real feedback requested was journalctl. Unfortunately, it will take some time to get that as I'm working in the problematic distro right now. However, I tried a separate minimal vanilla 20.04 distro and did not get the problem there, so a race condition sounds quite plausible. Will update with the journal log once I get a chance ...

@g2flyer
Author

g2flyer commented Nov 16, 2022

OK, here is the requested journalctl.log as well. Hope that helps.

@hajeka

hajeka commented Nov 17, 2022

I have the same issue (running Fedora 36). "journalctl -xe" says: "systemd-socket-proxyd[539]: Failed to get remote socket: Too many open files" whenever I run an X command (e.g. xterm).
After downgrading to 0.70.4 everything works fine again. Version 0.70.5 is also OK, but 0.70.8 gives the same problem as 1.0.0

@elsaco

elsaco commented Nov 17, 2022

/run/tmpfiles.d/wsl.conf was added to create the link from /tmp to /mnt/wslg, where X can set up its socket. However, it fails to do so:

# Note: This file is generated by WSL to prevent systemd-tmpfiles from removing /tmp/.X11-unix during boot.

L /tmp/.X11-unix - - - - /mnt/wslg/.X11-unix

When run manually here is the output:

sudo systemd-tmpfiles --boot --create
/usr/lib/tmpfiles.d/x11.conf:12: Duplicate line for path "/tmp/.X11-unix", ignoring.

After enabling systemd, a link needs to be set up manually to get GUI apps working until a fix arrives:

sudo ln -sf /mnt/wslg/.X11-unix /tmp/.X11-unix

This helps with getting snap and other systemd-dependent apps working.
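One caveat with plain ln -sf: if /tmp/.X11-unix already exists as a directory, the link is created inside it instead of replacing it. A slightly more defensive variant is sketched below. The function name link_x11_dir is hypothetical, and the sketch assumes an ln that supports -n (GNU coreutils and BusyBox do).

```shell
#!/bin/sh
# link_x11_dir TARGET LINK
# Make LINK a symlink to TARGET. An existing empty directory at LINK is
# removed first; a non-empty one makes the function fail rather than delete data.
# Hypothetical helper, not something WSL ships.
link_x11_dir() (
    target=$1
    link=$2
    if [ -d "$link" ] && [ ! -L "$link" ]; then
        rmdir "$link" || exit 1
    fi
    # -n stops ln from following an existing symlink-to-directory at LINK.
    ln -sfn "$target" "$link"
)

# Example call (as root, inside the distro):
#   link_x11_dir /mnt/wslg/.X11-unix /tmp/.X11-unix
```

Calling it twice is harmless: the second call just replaces the existing symlink.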

@kovan

kovan commented Nov 18, 2022

I can confirm the regression after upgrading WSL, and I also can confirm that the fix that @elsaco wrote works.

@OneBlue
Collaborator

OneBlue commented Nov 18, 2022

Thank you @elsaco and @kovan.

I did some digging and I'm starting to suspect that the culprit might be gdm.

Can you both share the output of systemctl status gdm ?

@kovan

kovan commented Nov 18, 2022

~ » systemctl status gdm
Unit gdm.service could not be found.

@OneBlue
Collaborator

OneBlue commented Nov 18, 2022

Interesting. I wonder if you have other units installed that could explain this.

What's the output of systemctl -t service ?

@kovan

kovan commented Nov 18, 2022

  UNIT                                 LOAD   ACTIVE SUB     DESCRIPTION
  console-getty.service                loaded active running Console Getty
  dbus.service                         loaded active running D-Bus System Message Bus
  docker.service                       loaded active running Docker Application Container Engine
  getty@tty1.service                   loaded active running Getty on tty1
  systemd-boot-update.service          loaded active exited  Automatic Boot Loader Update
  systemd-homed-activate.service       loaded active exited  Home Area Activation
  systemd-homed.service                loaded active running Home Area Manager
  systemd-journal-flush.service        loaded active exited  Flush Journal to Persistent Storage
  systemd-journald.service             loaded active running Journal Service
  systemd-logind.service               loaded active running User Login Management
  systemd-network-generator.service    loaded active exited  Generate network units from Kernel command line
● systemd-networkd-wait-online.service loaded failed failed  Wait for Network to be Configured
  systemd-networkd.service             loaded active running Network Configuration
  systemd-remount-fs.service           loaded active exited  Remount Root and Kernel File Systems
  systemd-resolved.service             loaded active running Network Name Resolution
● systemd-sysctl.service               loaded failed failed  Apply Kernel Variables
● systemd-tmpfiles-clean.service       loaded failed failed  Cleanup of Temporary Directories
● systemd-tmpfiles-setup-dev.service   loaded failed failed  Create Static Device Nodes in /dev
● systemd-tmpfiles-setup.service       loaded failed failed  Create Volatile Files and Directories
  systemd-udev-trigger.service         loaded active exited  Coldplug All udev Devices
  systemd-udevd.service                loaded active running Rule-based Manager for Device Events and Files
  systemd-update-utmp.service          loaded active exited  Record System Boot/Shutdown in UTMP
  systemd-user-sessions.service        loaded active exited  Permit User Sessions
  systemd-userdbd.service              loaded active running User Database Manager
  user-runtime-dir@1000.service        loaded active exited  User Runtime Directory /run/user/1000
  user@1000.service                    loaded active running User Manager for UID 1000

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
26 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
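When comparing listings like the one above across distros, it can help to reduce them to just the failed unit names. The filter below is a sketch that works on pasted systemctl -t service output; the name failed_units is made up here. On a live system, systemctl --failed reports the same information directly.

```shell
#!/bin/sh
# failed_units: read `systemctl -t service` style output on stdin and print
# the names of service units whose ACTIVE column is "failed".
# The unit name is field 1, or field 2 when the line starts with a status marker.
failed_units() {
    awk '{
        for (i = 1; i <= 2; i++)
            if ($i ~ /\.service$/ && $(i + 2) == "failed") print $i
    }'
}

# Example call:
#   systemctl -t service | failed_units
```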

@g2flyer
Author

g2flyer commented Nov 18, 2022

@OneBlue at least on my Ubuntu 22.04 -- not sure whether that's also distro @elsaco and @kovan are using ... -- i did have gdm running and, more importantly, once i disable gdm (and restart wsl) the socket survives the systemd start!
BTW: i can also confirm the duplicate line warning @elsaco mentions but the symlink wouldn't work for me (with gdm enabled) as the bind mount already happened and the socket got removed also from /mnt/wslg/.X11-unix ...

@GuiFerreira11

I have the same problem, but with only one of my two WSL distros.

WSL version: 1.0.0.0
Kernel version: 5.15.74.2
WSLg version: 1.0.47
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22621.819

  NAME      STATE      VERSION
* Arch      Running    2
  Ubuntu    Stopped    2

After upgrading from 0.70.4 to 1.0.0, only the Ubuntu distro can still open X apps.

And I also can confirm that the fix that @elsaco wrote works.

@hajeka

hajeka commented Nov 19, 2022

> @OneBlue at least on my Ubuntu 22.04 -- not sure whether that's also distro @elsaco and @kovan are using ... -- i did have gdm running and, more importantly, once i disable gdm (and restart wsl) the socket survives the systemd start! BTW: i can also confirm the duplicate line warning @elsaco mentions but the symlink wouldn't work for me (with gdm enabled) as the bind mount already happened and the socket got removed also from /mnt/wslg/.X11-unix ...

That solved my problem! Thanks @g2flyer

@AzureZeng

  UNIT                                   LOAD   ACTIVE SUB     DESCRIPTION
  console-getty.service                  loaded active running Console Getty
  dbus.service                           loaded active running D-Bus System Message Bus
  getty@tty1.service                     loaded active running Getty on tty1
  ldconfig.service                       loaded active exited  Rebuild Dynamic Linker Cache
  systemd-journal-catalog-update.service loaded active exited  Rebuild Journal Catalog
  systemd-journal-flush.service          loaded active exited  Flush Journal to Persistent Storage
  systemd-journald.service               loaded active running Journal Service
  systemd-logind.service                 loaded active running User Login Management
  systemd-remount-fs.service             loaded active exited  Remount Root and Kernel File Systems
● systemd-sysctl.service                 loaded failed failed  Apply Kernel Variables
● systemd-sysusers.service               loaded failed failed  Create System Users
● systemd-tmpfiles-setup-dev.service     loaded failed failed  Create Static Device Nodes in /dev
● systemd-tmpfiles-setup.service         loaded failed failed  Create Volatile Files and Directories
  systemd-udev-trigger.service           loaded active exited  Coldplug All udev Devices
  systemd-udevd.service                  loaded active running Rule-based Manager for Device Events and Files
  systemd-update-done.service            loaded active exited  Update is Completed
  systemd-update-utmp.service            loaded active exited  Record System Boot/Shutdown in UTMP
  systemd-user-sessions.service          loaded active exited  Permit User Sessions
  user-runtime-dir@0.service             loaded active exited  User Runtime Directory /run/user/0
● user@0.service                         loaded failed failed  User Manager for UID 0

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
20 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
[root@DESKTOP-5H2A6MV Desktop]# systemctl status systemd-tmpfiles-setup.service
× systemd-tmpfiles-setup.service - Create Volatile Files and Directories
     Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-setup.service; static)
     Active: failed (Result: exit-code) since Sun 2022-11-20 02:44:12 CST; 3min 51s ago
       Docs: man:tmpfiles.d(5)
             man:systemd-tmpfiles(8)
   Main PID: 53 (code=exited, status=243/CREDENTIALS)

Nov 20 02:44:12 DESKTOP-5H2A6MV systemd[1]: Starting Create Volatile Files and Directories...
Nov 20 02:44:12 DESKTOP-5H2A6MV systemd[53]: systemd-tmpfiles-setup.service: Failed to set up credentials: Protocol error
Nov 20 02:44:12 DESKTOP-5H2A6MV systemd[53]: systemd-tmpfiles-setup.service: Failed at step CREDENTIALS spawning systemd-tmpfiles: Protocol error
Nov 20 02:44:12 DESKTOP-5H2A6MV systemd[1]: systemd-tmpfiles-setup.service: Main process exited, code=exited, status=243/CREDENTIALS
Nov 20 02:44:12 DESKTOP-5H2A6MV systemd[1]: systemd-tmpfiles-setup.service: Failed with result 'exit-code'.
Nov 20 02:44:12 DESKTOP-5H2A6MV systemd[1]: Failed to start Create Volatile Files and Directories.

Something strange.
I used the Arch Linux distro and ran pacman -Syu; after that these errors appeared, and /tmp/.X11-unix is no longer created automatically.

@Aerocatia

Getting failures on these (Arch Linux, systemd 252.1)

systemd-sysctl.service
systemd-tmpfiles-setup-dev.service
systemd-tmpfiles-setup.service

Could it be related to a newer version of systemd?

@AzureZeng

> Getting failures on these (Arch Linux, systemd 252.1)
>
> systemd-sysctl.service
> systemd-tmpfiles-setup-dev.service
> systemd-tmpfiles-setup.service
>
> Could it be related to a newer version of systemd?

A workaround for this problem: run these commands to disable the LoadCredential setting in the affected units.

sed -i 's/LoadCredential=/#LoadCredential=/' /usr/lib/systemd/system/systemd-tmpfiles-setup-dev.service
sed -i 's/LoadCredential=/#LoadCredential=/' /usr/lib/systemd/system/systemd-tmpfiles-setup.service
sed -i 's/LoadCredential=/#LoadCredential=/' /usr/lib/systemd/system/systemd-sysctl.service
sed -i 's/LoadCredential=/#LoadCredential=/' /usr/lib/systemd/system/systemd-tmpfiles-clean.service
sed -i 's/LoadCredential=/#LoadCredential=/' /usr/lib/systemd/system/systemd-sysusers.service

You may also want to reapply this whenever systemd is upgraded, so just create a pacman hook:

[Trigger]
Type = Package
Operation = Install
Operation = Upgrade
Target = systemd

[Action]
Description = Applying WSLg fixes...
When = PostTransaction
Exec = /bin/sh /etc/wslgfix.sh
# wslgfix.sh is a script file containing the commands above
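A minimal sketch of what /etc/wslgfix.sh could contain, assuming GNU sed (as shipped on Arch). The function name disable_loadcredential and its directory argument are illustrative and not part of the original script. Anchoring the pattern at the start of the line keeps repeated runs from stacking extra # characters.

```shell
#!/bin/sh
# Hypothetical contents for /etc/wslgfix.sh.
# disable_loadcredential DIR: comment out LoadCredential= lines in the five
# unit files named in this thread, skipping any that are not present in DIR.
disable_loadcredential() (
    dir=$1
    for unit in systemd-tmpfiles-setup-dev systemd-tmpfiles-setup \
                systemd-sysctl systemd-tmpfiles-clean systemd-sysusers; do
        file="$dir/$unit.service"
        [ -f "$file" ] || continue
        # ^ anchors the match, so already-commented lines are left untouched.
        sed -i 's/^LoadCredential=/#&/' "$file"
    done
)

# In the real script the last line would be:
#   disable_loadcredential /usr/lib/systemd/system
```

The hook file itself would typically be placed in /etc/pacman.d/hooks/ with a .hook suffix.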

@elsaco

elsaco commented Nov 22, 2022

Instead of modifying the unit files, use overrides. This way the unit files don't need to be re-edited after every systemd upgrade.

Example:

sudo systemctl edit systemd-sysctl will add an override in /etc/systemd/system/systemd-sysctl.service.d/override.conf

and sudo systemctl revert systemd-sysctl to undo it.

@OneBlue
Collaborator

OneBlue commented Nov 22, 2022

Update on this issue: We have published a pre-release for WSL 1.0.1 which contains a fix that should work better: bind mounting /tmp/.X11-unix as read-only. This should prevent daemons like gdm from removing the socket file.

WSL will also create an empty /run/tmpfiles.d/x11.conf to override the default tmpfiles behavior of trying to remove the socket file (to prevent the unit from generating a warning at boot).
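One way to check whether the read-only bind mount is in effect is to look at the mount options of /tmp/.X11-unix. The helper below is a sketch: the name mount_is_readonly is made up, and the example call assumes findmnt from util-linux is installed in the distro.

```shell
#!/bin/sh
# mount_is_readonly OPTIONS
# Succeed (exit status 0) if the comma-separated mount option string contains
# the standalone option "ro". Hypothetical helper for illustration only.
mount_is_readonly() {
    case ",$1," in
        *,ro,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call (inside the distro):
#   if mount_is_readonly "$(findmnt -n -o OPTIONS /tmp/.X11-unix)"; then
#       echo "/tmp/.X11-unix is mounted read-only"
#   fi
```

Wrapping the option string in commas keeps options such as errors=remount-ro from being mistaken for ro.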

@sbradnick
Contributor

The rename of /run/tmpfiles.d/wsl.conf w/ the symlink generation contents to a commented/empty /run/tmpfiles.d/x11.conf appears to have broken using X11 apps out-of-the-box on openSUSE Tumbleweed using v1.0.1.0 (github .msix release). This is for Win10 and Win11 22H2.

I have a /usr/lib/tmpfiles.d/x11.conf which would create a /tmp/.X11-unix directory according to how it's defined, but that doesn't happen when systemd is enabled through /etc/wsl.conf and systemd-tmpfiles-setup.service doesn't come up clean (this seems to be caused by the LoadCredential settings).

For both Win10 and Win11 manually creating the symlink as the old /run/tmpfiles.d/wsl.conf prescribed makes it work in a quick-and-dirty sense.

I can comment out LoadCredential items (via systemctl edit --full <some item>.service) from the 5 systemd .service files mentioned above as well as put L /tmp/.X11-unix - - - - /mnt/wslg/.X11-unix in (a new) /usr/lib/tmpfiles.d/wsl.conf and get things working again at "boot".

So seems like a regression for me, not overall - but now I'm in the reverse position where things were working, but these changes make it not work. I know things aren't universal between distros, so it happens.

@Aerocatia

WSL 1.0.1 did not fix the X11 issue on my end.

firefox
Error: cannot open display: :0
WSL version: 1.0.1.0
Kernel version: 5.15.74.2
WSLg version: 1.0.47
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22621.819

@AzureZeng

The best solution to this problem (for me) would be for WSL's systemd support to handle the LoadCredential option.

@OneBlue
Collaborator

OneBlue commented Dec 2, 2022

Fixed with WSL 1.0.3

More details on #9126 .

@OneBlue closed this as completed Dec 2, 2022

@g2flyer
Author

g2flyer commented Dec 2, 2022

I can confirm that with 1.0.3, having gdm enabled no longer prevents X apps from running. However, I noticed that gdm doesn't start properly on its own (though I can start it manually after boot). Looking at journalctl, it seems to be caused by a permission issue on /tmp/.X11-unix, so there might still be some corner cases? (I also noticed from journalctl that even before 1.0.0 gdm didn't seem to have started properly, but that seemed to be due to different issues.)

BTW: after updating to 1.0.3, Windows also seemed to be in a funky state where \\wsl.localhost\Ubuntu was accessible only as Administrator. Shutting down and restarting WSL didn't help; only a reboot of Windows 11 itself fixed this ...

journalctl.dead-gdm.log.gz

@curbengh

Upgrading to 1.0.3 did not fix it for me; I still need to clear LoadCredential as a workaround.

> Instead of modifying the unit files use overrides. This way unit files don't need to be edited every systemd upgrade.
>
> Example:
>
> sudo systemctl edit systemd-sysctl will add an override in /etc/systemd/system/systemd-sysctl.service.d/override.conf
>
> and sudo systemctl revert systemd-sysctl to undo it.

To expand on this:

# printf is used because echo does not expand \n in every shell (e.g. bash)
sudo mkdir -p /etc/systemd/system/systemd-tmpfiles-setup-dev.service.d/
printf '[Service]\nLoadCredential=\n' | sudo tee /etc/systemd/system/systemd-tmpfiles-setup-dev.service.d/override.conf
sudo mkdir -p /etc/systemd/system/systemd-tmpfiles-setup.service.d/
printf '[Service]\nLoadCredential=\n' | sudo tee /etc/systemd/system/systemd-tmpfiles-setup.service.d/override.conf
sudo mkdir -p /etc/systemd/system/systemd-sysctl.service.d/
printf '[Service]\nLoadCredential=\n' | sudo tee /etc/systemd/system/systemd-sysctl.service.d/override.conf
sudo mkdir -p /etc/systemd/system/systemd-tmpfiles-clean.service.d/
printf '[Service]\nLoadCredential=\n' | sudo tee /etc/systemd/system/systemd-tmpfiles-clean.service.d/override.conf
sudo mkdir -p /etc/systemd/system/systemd-sysusers.service.d/
printf '[Service]\nLoadCredential=\n' | sudo tee /etc/systemd/system/systemd-sysusers.service.d/override.conf

# If you have systemd-resolved running
sudo mkdir -p /etc/systemd/system/systemd-resolved.service.d/
printf '[Service]\nLoadCredential=\n' | sudo tee /etc/systemd/system/systemd-resolved.service.d/override.conf

# Reload unit definitions, then list services to check that none are failed
sudo systemctl daemon-reload
systemctl -t service

You may need to run wsl.exe --shutdown afterwards.
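The per-unit commands above can also be folded into a loop. The sketch below writes the same two-line drop-in for each unit name it is given; the function name write_loadcredential_overrides and the ROOT argument are hypothetical, with ROOT made a parameter so the function can be tried in a scratch directory first.

```shell
#!/bin/sh
# write_loadcredential_overrides ROOT UNIT...
# For each UNIT, create ROOT/UNIT.service.d/override.conf containing an empty
# LoadCredential= assignment, which resets the list set in the packaged unit.
# Hypothetical helper mirroring the manual commands in this comment.
write_loadcredential_overrides() (
    root=$1
    shift
    for unit in "$@"; do
        dir="$root/$unit.service.d"
        mkdir -p "$dir" || exit 1
        printf '[Service]\nLoadCredential=\n' > "$dir/override.conf" || exit 1
    done
)

# Example call (as root), followed by a reload:
#   write_loadcredential_overrides /etc/systemd/system \
#       systemd-tmpfiles-setup-dev systemd-tmpfiles-setup systemd-sysctl \
#       systemd-tmpfiles-clean systemd-sysusers
#   systemctl daemon-reload
```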
