Automatic Nvidia driver installation does not work in current Daily builds #686

Closed
jlnr opened this issue Aug 29, 2023 · 24 comments

Comments

@jlnr
Member

jlnr commented Aug 29, 2023

Hardware: Generic desktop box with Intel Core i7-7700, Nvidia GTX1080, ultrawide 38" Acer screen (3840x1600 75Hz).

When I boot into elementaryos-7.0-stable.20230129rc.iso, Nouveau is in use and everything uses the right resolution right from where plymouth(?) shows the animated spinner below my motherboard logo. Usually, I would then boot into my eOS installation (still using Nouveau) and install the proprietary drivers to get rid of Nouveau's slight glitchiness.

When I boot into elementaryos-7.0-daily.20230829.iso, however, all I see is a black screen. Choosing the "safe graphics" option works and lets me install using 1024x768 VGA drivers, but the resulting system then also requires nomodeset and only works at 1024x768; otherwise I get the same black screen.

I chose to install proprietary drivers during installation, but I assume that doesn't include the Nvidia ones because they need to be downloaded? In any case, I can't find the Nvidia drivers in AppCenter or in the System prefs.

I am not sure if this issue is related to #546 or #324 because it only started failing so recently for me.

Is there anything I can do to see why Nouveau doesn't activate anymore?

@davidmhewitt
Member

The kernel version in the 7.0 daily ISOs is the much newer 6.x Ubuntu HWE kernel. The current stable release of 7.0 uses an older, non HWE kernel, so that would be the major difference.

If you have time to test this, could you check an Ubuntu 22.04 ISO and see if you have the same issue? If not, could you let us know if there's a difference in kernel versions between the two?

Next, could you get a full log file from the elementary OS installation with the optional drivers enabled? In theory, the proprietary Nvidia drivers should be downloaded and installed if you have an internet connection during the install. The log file would tell us whether that's the case.

@jlnr
Member Author

jlnr commented Aug 29, 2023

Re-trying installation of the current Daily ISO:

  • I made sure to be online during the driver installation step. Not sure where to find the full installer log, but this is what showed up in the UI: IMG_8082 - it didn't cause the installation to fail.
  • I then rebooted into my fresh installation (with manual nomodeset) and before doing anything else, I ran sudo apt install nvidia-driver-418-server to see if I could reproduce the error from the installer. This time I got what looks like a different/later error: https://pastebin.com/0qG1rtYj
  • I'm not sure why, but after opening AppCenter, I could immediately see four Nvidia driver versions. (Earlier today I couldn't get them to show up even after a lot of refreshing.) v418 was not in the list so I couldn't retry from here.
  • I installed the most recent version (v535) and rebooted. Now my system shows me a graphical FDE unlock screen, but then drops me into a terminal instead of starting X11, and startx displays a generic error. I know that Nvidia+plymouth+FDE is asking for trouble, but that part actually seemed to work for the first time ever. I didn't dig into the 535 driver issue because I feel this is now the third separate thing in this ticket (after Nouveau and failing v418 installation).

I wouldn't miss Nouveau if the Nvidia driver installation had worked. Having to select "safe graphics" during the initial installation is obvious enough. So I guess fixing the initial Nvidia driver installation is better, because that's all I use Nouveau for anyway. I am really new to the installer/OS side of things. How can I dig into what is going on there?

@jlnr
Member Author

jlnr commented Aug 29, 2023

Ubuntu 22.04.3 has the same issue that only "safe graphics" works from the USB stick. I'll see if the driver installation works for them.

@davidmhewitt
Member

davidmhewitt commented Aug 29, 2023

@jlnr Thanks for the detailed description!

I think the issue with the drivers in the installer is caused by the attempted installation of such an old version of the driver. That version probably also fails to install post-install because it is such an old Nvidia driver being installed on a much newer kernel.

However, this feature in the installer uses ubuntu-drivers internally, so I'm not sure why it would be selecting such an old version.

Could you run ubuntu-drivers list on your freshly installed system and see what drivers it suggests? Equally, is the result of that command the same in a live session of the ISO?

I suspect your third issue, after installing the newer drivers, may come from having partially installed the v418 drivers in your second bullet point. If you were to repeat the process, install elementary OS, and then install v535, do you get a working system?

@jlnr
Member Author

jlnr commented Aug 29, 2023

Before I get back to elementary, some more info from Ubuntu:

  • Kernel is 6.2.0-31-generic #31~22.04.1-Ubuntu
  • Nvidia drivers v535 were successfully installed during installation

This is the list of drivers in the UI, which matches the output from ubuntu-drivers list.

(screenshot: ubuntu-drivers)

I thought that maybe the 418-server one was meant for headless CUDA servers and that's why it's so old/stable, but I guess the "server" means something else?

Now back to the Daily ISO and its ubuntu-drivers.

@jlnr
Member Author

jlnr commented Aug 29, 2023

The output of ubuntu-drivers list in an elementary live session actually looks very similar. It has 418-server up to 535.

Is the issue maybe that this bit of Rust tries to install every package in the list? https://github.com/pop-os/distinst/blob/c6d65568701bf8dbc116153acb330d76b1c32f63/src/installer/steps/configure/chroot_conf.rs#L125-L132

I guess what we want is to install only the latest version (with or without -server, not sure)?
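
As a rough illustration of "install only the latest version", the sketch below picks the highest-numbered `nvidia-driver-*` package from a list of names. The helpers `nvidia_version` and `newest_nvidia_driver` are hypothetical and are not part of distinst; the sketch also assumes the version number always follows the `nvidia-driver-` prefix, as in the names mentioned in this thread.

```rust
/// Extracts the numeric version from names such as `nvidia-driver-535`
/// or `nvidia-driver-418-server`. Returns `None` for any other package.
/// (Hypothetical helper, not taken from distinst.)
fn nvidia_version(package: &str) -> Option<u32> {
    package
        .strip_prefix("nvidia-driver-")?
        .split('-')
        .next()?
        .parse()
        .ok()
}

/// Returns the Nvidia driver package with the highest version number,
/// ignoring packages that are not Nvidia drivers.
fn newest_nvidia_driver<'a>(packages: &[&'a str]) -> Option<&'a str> {
    packages
        .iter()
        .copied()
        .filter_map(|p| nvidia_version(p).map(|v| (v, p)))
        .max_by_key(|&(v, _)| v)
        .map(|(_, p)| p)
}
```

This deliberately does not decide between a `-server` and a non-`-server` package of the same version; that question is left open here, as it is in the comment above.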

@davidmhewitt
Member

Indeed. I've just spotted that too!

It looks like there's a --recommended option for ubuntu-drivers. Does that filter the list down to only a newer one?

@jlnr
Member Author

jlnr commented Aug 29, 2023

Deleted my last comment, I had a typo in there 🤦‍♂️ ubuntu-drivers list --recommended works and suggests exactly v535 and nothing else.

@jlnr jlnr changed the title from "Nouveau used to work for me in OS 5 and 7.0, but stopped working in recent Daily ISOs" to "Automatic Nvidia driver installation does not work in current Daily builds" Aug 29, 2023
@jlnr
Member Author

jlnr commented Aug 29, 2023

Thanks for the pointer to --recommended. I have adjusted the issue title because in my opinion, there's not much point worrying about Nouveau. If anything, it'd probably be better to funnel people harder into installing the proprietary drivers (e.g. by failing the installation when people attempt to do so while offline).

I cannot help much with the distinst thing right now (no Rust environment atm), so I'll try and see what happens if I install no proprietary Nvidia drivers during installation, but then install nvidia-driver-535 manually after the first boot.

@davidmhewitt
Member

I'm going to check this --recommended option on my hardware with a Broadcom NIC later. I hope that's "recommended", or else it breaks the feature for my proprietary hardware 😅

Thanks for opening the PR though. A clean install followed by installing nvidia-driver-535 sounds like a good test!

@jlnr
Member Author

jlnr commented Aug 29, 2023

  • Installed today's Daily ISO without drivers.
  • Booted into "Recovery mode" and ran apt install nvidia-driver-535.
  • Rebooted, unlocked my disk (which is still VGA but not even PopOS uses the native resolution here).
  • After logging in, the UI is using the right drivers at native resolution. 👍

@jlnr
Member Author

jlnr commented Sep 6, 2023

The distinst patch has been merged, but I am a bit lost as to how this change would make it into an installer ISO.

apt show libdistinst-dev says that it comes from https://ppa.launchpadcontent.net/elementary-os/os-patches/ubuntu. But the launchpad page states that the last source update is from 2017. https://github.com/elementary/os-patches doesn't seem to contain anything related to distinst either.

@davidmhewitt
Member

They get manually copied from the Pop PPA into the elementary PPAs in the Launchpad web UI.

I've just requested a sync of the latest version, status available here:
https://launchpad.net/~elementary-os/+archive/ubuntu/os-patches/+packages?field.name_filter=distinst&field.status_filter=published&field.series_filter=

@davidmhewitt
Member

Looks like that worked, so the next Daily ISO to be built should have this included.

@danirabbit danirabbit added this to OS 7.1 Sep 6, 2023
@danirabbit danirabbit moved this to In Progress in OS 7.1 Sep 6, 2023
@jlnr
Member Author

jlnr commented Sep 7, 2023

Oh no...

(screenshot: IMG_8160)

I am not sure why ubuntu-drivers list --recommended now returns two packages. Maybe I already had the header things installed last time? Anyway, it's good that this bug happened for me with just an Nvidia card, otherwise it would have failed on someone's Nvidia+Broadcom setup later on.

Time for the next distinst PR...

@jlnr
Member Author

jlnr commented Sep 7, 2023

Ah duh, it's because --recommended changes the format. It prints all packages separated by spaces, not one per line with metadata after a comma.
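
To illustrate why the format change matters, here is a minimal Rust sketch. The helper `name_before_comma` is hypothetical and is not the actual distinst code. It keeps only the text before the first comma, which suits a line in the plain `list` format but returns the whole line when the package names are separated by spaces. The sample line with `(kernel modules provided by ...)` metadata is an assumption about what the plain format looks like.

```rust
/// Keeps the text before the first comma, as a parser written for the
/// plain `ubuntu-drivers list` format might do.
/// (Hypothetical helper; the real distinst parsing may differ.)
fn name_before_comma(line: &str) -> &str {
    line.split(',').next().unwrap_or("").trim()
}

/// Sample line in the plain format (assumed): package name, comma, metadata.
const PLAIN_LINE: &str =
    "nvidia-driver-535, (kernel modules provided by linux-modules-nvidia-535-generic-hwe-22.04)";

/// Sample line in the `--recommended` format: names separated by spaces.
const RECOMMENDED_LINE: &str =
    "nvidia-driver-535 linux-modules-nvidia-535-generic-hwe-22.04";
```

With these samples, `name_before_comma(PLAIN_LINE)` yields a single package name, while `name_before_comma(RECOMMENDED_LINE)` yields both names glued together as one string, which no package manager would accept as a package name.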

@davidmhewitt
Member

Looking at the code, it's still one driver per line, but the nvidia lines are special and can have two packages on one line:
https://git.launchpad.net/ubuntu/+source/ubuntu-drivers-common/tree/ubuntu-drivers#n479

So in theory, you could have something like:

nvidia-driver-535 linux-modules-nvidia-535-generic-hwe-22.04
bcmwl-kernel-source

I can do a distinst PR to fix that up this weekend if you don't beat me to it 😉

@jlnr
Member Author

jlnr commented Sep 7, 2023

Thanks. My plan is to iterate over each line and then split that by spaces, so that should work. I am already installing Rust, was tempted to learn the language anyway. Let's see how it goes :)
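
A minimal sketch of that plan, assuming the output format described earlier in this thread (one driver per line, possibly with several space-separated package names on a line). The function name `parse_recommended` is made up for illustration, and this is not the code that was eventually merged into distinst. Skipping duplicate names is an extra precaution, not something the output is known to require.

```rust
/// Collects package names from `ubuntu-drivers list --recommended`-style
/// output: every line is split on whitespace, so a line naming two
/// packages contributes two entries. Blank lines and leading or trailing
/// whitespace are ignored, and a name that appears twice is kept once.
fn parse_recommended(output: &str) -> Vec<String> {
    let mut packages: Vec<String> = Vec::new();
    for name in output.lines().flat_map(str::split_whitespace) {
        // Preserve first-seen order while dropping repeats.
        if !packages.iter().any(|p| p.as_str() == name) {
            packages.push(name.to_owned());
        }
    }
    packages
}
```

Because `split_whitespace` already discards surrounding whitespace and empty fragments, no separate trimming step is needed in this version.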

@jlnr
Member Author

jlnr commented Sep 7, 2023

Hmm, so I got the string parsing to work with your example. But I'm hesitant to open my PR because I was wondering if I should install the kernel modules or not, and whether bcmwl-kernel-source wouldn't also list a module package (or the same package name twice), as the format seems to be "%s %s" regardless of whether the package is Nvidia or not.

Taking a step back, maybe it would be easier to use ubuntu-drivers install? I would expect it to do exactly the right thing without any string parsing etc.

https://git.launchpad.net/ubuntu/+source/ubuntu-drivers-common/tree/ubuntu-drivers#n159

The old Ubuntu installer (haven't checked the Flutter version) seems to be doing exactly that, just with optional settings:

https://git.launchpad.net/ubiquity/tree/scripts/simple-plugins?h=jammy#n20

What do you think?

@davidmhewitt
Member

> Hmm, so I got the string parsing to work with your example.

Looks like good Rust! I'd be tempted to add a trim in there too though, just so we avoid any potential issues with leading or trailing whitespace:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=57cfd9a86966bb51cf5d20fcfddb203d

> But I'm hesitant to open my PR because I was wondering if I should install the kernel modules or not, and whether bcmwl-kernel-source wouldn't also list a module package (or the same package name twice), as the format seems to be "%s %s" regardless of whether the package is Nvidia or not.

For the broadcom driver, the output is definitely just bcmwl-kernel-source, even with the --recommended flag. I assume that maybe the modules package for other drivers is just an empty string, so you just get one package on a line by itself.

> Taking a step back, maybe it would be easier to use ubuntu-drivers install? I would expect it to do exactly the right thing without any string parsing etc.

I think the reason I did this originally is that distinst sets flags on apt to make sure packages can be installed from the "cdrom" (ISO). Running ubuntu-drivers install may not be able to use packages from the ISO, because we build our ISO differently from Ubuntu.

You could probably do this in a live session. If you just do a normal apt install bcmwl-kernel-source, does apt without flags pick up the packages on the "cdrom"? If not, is ubuntu-drivers doing anything clever to pass flags to apt when you do ubuntu-drivers install that would make this work?

@jlnr
Member Author

jlnr commented Sep 8, 2023

Running apt install bcmwl-kernel-source in a live session does install packages from cdrom://elementary.../ no matter if I am connected to Wi-Fi or not.

@jlnr
Member Author

jlnr commented Sep 9, 2023

The parsing fix has been merged, @davidmhewitt can you please re-sync the PPA?

@jlnr
Member Author

jlnr commented Sep 10, 2023

Why did you delete your comment? :) Syncing worked, and so did the driver installation using the 2023-09-10 Daily ISO. 🎉 Thanks a lot for your help.

@jlnr jlnr closed this as completed Sep 10, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in OS 7.1 Sep 10, 2023
@davidmhewitt
Member

> Why did you delete your comment? :) Syncing worked, and so did the driver installation using the 2023-09-10 Daily ISO. 🎉 Thanks a lot for your help.

Ah, the GitHub UI showed it as duplicated so I thought it posted twice and I tried to delete just one 😅

Glad to hear it's working now. Thanks for your persistence in testing it and submitting the PRs to distinst!
