Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mesa: 24.1.6 -> 24.2.1, now with 100% less Libgbm Thing(tm) #332413

Merged
merged 1 commit into from
Aug 29, 2024

Conversation

K900
Copy link
Contributor

@K900 K900 commented Aug 5, 2024

Description of changes

Bit too late in the cycle but we can get this in.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

Comment on lines 2 to 5
# When updating this package, please verify at least these build (assuming x86_64-linux):
# nix build .#mesa .#pkgsi686Linux.mesa .#pkgsCross.aarch64-multiplatform.mesa .#pkgsMusl.mesa
# Ideally also verify:
# nix build .#legacyPackages.x86_64-darwin.mesa .#legacyPackages.aarch64-darwin.mesa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these go in passthru.tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't really do that without instantiating a whole bunch of extra nixpkgses.

Copy link
Member

@emilazy emilazy Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you took pkgsi686Linux etc. as arguments they’d still only be instantiated if you evaluated passthru.tests, wouldn’t they?

Edit: Oh, or I guess you’re worried about Hydra?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ofborg/hydra mostly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x86_64-linux probably builds at least pkgsi686Linux.mesa anyway, right? I guess I don’t really know how those package sets work in terms of Hydra eval though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're doing that in intel-media-driver and so far no one complained

inherit (pkgsi686Linux) intel-media-driver;

Copy link
Member

@PedroHLC PedroHLC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Diff LGTM, when possible I'm going to start to test this (patch and updated outputs) with my user base.

But I think this patch deserves some explanation on “what it does?”, “why?”, and “how it achieves that?”. (A short paragraph should be enough)

@K900
Copy link
Contributor Author

K900 commented Aug 7, 2024

Updated to 24.2.0-rc4, also updated the libgbm patch to severely simplify the loading logic.

@Atemu

This comment was marked as resolved.

@PedroHLC
Copy link
Member

PedroHLC commented Aug 14, 2024

  1. 24.2.0 is out -- But remembering newcomers/visitors, we always wait **.1
  2. @K900 so far, nobody showed up with issues related to the patch changes (I didn't rebase to -rc4 patch though).

EDIT: Should be staging and not master for this, right?

@alyssais
Copy link
Member

Have we talked to upstream about the patch yet?

@K900
Copy link
Contributor Author

K900 commented Aug 15, 2024

So, the libgbm thing has been discussed a bunch on Matrix for a very long time, but I never sat down and wrote up a proper rationale for doing this. Here goes.

First, the original problem statement: libgbm is basically a fancy memory allocator API for your GPU. To allocate memory, it needs to talk to your graphics driver (duh).

In the case of Mesa, that driver is Mesa itself, which libgbm is also part of, so Mesa assumes (generally correctly) that the libgbm and Mesa on a system will always be from the same build. This is true for conventional distros, but we're, well, not very conventional. We can totally have applications built against an old libgbm running on top of new Mesa, and our users generally don't expect it to explode.

Unfortunately, it does, because the libgbm Mesa ("dri") backend talks to the driver using Mesa internal APIs, and if those no longer line up, kaboom. In fact, upstream noticed the potential kaboom, and made it so the application just aborts immediately instead of slowly melting.

This means that the state of the art for "I want to run an application built against another libgbm on my new drivers" is "don't", which is not great.

Now, Mesa 24.2 brings with it another problem: libgbm now links libgallium_dri, which is like 40 megabytes of stuff that needs to end up in mesa.out, because that's where libgbm lives. That's not really fatal, but still annoying.

So here's the proposed solution: libgbm actually has support for loadable backends, that's used by vendor drivers to provide their own implementations, just like libglvnd, ocl-icd, vulkan-loader etc already do. What if we use the same functionality for Mesa itself? Then libgbm basically becomes a driver loader, instead of a driver, and we put the matching gbm_dri.so into /run/opengl-drivers along with its Mesa, and never have the problem again.

The reason I think this could actually be wanted upstream as well is that anyone running their own local build of Mesa runs into the same issues unless they LD_PRELOAD the right libgbm, which is awkward and comes with the usual LD_WHATEVER caveats. It also makes the loader logic surprisingly simple (and it can be simplified even more, at the cost of making the patch more invasive) - we basically just try a bunch of named backends in order using the exact same code paths for Mesa and non-Mesa. Moving the libgallium_dri linked bits to a separate library also saves 40MB for any distro that wants to ship the drivers separately like we do.

The downside of this is that gbm_dri becomes public API, so it has to follow the usual GBM backend ABI stability constraints, but this doesn't seem to be a problem in practice, as the API has not changed in years.

Also, this does potentially unlock decoupling libgbm from Mesa entirely, which is something no one but NixOS probably cares about, but whatever.

@ofborg ofborg bot added the ofborg-internal-error Ofborg encountered an error label Aug 15, 2024
@K900
Copy link
Contributor Author

K900 commented Aug 16, 2024

@ofborg eval

@vcunat vcunat changed the title mesa: 24.1.x -> 24.2-rc, implement The Libgbm Thing(tm) mesa: 24.1.x -> 24.2.x, and implement The Libgbm Thing(tm) Aug 16, 2024
@K900
Copy link
Contributor Author

K900 commented Aug 26, 2024

I've split the commit in half: the Mesa bump itself is actually pretty trivial if we're fine with the 40MB $out increase, and The Libgbm Thing(tm)(r) applies very minimal changes on top of that.

@Atemu
Copy link
Member

Atemu commented Aug 26, 2024

Have you talked with upstream about this yet?

@K900
Copy link
Contributor Author

K900 commented Aug 26, 2024

No, I've had some Real Life happening lately. I'll see if I can reach out this week.

@github-actions github-actions bot added 6.topic: python 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS labels Aug 29, 2024
@github-actions github-actions bot removed 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: GNOME GNOME desktop environment and its underlying platform 6.topic: qt/kde 8.has: documentation This PR adds or changes documentation 8.has: changelog 6.topic: emacs Text editor 6.topic: golang 6.topic: ruby 6.topic: vim 6.topic: stdenv Standard environment 6.topic: nodejs 6.topic: lua 6.topic: systemd 6.topic: Enlightenment DE The Enlightenment Desktop Environment 6.topic: lib The Nixpkgs function library 6.topic: php 6.topic: dotnet Language: .NET labels Aug 29, 2024
@K900 K900 changed the title mesa: 24.1.x -> 24.2.x, and implement The Libgbm Thing(tm) mesa: 24.1.x -> 24.2.1, now with 100% less Libgbm Thing(tm) Aug 29, 2024
@K900 K900 changed the title mesa: 24.1.x -> 24.2.1, now with 100% less Libgbm Thing(tm) mesa: 24.1.6 -> 24.2.1, now with 100% less Libgbm Thing(tm) Aug 29, 2024
@K900 K900 marked this pull request as ready for review August 29, 2024 05:22
@K900 K900 mentioned this pull request Aug 29, 2024
13 tasks
@K900
Copy link
Contributor Author

K900 commented Aug 29, 2024

Split off the libgbm thing into #338109, because this cycle has been slow enough to just get this in now that 24.2.1 is out.

@K900 K900 merged commit 4c1eeee into NixOS:staging-next Aug 29, 2024
27 of 29 checks passed
@NickCao
Copy link
Member

NickCao commented Aug 29, 2024

Shall we also address #328923?

@K900
Copy link
Contributor Author

K900 commented Aug 29, 2024

We should, but there's no rush on that.

@cole-h cole-h removed the ofborg-internal-error Ofborg encountered an error label Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants