[Tracking] Agree on a name for hardware.graphics #323675

Open
SomeoneSerge opened this issue Jun 30, 2024 · 13 comments
Labels
6.topic: cuda Parallel computing platform and API 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: rocm


@SomeoneSerge
Contributor

SomeoneSerge commented Jun 30, 2024

Issue description

The hardware.opengl.enable option, as well as the related addOpenGLRunpath hook and the /run/opengl-driver path, have historically been very confusing (e.g. it's not obvious one has to "enable opengl" in order to use cuda). #141803 introduced an arguably better name for the hook, addDriverRunpath: although it's not obvious what a "driver" is until you read the documentation, the new name isn't misleading anymore. It's also less ambiguous as to which characters could be upper-case. The /run/opengl-driver/lib path is so far kept for compatibility purposes, which is fine because it's mostly invisible to the users. #320228 shattered the status quo concerning the NixOS option name, introducing hardware.graphics.enable. However, this might not yet be the permanent solution.
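For reference, a minimal sketch of the option rename described above, as it would look in a configuration.nix. It assumes a NixOS release that already includes #320228; the older spelling is kept only as a comment.

```nix
# configuration.nix (fragment)
{ ... }:
{
  # Name before #320228:
  # hardware.opengl.enable = true;

  # Name introduced by #320228:
  hardware.graphics.enable = true;
}
```

Neither spelling mentions CUDA, Vulkan, or OpenCL, which is the source of the confusion described above.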

The hook and the option should have consistent names: addDriverRunpath doesn't trivially map into hardware.graphics.enable.
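To make the mismatch concrete, here is a rough sketch of how a package might use the hook. The package name my-gpu-tool and its layout are hypothetical, and the sketch assumes the build installs a binary at $out/bin/my-gpu-tool.

```nix
# Hypothetical package; `my-gpu-tool` is illustrative only.
{ stdenv, addDriverRunpath }:

stdenv.mkDerivation {
  pname = "my-gpu-tool";
  version = "0.1.0";
  src = ./.;

  # Brings the `addDriverRunpath` shell function into scope.
  nativeBuildInputs = [ addDriverRunpath ];

  # Append the driver library directory to the RUNPATH of the binary,
  # so that it can locate the drivers at runtime.
  postFixup = ''
    addDriverRunpath "$out/bin/my-gpu-tool"
  '';
}
```

On the package side the word is "driver", while on the NixOS side the user enables "graphics".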

A number of alternative names have been suggested [1][2][3]. More notably, an alternative and more fine-grained configuration scheme was proposed by @Atemu [4].

Common concerns that have been raised:

  • hardware.{gp,}gpu, hardware.graphics, hardware.opengl, etc aren't actually just about GPUs, graphics, or OpenGL; these options will concern a variety of external devices (e.g. TPUs) and integrated processors' features, and a variety of software stacks (e.g. Vulkan, OpenCL).
  • hardware.{impure-,}drivers, hardware.accelerators, even if technically correct, are non-suggestive; they're likely to confuse the user on the first encounter.

A related note is that Nixpkgs (pkgs) also has options, and some of them concern "accelerators" and "devices": config.cudaSupport, config.cudaCapabilities, config.rocmSupport; @NixOS/cuda-maintainers (CC @NixOS/rocm-maintainers) have been considering aggregating these kinds of options into a hierarchy: e.g. config.accelerators.{cuda,rocm}.enable instead of {cuda,rocm}Support, config.accelerators.cuda.gencodes instead of cudaCapabilities, etc. We have also been considering adding similar global configuration options for OpenCL, MKL, etc., which would also fit reasonably well under config.accelerators. Whatever the name, Nixpkgs' option names should be consistent with NixOS and the hook too.
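As an illustration, this is roughly how the existing package set options are passed today, with the proposed hierarchy shown only as a comment. The accelerators.* names are the proposal from this paragraph and are not implemented; the value format for gencodes is an assumption.

```nix
# Existing Nixpkgs config options named above.
import <nixpkgs> {
  config = {
    allowUnfree = true;            # CUDA packages are unfree
    cudaSupport = true;
    cudaCapabilities = [ "8.6" ];

    # Proposed (hypothetical, not implemented) equivalent:
    # accelerators.cuda.enable = true;
    # accelerators.cuda.gencodes = [ "8.6" ];
  };
}
```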

CC @Atemu @K900

Footnotes

  1. https://github.com/NixOS/nixpkgs/pull/320228#issuecomment-2183092286

  2. https://github.com/NixOS/nixpkgs/pull/320228#issuecomment-2171511374

  3. https://github.com/NixOS/nixpkgs/pull/276558

  4. https://github.com/NixOS/nixpkgs/issues/141803#issuecomment-2010803483

@SomeoneSerge SomeoneSerge added 6.topic: cuda Parallel computing platform and API 6.topic: rocm 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS labels Jun 30, 2024
@Aleksanaa
Member

hardware.userSpaceDrivers (too long, but relatively unambiguous)

@SomeoneSerge
Contributor Author

hardware.userSpaceDrivers (too long, but relatively unambiguous)

My argument for impure-drivers is that there are "userspace drivers" which we link purely and they just work. Ideally, we shouldn't be needing an option like hardware.opengl.enable, and we shouldn't be needing /run/opengl-driver. But we do, so we grant ourselves this one ugly hack, and we provide a convenience option for NixOS users to support this semi-impure Nixpkgs software automatically, while non-NixOS users have to deal with wrappers and containers.

I also like accelerators/acceleration, because then there's nothing to argue about wrt the package set options

@K900
Contributor

K900 commented Jul 1, 2024

We can fix that on non-NixOS too, if we just load the drivers from normal locations like /usr. I don't like impure because that implies there's a pure option, which there isn't. Also, I was going to propose a change following @Atemu's plan (e.g. hardware.opengl.drivers.mesa.enable = true;, etc), for the next staging cycle (because it's annoyingly invasive to all the loaders).

@SomeoneSerge
Contributor Author

We can fix that on non-NixOS too, if we just load the drivers from normal locations like /usr

Hm, this has far-reaching consequences. To clarify, are you suggesting that we selectively let loaders like libglvnd scan /usr, or that we extend addDriverRunpath to patchelf all of Nixpkgs this way?

@K900
Contributor

K900 commented Jul 1, 2024

We let loaders scan /usr, yes.

@Atemu
Member

Atemu commented Jul 1, 2024

because it's annoyingly invasive to all the loaders

The NixOS options API change can happen independently from the loader implementation. We don't need to have all the functionality which the new API could provide us with from the get-go. Mapping the existing functionality onto it would be a perfectly reasonable first step IMHO.

We can fix that on non-NixOS too, if we just load the drivers from normal locations like /usr.

I actually tried this a while back and this is not trivially possible as you still get glibc mismatches.

@Atemu
Member

Atemu commented Jul 1, 2024

Now onto the hardest problem in all computer science: Naming.

Before we discuss different names, I think it's good to first define how to measure their merit.
I think a good benchmark for naming NixOS options is whether a non-technical user would be able to discover the relevant option via text search or, conversely, how tech-savvy they would need to be in order to know the correct terms to find the desired option.

I believe the term "drivers" to be perfectly understandable by non-technical people and even the mainstream to a degree. Hardware drivers are not a new concept or term that we'd need to teach to the current "target audience" of NixOS.
If a NixOS user searches for "nvidia drivers" or "nvidia", they should trivially be able to discover e.g. hardware.acceleration.drivers.nvidia.enable.

I don't like the term "accelerators" or "acceleration [APIs]". To know this term or connect it with configuring your GPU, you must have a bit of experience in the field of accelerated computing. As a recent CS graduate myself, this isn't something I'd expect my fellow graduates to know. While they will have at least made contact with accelerated compute and likely heard of OpenGL and/or CUDA, most will likely not connect this term with it.

There are two saving graces here however:

  • You're unlikely to need to configure this unless you have experience with accelerated computing; it's more something for advanced users, whom you can expect to learn these terms.
  • The individual APIs have names which are quite well known. Even though a Linux gamer won't think to search for "acceleration api", they will likely know to search for "vulkan" and would therefore be able to discover e.g. hardware.acceleration.api.vulkan.enable.

I also simply do not think there is a better term.

I concur with @K900's opinion on the impure prefix.


One could do away with the term "driver" entirely here by calling userspace drivers acceleration API implementations instead. This way we wouldn't cause any confusion about whether the "driver" is a kernel driver or a userspace library.
I don't think this is worth pursuing though because referring to accel API implementations as "drivers" is quite common. Namespacing both API and implementation/driver config under e.g. hardware.acceleration.{driver,api} should make it clear enough.
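A minimal sketch of what such a namespaced interface could look like as a NixOS module. Everything here is hypothetical: none of these options exist upstream, the option descriptions are made up, and the plural drivers follows the hardware.acceleration.drivers.nvidia.enable example given earlier in this comment.

```nix
# Hypothetical NixOS module; option names follow the proposal in this thread.
{ lib, ... }:
{
  options.hardware.acceleration = {
    # Userspace "drivers", i.e. implementations of the APIs below.
    drivers = {
      nvidia.enable = lib.mkEnableOption "the NVIDIA userspace driver";
      mesa.enable = lib.mkEnableOption "the Mesa userspace drivers";
    };

    # Acceleration APIs, which users are likely to search for by name.
    api = {
      vulkan.enable = lib.mkEnableOption "Vulkan";
      opengl.enable = lib.mkEnableOption "OpenGL";
    };
  };
}
```

A text search for "nvidia" or "vulkan" would match these option paths directly, which is the discoverability criterion described above.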

@samuela
Member

samuela commented Jul 1, 2024

Random thoughts:

  1. I agree that impure is a bit confusing: implies the existence of a pure option, and isn't immediately obvious to me in what sense it is "impure". Also worth bearing in mind that this will be the official, recommended setting.
  2. I like the term "accelerator". It's future proof, captures the essence of what is actually happening (afaiu), and is an increasingly popular term in the ML/computing/GPU/TPU/etc world since GPUs aren't really about graphics anymore.
  3. I'm not too concerned about search-ability. I figure most everyone will be finding these options via googling and finding things on the wiki and/or manual. Search engines are already smart enough to find the right options on the right pages despite currently having a name that doesn't match function.
  4. Seems to me that part of the challenge here is that we've put a lot of varied software behind a single config flag. Could we split this up into multiple flags with smaller scope?
  5. tbh I don't feel the name matters all that much. At least just about anything will be an improvement over hardware.opengl.enable.

@SomeoneSerge
Contributor Author

Smirking: "A non-technical user will be confused by "drivers" much less than a technical one"

@SomeoneSerge
Contributor Author

We let loaders [s]can /usr[/lib], yes.

Since this was brought up, I'll mention #273389: here the goal was to allow one library (libcuda.so from cuda_compat) to load certain dependencies (libnvrm*.so) from the "system locations". I think this is fairly similar. The complication was that the libnvrm*.so libraries on their own were incomplete and had transitive dependencies, which in turn caused conflicts with the respective Nixpkgs* libraries (e.g. libstdc++) depending on the order in which the libraries and the driver were loaded into the process.

@K900
Contributor

K900 commented Jul 4, 2024

Yeah we can't have it perfectly working, but we can at least have it usually-mostly-working, as long as your nixpkgs libc is ~newer than your drivers.

@SomeoneSerge
Contributor Author

Yeah we can't have it perfectly working, but we can at least have it usually-mostly-working, as long as your nixpkgs libc is ~newer than your drivers.

Unless maybe we mess around with the dynamic loader. I'm still comprehending the implications of #248547 (comment). Is there a separate issue to track /usr/lib and /etc/ld.so.cache for driver loaders?

@SomeoneSerge
Contributor Author

I didn't comment on the previous two on-topic proposals though.

hardware.acceleration.api.vulkan.enable
hardware.acceleration.drivers.nvidia.enable

Opinion: a path longer than hardware.${something}.vulkan.enable would be unreasonable.

As a matter of fact, I'd be happy with hardware.{cuda,vulkan,opengl}&c if it were not for the confusing history of hardware.opengl.enable (actually, that might still work out)

As a thought experiment concerning the package set options: do you feel like import <nixpkgs> { config = { hardware.cuda.enable = true; hardware.cuda.capabilities = [ "8.6" ]; }; } reads as grammatical and meaningful? Compared to { accelerators.cuda.enable = true; }?
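Spelled out side by side, the two candidate config attribute sets from the thought experiment look roughly like this. Both spellings are hypothetical; neither is a real Nixpkgs option today.

```nix
# Both attribute sets are hypothetical Nixpkgs `config` values.
{
  withHardwarePrefix = {
    hardware.cuda.enable = true;
    hardware.cuda.capabilities = [ "8.6" ];
  };

  withAcceleratorsPrefix = {
    accelerators.cuda.enable = true;
  };
}
```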

impure is a bit confusing: implies the existence of a pure option,

But there is one. As K900 had pointed out months ago, ROCm links a runtime driver but one with a stable uapi which "just works" regardless of versioning.

I agree this is probably not worth bringing to the users' attention.

I don't like the term "accelerators" or "acceleration [APIs]". To know this term or connect it with configuring your GPU

The option will be discoverable via man configuration.nix, via the online manual, via google. Once found, it'll easily make sense.

Before we discuss different names, I think it's good to first define how to measure their merit.

All that you said plus more global coherence and consistency. As I mentioned, I'd like the Nixpkgs' internals like the hooks and their APIs, the package set options, the NixOS options, the interfaces of wrappers (NixGL, nixglhost, etc; note: might be relevant even after the loader changes proposed by K900), and possibly the paths like /run/opengl-driver to "make sense" together, feel like parts of a single design.
