
cudaPackages: extend cc-wrapper and rewrite cudaStdenv to properly solve libstdc++ issues #226165

Closed
SomeoneSerge opened this issue Apr 14, 2023 · 6 comments
Labels: 6.topic: cuda (Parallel computing platform and API)

Comments

@SomeoneSerge
Contributor

NVCC dictates the host C++ compiler version we use to build CUDA programs, and it often drifts behind nixpkgs' stdenv. At the time of writing, nixpkgs' default nvcc uses gcc11, but the stdenv on Linux is already at gcc12. When we pass gcc11 to NVCC the naive way, the builds pick up gcc11's older libstdc++ and we face runtime issues when dynamically linking against normal nixpkgs libraries that expect the newer libstdc++: #220341. For this reason we want CUDA programs built with gcc11 (or whichever toolchain is pinned) to link a libstdc++ compatible with the rest of nixpkgs. Apparently this is also what the nixpkgs LLVM toolchains do, because libraries linked against libc++ could not have co-existed with those linked against libstdc++.

To accommodate this, we currently abuse the libcxx argument of wrapCCWith: https://github.com/SomeoneSerge/nixpkgs/blob/8fd02ce2c20b8c7f6b7be39681cdf71a5b4847e2/pkgs/development/compilers/cudatoolkit/stdenv.nix#L14-L20
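
Roughly, the trick takes the following shape. This is a paraphrased sketch with approximate names rather than the literal contents of that file; nixpkgsCompatibleLibstdcxx stands in for whatever exposes the default gcc's newer libstdc++ as a package.

# Paraphrased sketch, not the literal stdenv.nix: wrap the nvcc-compatible gcc,
# but hand cc-wrapper the *newer* libstdc++ through the libcxx argument, which
# cc-wrapper then propagates to consumers' buildInputs.
backendStdenv = overrideCC stdenv (wrapCCWith {
  cc = gcc11.cc;                        # the host compiler nvcc supports
  libcxx = nixpkgsCompatibleLibstdcxx;  # approximate name: the default gcc's libstdc++, packaged
});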

This actually has worked for a number of packages, cf. #223664, but for entirely accidental reasons: wrapCCWith propagates libcxx to downstream packages' buildInputs, and cc happily picks it up. Unless it doesn't: #225661, #224150 (comment)

The better solution would be to adjust cc-wrapper to our needs, as discussed in #225661 (comment)

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/cuda-team-roadmap-update-2023-08-29/32379/1

@RuRo
Contributor

RuRo commented Oct 1, 2023

By the way, is there any quick-and-dirty way to fix this? Just to get something workable while we wait for a "proper" fix.

For example, would it be possible to just patch/.override cuda and/or nvcc to use gcc12 and "rebuild the world", or would that not work?

@SomeoneSerge
Contributor Author

By the way, is there any quick-and-dirty way to fix this?

The current "quick&dirty" way is to use cudaPackages.backendStdenv instead of stdenv, which should work to the same effect (expose the older nvcc-compatible gcc with the newer nixpkgs-compatible libstdc++). Somehow we didn't patch opencv to use the hack:

Usage example:

stdenv = if cudaSupport then backendStdenv else inputs.stdenv;
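
For an out-of-tree package, the same pattern at a callPackage site might look like this minimal sketch, where myCudaApp and my-cuda-app.nix are hypothetical names:

# Minimal sketch (hypothetical package): build with the CUDA-compatible stdenv
# so that nvcc's host gcc and the libstdc++ the rest of nixpkgs expects stay consistent.
myCudaApp = pkgs.callPackage ./my-cuda-app.nix {
  stdenv = pkgs.cudaPackages.backendStdenv;
};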

The present issue is open because cudaPackages.backendStdenv is implemented in a rather dumb and fragile way

would it be possible to just patch/.override cuda and/or nvcc to use gcc12

There are a bunch of places where you could do the override, but you'd also likely need to add --allow-unsupported-compiler to something like NVCC_PREPEND_FLAGS, and even then the build might fail depending on the cudatoolkit version. So far it seems more reliable to use the compiler NVIDIA declares compatible.
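
For completeness, an untested sketch of that route, assuming a CUDA release new enough (11.5+) that nvcc reads the NVCC_PREPEND_FLAGS environment variable; somePackage and gcc12 are stand-ins, and --allow-unsupported-compiler only silences nvcc's version check, it does not make gcc12 a supported host compiler:

# Untested sketch of the "force gcc12" route: point nvcc at gcc12 explicitly
# and silence its host-compiler version check. Expect breakage depending on
# the cudatoolkit version.
somePackage.overrideAttrs (old: {
  preConfigure = (old.preConfigure or "") + ''
    export NVCC_PREPEND_FLAGS="--allow-unsupported-compiler -ccbin ${gcc12}/bin/g++"
  '';
})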

@SomeoneSerge
Contributor Author

The CUDA-specific part has been addressed by #275947 (again, thanks to rrbutani for highlighting the gccForLibs argument and the NIX_LDFLAGS ordering). This can be verified by building jaxlib without modifying the ldflags ad hoc (as was done in the previous PoC). The general solution for mixing shared libraries built by toolchains of different ages still needs more work (cf. the follow-ups in the PR).
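
Heavily paraphrased, the direction of that fix looks roughly like the following; this is a sketch of the idea rather than the PR's literal contents, and the attribute names are approximate:

# Sketch of the idea behind #275947 (not its literal contents): let cc-wrapper
# itself pick up libstdc++ and the other runtime libraries from the default
# toolchain via gccForLibs, instead of smuggling libstdc++ in through libcxx.
backendStdenv = overrideCC stdenv (wrapCCWith {
  cc = gcc11.cc;
  gccForLibs = stdenv.cc.cc;  # the default toolchain's gcc, used only for its libraries
});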

The github-project-automation bot moved this from 🔮 Roadmap to ✅ Done in CUDA Team on Jan 20, 2024.
@nh2
Contributor

nh2 commented Jun 17, 2024

@SomeoneSerge Is it currently possible to build CUDA-using apps and libraries with clang?

For example, OpenCV now has this:

# It's necessary to consistently use backendStdenv when building with CUDA
# support, otherwise we get libstdc++ errors downstream
stdenv = throw "Use effectiveStdenv instead";
effectiveStdenv = if enableCuda then cudaPackages.backendStdenv else inputs.stdenv;

If I want to build OpenCV with clang to be able to use e.g. some of its sanitizers like ASan for debugging crashes, can I do that while still keeping CUDA support?

Is cudaPackages.backendStdenv GCC-only or can I override that somehow to get a cudaPackages where backendStdenv uses clang?

Thank you!

@SomeoneSerge
Contributor Author

Hi @nh2! The intention is for opencv and other downstream code to stay as it is, and for the user to pass in a modified cudaPackages instance whose backendStdenv wraps a clang compiler. You can use cudaPackages' = cudaPackages.overrideScope (...) to make one. We have a bunch of nvcc compatibility checks that are currently only implemented for gcc; you'd have to carefully disable those if working out of tree, or relax the checks to allow clang if contributing to Nixpkgs.
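
A rough sketch of that overrideScope pattern; the inner attribute names (in particular the stdenv argument passed to backendStdenv.override) are hypothetical and differ between nixpkgs revisions, so check the backendStdenv expression in your revision and expect to relax the gcc-only checks:

# Rough out-of-tree sketch: derive a cudaPackages set whose backendStdenv is
# clang-based. The `stdenv` argument name is a guess; inspect backendStdenv in
# your nixpkgs revision for the real interface.
cudaPackages' = cudaPackages.overrideScope (final: prev: {
  backendStdenv = prev.backendStdenv.override {
    stdenv = pkgs.llvmPackages.stdenv;
  };
});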
