cudaPackages.setupCudaHook: propagate deps and the hook #271078
Conversation
I maybe broke …
Force-pushed from 2ab2ebb to 39f19c1.
Force-pushed from dc7de32 to 02aed8c.
The diff hunk under review:

```nix
# https://gist.github.com/fd80ff142cd25e64603618a3700e7f82
depsTargetTargetPropagated = [

# One'd expect this should be depsHostHostPropagated, but that doesn't work
propagatedBuildInputs = [
```
Here's evidence that propagatedHostTargetDeps (`propagatedBuildInputs`) works and HostHost doesn't: https://gist.github.com/SomeoneSerge/7d0e633743175bee6470b03281fe74e1. The same goes for BuildHost instead of BuildBuild when propagating the hook to nvcc's users. Whatever the reason, a transitive dependency is assigned the offsets (prevHost + relHost, prevHost + relTarget) rather than (prevHost + relHost, prevTarget + relTarget):
nixpkgs/pkgs/stdenv/generic/setup.sh, lines 552 to 560 in 82fa717:

```bash
function mapOffset() {
    local -r inputOffset="$1"
    local -n outputOffset="$2"
    if (( inputOffset <= 0 )); then
        outputOffset=$((inputOffset + hostOffset))
    else
        outputOffset=$((inputOffset - 1 + targetOffset))
    fi
}
```
The relevant message in the cross-compilation matrix: https://matrix.to/#/%23cross-compiling%3Anixos.org/%248lsLV15zqjt5JdopUy-3aD3o8ym1NdITs18bZRZXMc8?via=someonex.net&via=matrix.org&via=catgirl.cloud&via=nixos.dev (yet to receive replies)
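The offset arithmetic above can be checked outside of a build by re-running `mapOffset` with hand-picked values. A minimal sketch (the `mapOffset` body is copied from the snippet above; the scenario offsets are illustrative assumptions):

```shell
#!/usr/bin/env bash
# Standalone re-implementation of mapOffset (as quoted from setup.sh above)
# so the offset composition can be probed directly. Needs bash >= 4.3 for
# the `local -n` nameref.
function mapOffset() {
    local -r inputOffset="$1"
    local -n outputOffset="$2"
    if (( inputOffset <= 0 )); then
        outputOffset=$((inputOffset + hostOffset))
    else
        outputOffset=$((inputOffset - 1 + targetOffset))
    fi
}

# Suppose the current input is a nativeBuildInput:
# (hostOffset, targetOffset) = (-1, 0).
hostOffset=-1
targetOffset=0

# A depsHostHostPropagated entry has relative offsets (0, 0). Both components
# are <= 0, so BOTH pick up hostOffset: (prevHost + relHost, prevHost + relTarget).
mapOffset 0 newHost
mapOffset 0 newTarget
echo "HostHost maps to ($newHost, $newTarget)"      # (-1, -1), i.e. BuildBuild

# A propagatedBuildInputs entry has relative offsets (0, 1). The target
# component is > 0 and picks up targetOffset instead:
mapOffset 0 newHost
mapOffset 1 newTarget
echo "HostTarget maps to ($newHost, $newTarget)"    # (-1, 0), i.e. BuildHost
```

This reproduces the observation above: relative offsets that are zero or negative are composed against `hostOffset` only, which is why the HostHost variants don't end up where one would expect.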
Force-pushed from 02aed8c to 535da99.
I suspect that most of this PR wouldn't have been required with #102613
Force-pushed from 8a7abaf to 895067e.
The builds verified so far:

```console
❯ nix build -f ./. --arg config '{ cudaSupport = true; cudaCapabilities = [ "8.6" ]; allowUnfree = true; }' -L python3Packages.{torch,torchaudio,torchvision,opencv4} cctag --print-out-paths --keep-going
/nix/store/csa6zg49ih3zzfnh28ins5sy0xsa5iiz-python3.11-torch-2.1.1
/nix/store/4sdd2281hbz4cp0f24zinsyq6ma2n83i-python3.11-torchaudio-2.1.1
/nix/store/jr9z2jn008xl6vd17n1y5papl8m2spaj-python3.11-torchvision-0.16.1
/nix/store/v1p716yhdx99jldx7s4vmmrsrl0s6xpj-opencv-4.7.0
/nix/store/i4c5j077qqi25kz8y946ggj7axifz715-cctag-1.0.3
```

IIRC the closure for … One odd thing I observe is that …
Force-pushed from 895067e to c54f16d.
Back to ~6G: …
Force-pushed from c261530 to 44c7292.
Cf. explanations in NixOS#271078 (cherry picked from commit d031523)
cudaPackages.cuda_nvcc: fix setupCudaHook propagation

This is useful for the CUDA variants of packages like opencv and pytorch, whose xxxxConfig.cmake files do find_package(CUDAToolkit REQUIRED) regardless of whether they actually use it. With the propagated hook, we no longer have to manually add CUDA dependencies into torch's/opencv's reverse dependencies.
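For illustration, a propagated setup hook that wants `find_package(CUDAToolkit REQUIRED)` to succeed essentially has to inject the toolkit prefix into the consumer's CMake flags. A minimal sketch in the style of nixpkgs setup hooks (the function name `appendCUDAToolkitRoot` and the variable `cudaToolkitRoot` are illustrative assumptions, not the actual setupCudaHook source):

```shell
# Hypothetical sketch of the hook's job: point CMake's CUDAToolkit search at a
# known prefix by appending -DCUDAToolkit_ROOT to cmakeFlags, the way nixpkgs
# setup hooks commonly extend cmakeFlags before configurePhase runs.
appendCUDAToolkitRoot() {
    cmakeFlags="${cmakeFlags:-} -DCUDAToolkit_ROOT=${cudaToolkitRoot}"
}

cudaToolkitRoot=/tmp/fake-cuda-prefix   # stand-in for a real store path
appendCUDAToolkitRoot
echo "cmakeFlags:${cmakeFlags}"
```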
Force-pushed from 494b0bb to 2df7ccf.
Backport failed for `staging-23.11`. Please cherry-pick the changes locally and resolve any conflicts:

```shell
git fetch origin staging-23.11
git worktree add -d .worktree/backport-271078-to-staging-23.11 origin/staging-23.11
cd .worktree/backport-271078-to-staging-23.11
git switch --create backport-271078-to-staging-23.11
git cherry-pick -x be9c779deba0e898802dd341a1ba9c04c4e9abe8 ada3991349beb5880e3994f25c65a0cf68941b83 45698380295187b35f3872542b71efc2223f8201 55af9329429a30ce81f7ad01da95406e1d62f785 71c248ec1309381136bf74339d453a58b400b2a9 3ececb9efafd80058525571d77d881767de6f5b8 44611c4a6d16b0eeb1488e9557b6a11e45193a46 2df7ccfa1498f5038b15acd50bc9277ad768dcbf
```
NixOS#271078 caused the configurePhase of pcl to fail when withCuda is set to true. Fix NixOS#275090 by replacing cudatoolkit with cudaPackages.cuda_nvcc.
Description of changes
This is an attempt to fix the build failures in the CUDA variants of torch/opencv4 CMake consumers. Basically, this lets one avoid ad hoc changes like this: https://github.com/NixOS/nixpkgs/pull/269639/files#diff-dcb82f7bc26e70a69ecb21e6801c8f2c32dbd91847e2b2db9b4fbb986f412abc.
- The approach is to propagate the build dependencies (which we used to do), but to do so in a separate output, so as to avoid the CUDA fat leaking into python `withPackages` environments
- In addition to propagating libraries like `cuda_cudart`, we also have to propagate the `setupCudaHook`, because somebody's got to set `-DCUDAToolkit_ROOT` &c
- I tested this on `cctag`, which had failed at `find_package(opencv)` in #269639
- I also realized we need to remove the hook's dependency on cudart and especially on the compiler: when we consume a cross-compiled `opencv4` in a native build, the propagated compiler is going to be useless, because we couldn't have known about the native build when compiling opencv4...
- I haven't yet figured out what happens if a package has both `buildInputs = [ opencv4 ]` and `nativeBuildInputs = [ cuda_nvcc ]`. Both are going to propagate the hook, so the hook might run twice, and then it might add the env hooks, preConfigureHooks, and postFixupHooks twice. I need some sort of `#pragma once`
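The "#pragma once" asked for above can be approximated with a guard variable in the hook itself. A minimal sketch (the guard and function names are assumptions for illustration, not actual nixpkgs code):

```shell
# Idempotence guard for a setup hook: if both buildInputs and nativeBuildInputs
# propagate the hook, its body (registering env hooks, preConfigureHooks,
# postFixupHooks) must still run only once.
hookRuns=0
setupCudaHookOnce() {
    # Second and later invocations are no-ops.
    if [ -n "${_setupCudaHookRan:-}" ]; then
        return 0
    fi
    _setupCudaHookRan=1
    hookRuns=$((hookRuns + 1))   # stand-in for the real registration work
}

setupCudaHookOnce   # e.g. pulled in via buildInputs = [ opencv4 ]
setupCudaHookOnce   # e.g. pulled in via nativeBuildInputs = [ cuda_nvcc ]
echo "hook body executed ${hookRuns} time(s)"   # prints "hook body executed 1 time(s)"
```

Since setup hooks are sourced into one shell per build, a plain environment variable is enough to make the second sourcing a no-op.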
@NixOS/cuda-maintainers
Things done

- `nix.conf`? (See Nix manual)
- `sandbox = relaxed`
- `sandbox = true`
- `nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`. Note: all changes have to be committed, also see nixpkgs-review usage
- `./result/bin/`

Priorities

Add a 👍 reaction to pull requests you find important.