-
Notifications
You must be signed in to change notification settings - Fork 85
Use variants to produce separate builds with and without cufile support #754
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use variants to produce separate builds with and without cufile support #754
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test |
|
/ok to test |
|
/ok to test |
|
/ok to test |
|
/ok to test |
jakirkham
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Vyas! 🙏
This looks generally reasonable
Had a few comments below
ci/build_cpp.sh
Outdated
|
|
||
| # Use sed to replace the RAPIDS_CUDA_VERSION string in the variants file with | ||
| # the value of that variable in this shell. | ||
| sed -i "s/RAPIDS_CUDA_VERSION/${RAPIDS_CUDA_VERSION}/g" "conda/recipes/libkvikio/variants_$(arch).yaml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that has worked well before, is to output this to a variant file
It does a similar thing admittedly. Though it can be a bit clearer about what is happening. That could just be personal preference though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that alternative! I agree it is clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered this possibility as well. However, in this instance it's not quite so simple as the pynvjitlink example because we need other things to be encoded in the variants as well. The behavior on x86 and arm is different because the former is a single variant while the latter requires two, so we would need to write out variants conditionally based on that. Moreover, for arm we need to use a different compiler version for each CUDA version, so it's a longer file. If you both prefer that I can switch to it, we just wind up with two fairly long here-docs instead of this command. The flip side is that we can drop the committed variants files if we choose to go that route.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made the changes in 00b642a. If you like it we can stick with it, otherwise feel free to revert that commit.
| #cuda_version: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[:2] | join(".") }} | ||
| #cuda_major: '${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[0] }}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we drop these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose to keep them to simplify reverting the changes, but if others feel like it would be cleaner to drop these I am fine with that.
| #else | ||
| constexpr int cufile_version() noexcept { return 0; } | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is necessary because the dead code optimizer will generally elide constexpr functions from the final binary if no runtime usage is detected, but we need the symbol in the DSO for usage from other libraries (including Python extensions). For the same reason we also avoid inlining the definition here.
|
/ok to test |
| # Construct the extra variants according to the architecture | ||
| if [[ "$(arch)" == "x86_64" ]]; then | ||
| cat > variants.yaml << EOF | ||
| c_compiler_version: | ||
| - 13 | ||
| cxx_compiler_version: | ||
| - 13 | ||
| cuda_version: | ||
| - ${RAPIDS_CUDA_VERSION} | ||
| EOF | ||
| else | ||
| cat > variants.yaml << EOF | ||
| zip_keys: | ||
| - [c_compiler_version, cxx_compiler_version, cuda_version] | ||
| c_compiler_version: | ||
| - 12 | ||
| - 13 | ||
| cxx_compiler_version: | ||
| - 12 | ||
| - 13 | ||
| cuda_version: | ||
| - 12.1 # The last version to not support cufile | ||
| - ${RAPIDS_CUDA_VERSION} | ||
| EOF | ||
| fi | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking, but I think this would be better to encode in individual variant_$ARCH_$CUDA_VERSION.yaml files and then the branching logic here can be to select the appropriate variant file for the build at hand.
That way the various build variants are all in the conda recipe file and not hidden away in this bash script.
I'm fine with this being done in a follow-up to get CI unblocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol, I see that was your original approach. well, not worth going back and forth over it.
My general take is "repeat yourself" if it keeps things clean, and heredocs in a random bash script isn't worth the tradeoff to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I liked the old way better too. This version feels wrong in how much of the logic is moved out of the recipe itself. I think we should reconsider what approach we like best, but I'll merge as-is to get things unblocked for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Provided a simplification on the current approach in PR: #755
bdice
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
|
/merge |
Now that we have dropped support for CUDA 11 we no longer require the nvidia channel. With the changes in rapidsai/rapids-dask-dependency#85, RAPIDS now only uses released versions of dask, so we no longer need the dask channel either. This PR also removes the explicit cufile dependence in the kvikio conda packages, which should no longer be necessary now that we have variants of the libkvikio package for different CUDA versions handling this dependency (see #754). Contributes to rapidsai/build-planning#184 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - James Lamb (https://github.com/jameslamb) URL: #759
On arm cufile was not supported until CUDA 12.2, whereas support exists since 12.0 on x86 architectures. To properly reflect these dependencies, we need to build separate variants of cufile on arm for cuda versions before and after 12.2. This PR updates the recipe to support that.