refactor(cust_raw): consolidate CUDA, cuDNN, OptiX bindgen and remove find_cuda_helper #181

Contributor

@adamcavendish adamcavendish commented Apr 1, 2025

  1. Consolidation of bindgen related "*-sys" packages
    • Remove the common dependency of find_cuda_helper. Use the cargo
      metadata mechanism instead.
    • Merged all CUDA bindgen-generated code into the cust_raw crate for
      simplicity and maintainability.
    • Add CUDA Runtime API bindgen support.
  2. cuDNN and OptiX Integration
    • Split cudnn into cudnn (high-level API) and cudnn-sys (low-level
      bindgens) for better abstraction.
    • Split optix into optix (high-level API) and optix-sys (low-level
      bindgens) for better abstraction.
  3. CUDA 12+ Support
    • Updated cust to support CUDA versions >= 12.
    • Added compatibility for CUDA 12.3+ graph API changes:
      • Renamed cuGraphKernelNodeGetParams →
        cuGraphKernelNodeGetParams_v2.
      • Enabled conditional node support for CUDA >= 12.3.
  4. Temporarily disable cuDNN in CI
    • Windows CI pipelines have no cuDNN support yet.

Co-authored-by: Adam Basfop Cavendish <GetbetterABC@yeah.net>
Co-authored-by: Jorge Ortega <jorge-ortega@outlook.com>


This PR addresses #166; please refer to that issue for background and further discussion.
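
For readers unfamiliar with the cargo metadata mechanism mentioned in item 1, here is a minimal sketch of the producer side. It assumes a hypothetical `-sys` crate whose `Cargo.toml` declares `links = "cuda"`. The key names (`version`, `include`), the `CUDA_PATH` fallback, and the hard-coded version are illustrative assumptions, not what this PR actually emits.

```rust
// build.rs of a hypothetical `-sys` crate with `links = "cuda"` in Cargo.toml.
use std::env;
use std::path::Path;

/// Build the `cargo:KEY=VALUE` lines that Cargo forwards to the build scripts
/// of direct dependents as `DEP_CUDA_<KEY>` environment variables.
/// The key names here are assumptions chosen for illustration.
fn metadata_lines(cuda_version: u32, include_dir: &Path) -> Vec<String> {
    vec![
        format!("cargo:version={}", cuda_version),
        format!("cargo:include={}", include_dir.display()),
    ]
}

fn main() {
    // Assumption: the SDK root comes from CUDA_PATH, with a common Linux default.
    let root = env::var("CUDA_PATH").unwrap_or_else(|_| "/usr/local/cuda".to_string());
    let include_dir = Path::new(&root).join("include");

    // Placeholder value: a real script would read CUDA_VERSION from the SDK headers.
    let cuda_version = 12080;

    for line in metadata_lines(cuda_version, &include_dir) {
        println!("{line}");
    }
}
```

With this in place, a crate that depends directly on the `-sys` crate can read `DEP_CUDA_VERSION` and `DEP_CUDA_INCLUDE` in its own build script instead of locating the SDK again through a shared helper crate.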

@adamcavendish adamcavendish requested a review from frjnn as a code owner April 1, 2025 12:29
@adamcavendish adamcavendish changed the title refactor(cust_raw): consolidate CUDA, OptiX bindgen and remove find_c… refactor(cust_raw): consolidate CUDA, OptiX bindgen and remove find_cuda_helper Apr 1, 2025
@adamcavendish
Contributor Author

adamcavendish commented Apr 1, 2025

@jorge-ortega @LegNeato The PR builds in an Ubuntu Linux 24.04 NVIDIA container with CUDA 12.8.0, cuDNN 9, and OptiX 7.3.0, including all examples. I'm putting it up for review now, as I may add support for more CUDA versions in the coming days if we still want to embed the generated code in our codebase.

If we are sure we do not want to add generated code, I'll drop the version features and make sure the bindings are built on the host.

Note:

  1. I don't have graphics support on my server, so it would be better for someone else to run the OptiX code if possible.
  2. CI mostly fails on rustfmt for the generated code. I also wonder whether we should format the generated code at all.

@adamcavendish adamcavendish force-pushed the feature/refactor-cust-raw-and-find-cuda branch from 6c43b45 to 224bf2c Compare April 1, 2025 12:53
@jorge-ortega
Collaborator

jorge-ortega commented Apr 1, 2025

Seeing how infectious specifying the version through feature flags is, I'd rather we go with host-generated bindings all the way. It'd be easier for the ecosystem to evolve under the assumption that you bring your own SDK to build against and check for a supported version through cargo metadata, instead of relying on users to set the same version feature flags across all crates that might use cust or the bindings directly.
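
To make the "check for a supported version through cargo metadata" idea concrete, here is a rough sketch of what a downstream build script could do. It assumes the bindings crate publishes the SDK version under a hypothetical `DEP_CUDA_VERSION` variable, using CUDA's integer encoding (`major * 1000 + minor * 10`, as in the `CUDA_VERSION` macro). The 12.0 floor is likewise an assumption for illustration, not a decision made in this thread.

```rust
// build.rs of a hypothetical crate that depends directly on the bindings crate.
use std::env;

/// Decode CUDA's integer version encoding (major * 1000 + minor * 10),
/// e.g. "12080" becomes (12, 8).
fn decode_cuda_version(raw: &str) -> Option<(u32, u32)> {
    let value: u32 = raw.trim().parse().ok()?;
    Some((value / 1000, (value % 1000) / 10))
}

/// Assumed support floor for this sketch: CUDA 12.0 and up.
fn is_supported(version: (u32, u32)) -> bool {
    version >= (12, 0)
}

fn main() {
    // DEP_CUDA_VERSION is a hypothetical name; Cargo derives it from the
    // `links` value and the metadata key chosen by the bindings crate.
    match env::var("DEP_CUDA_VERSION") {
        Ok(raw) => match decode_cuda_version(&raw) {
            Some(v) if is_supported(v) => {
                println!("cargo:warning=building against CUDA {}.{}", v.0, v.1)
            }
            Some(v) => panic!("CUDA {}.{} is older than the assumed minimum 12.0", v.0, v.1),
            None => panic!("could not parse DEP_CUDA_VERSION={raw}"),
        },
        Err(_) => println!("cargo:warning=DEP_CUDA_VERSION is not set"),
    }
}
```

The point of the design is that the version comes from whatever SDK is installed on the build host, so no crate in the dependency graph has to agree on a version feature flag.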

@LegNeato
Contributor

LegNeato commented Apr 1, 2025

Now that I see it I agree.

@adamcavendish adamcavendish force-pushed the feature/refactor-cust-raw-and-find-cuda branch 2 times, most recently from fabc50f to bb8e882 Compare April 2, 2025 06:11
@jorge-ortega
Collaborator

We might have to exclude cudnn from CI for the time being until we get the SDK installed on workers.

@adamcavendish adamcavendish force-pushed the feature/refactor-cust-raw-and-find-cuda branch from bb8e882 to c931644 Compare April 2, 2025 06:50
@adamcavendish adamcavendish changed the title refactor(cust_raw): consolidate CUDA, OptiX bindgen and remove find_cuda_helper refactor(cust_raw): consolidate CUDA, cuDNN, OptiX bindgen and remove find_cuda_helper Apr 2, 2025
@adamcavendish adamcavendish force-pushed the feature/refactor-cust-raw-and-find-cuda branch 3 times, most recently from d0627ac to 7d129ce Compare April 2, 2025 07:12
@adamcavendish adamcavendish force-pushed the feature/refactor-cust-raw-and-find-cuda branch from 7d129ce to 3cccdbb Compare April 2, 2025 07:32
@jorge-ortega
Collaborator

Now comes the fun part of cfg gating in order to support an old CUDA version like v11.2 😆
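
As a rough illustration of what that cfg gating could look like, a build script can turn the detected SDK version into `cargo:rustc-cfg` flags, and the crate then guards version-specific items with `#[cfg(...)]`. The flag names and the hard-coded version below are hypothetical and are not taken from this PR.

```rust
// build.rs sketch: map an SDK version to hypothetical cfg flags.

/// Return the cfg names to enable for a given CUDA version.
/// The names are illustrative assumptions.
fn cfg_flags(major: u32, minor: u32) -> Vec<&'static str> {
    let mut flags = Vec::new();
    if (major, minor) >= (12, 0) {
        flags.push("cuda_12_0_plus");
    }
    if (major, minor) >= (12, 3) {
        // CUDA 12.3 is where this PR enables conditional graph nodes.
        flags.push("cuda_12_3_plus");
    }
    flags
}

fn main() {
    // Placeholder: a real script would use the version detected on the host.
    for flag in cfg_flags(12, 8) {
        println!("cargo:rustc-cfg={flag}");
    }
}

// In the library, items that only exist on newer SDKs would then be gated:
//
//     #[cfg(cuda_12_3_plus)]
//     pub mod conditional_nodes { /* ... */ }
```

Supporting an older SDK such as v11.2 would mean adding flags (or their absence) for every API that changed in between, which is why the set of supported versions matters.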

@adamcavendish
Contributor Author

> Now comes the fun part of cfg gating in order to support an old CUDA version like v11.2 😆

Yeah, I'm looking into this. Do we have a plan on what versions we would like to support and how long?

@jorge-ortega
Collaborator

CUDA v12 builds pass 🎉 Just clippy and rustdoc issues left to address.

> Yeah, I'm looking into this. Do we have a plan on what versions we would like to support and how long?

Nothing official. @LegNeato should we drop 11.2 so we can focus the reboot on 12.0 and up until we figure out what cust will officially support moving forward?

@adamcavendish adamcavendish force-pushed the feature/refactor-cust-raw-and-find-cuda branch from 3cccdbb to d7d0b15 Compare April 2, 2025 08:23
@jorge-ortega
Collaborator

jorge-ortega commented Apr 2, 2025

Issues like this I can address with the changes in my workspace, which handle macro function renaming. Since v12 builds, I'm good to merge these.

@LegNeato
Contributor

LegNeato commented Apr 2, 2025

@jorge-ortega I think you should be able to merge yourself; let me know if you cannot!

@LegNeato
Contributor

LegNeato commented Apr 2, 2025

I filed #185, which will get us cudnn and remove some future maintenance burden.

@jorge-ortega jorge-ortega merged commit b85d9ca into Rust-GPU:main Apr 2, 2025
2 of 4 checks passed
@adamcavendish adamcavendish deleted the feature/refactor-cust-raw-and-find-cuda branch April 3, 2025 02:07