Closed
Currently, CUDA Python has a Cython-based reimplementation of the CUDA runtime (cudart) on top of the CUDA driver APIs. Maintaining it is a significant effort that costs a lot of engineering time, because we have to catch up with every new cudart API indefinitely.
This RFC reflects the team's plan to switch to statically linking against cudart on all platforms (Linux & Windows) instead, starting with the next CUDA major release. That is, cuda-python X.Y.Z will statically link against `libcudart_static.a` on Linux (or `cudart_static.lib` on Windows) from CUDA Toolkit X.Y.Z.
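To make the per-platform library choice concrete, here is a minimal sketch of how a build script might locate the static cudart library. The helper name `static_cudart_path` is hypothetical, and the `lib64` and `lib/x64` subdirectories assume the default CUDA Toolkit install layout; this is not the actual cuda-python build logic.

```python
from pathlib import PurePosixPath, PureWindowsPath


def static_cudart_path(cuda_home, platform):
    """Return the expected path of the static cudart library.

    Hypothetical helper. `platform` follows sys.platform conventions
    ("linux", "win32"); the subdirectories assume the default
    CUDA Toolkit layout.
    """
    if platform.startswith("linux"):
        return PurePosixPath(cuda_home) / "lib64" / "libcudart_static.a"
    if platform == "win32":
        return PureWindowsPath(cuda_home) / "lib" / "x64" / "cudart_static.lib"
    raise ValueError(f"unsupported platform: {platform!r}")
```

A build script could pass `sys.platform` and the toolkit root (for example, the value of `CUDA_HOME` or `CUDA_PATH`) and hand the result to the linker as an extra object.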
By doing this, the benefits include:
- We can remove tens of thousands of lines of Cython code
- We can stop chasing the latest cudart APIs indefinitely
- We can avoid the risk of our reimplementation diverging from cudart's behavior, by directly using the "official" cudart implementation
- We can significantly reduce the build time of cuda-python wheels and conda packages
- We continue offering CUDA minor version compatibility as well as all other functionalities and expectations that the current implementation offers, e.g.
- We align with the expectation of normal CUDA applications compiled by NVCC that cudart is statically linked into the executable
- Additional context: cuda.cudart.cudaRuntimeGetVersion() hard-codes the runtime version, rather than querying the runtime #16 (comment)
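For readers who want to check what runtime and driver versions they get, the following is a rough sketch. It relies on CUDA's integer version encoding (1000 * major + 10 * minor) and on the `cuda.cudart` module named above; the helpers `decode_cuda_version`, `same_major`, and `query_versions` are hypothetical names, and the same-major check is only a simplified stand-in for the real minor version compatibility rules.

```python
from typing import Optional, Tuple


def decode_cuda_version(version: int) -> Tuple[int, int]:
    """Split CUDA's integer encoding (1000 * major + 10 * minor)."""
    major, rest = divmod(version, 1000)
    return major, rest // 10


def same_major(runtime_version: int, driver_version: int) -> bool:
    """Simplified check: runtime and driver share a CUDA major version."""
    return decode_cuda_version(runtime_version)[0] == decode_cuda_version(driver_version)[0]


def query_versions() -> Optional[Tuple[int, int]]:
    """Return (runtime, driver) versions, or None if they cannot be queried.

    Assumes cuda-python exposes `cuda.cudart`, where each call returns
    a tuple of (error code, value).
    """
    try:
        from cuda import cudart
    except ImportError:
        return None
    err_rt, runtime_version = cudart.cudaRuntimeGetVersion()
    err_drv, driver_version = cudart.cudaDriverGetVersion()
    ok = cudart.cudaError_t.cudaSuccess
    if err_rt != ok or err_drv != ok:
        return None
    return runtime_version, driver_version
```

For example, `decode_cuda_version(11080)` yields `(11, 8)`, and `same_major(11080, 11020)` is true because both values decode to CUDA 11.x.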
We do not anticipate any issues or user-visible effects, but please let us know if this could be a concern for your project. Thanks!