Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make cuDNN/cuTENSOR/NCCL optional #109

Closed
pentschev opened this issue Mar 9, 2021 · 15 comments · Fixed by #113
Closed

Make cuDNN/cuTENSOR/NCCL optional #109

pentschev opened this issue Mar 9, 2021 · 15 comments · Fixed by #113
Assignees

Comments

@pentschev
Copy link

Currently CuPy lists cuDNN as a requirement, but given cuDNN is large in size it's possibly a good idea to make it optional instead. A concrete use case is RAPIDS, it depends on CuPy but doesn't use cuDNN, which forces the inclusion of that library in official Docker containers, etc.

@leofang mentioned there may be a few things that need to be changed. Could we discuss those here? Do we need those changes in the CuPy repo or do they need to be here in the feedstock?

cc @kkraus14 @kmaehashi @jakirkham

@jakirkham
Copy link
Member

Wee need changes in both ( cupy/cupy#4850 (comment) ). The CuPy bits can be discussed in that issue. The Conda bits are easy to handle once the CuPy bit are done (and in a release)

@leofang
Copy link
Member

leofang commented Mar 10, 2021

Thanks, @pentschev!

@kkraus14 @jakirkham I have a question related to this need. I'd also like to make cuTENSOR optional --- which currently is not, as I expanded the build matrix so that we build CuPy without cuTENSOR support (for CUDA 9.2 - 11.2) in cos6 images, and we build it with cuTENSOR (for CUDA 10.1+) in cos7 images. This is a bit too much. In order to make it optional (and reduce the build matrix size), I say we kill the duplicate builds in the former (cos6 + CUDA 10.1+). WDYT? Would this have any real impact to your user base?

@jakirkham
Copy link
Member

Does CuPy link to cuTENSOR at build time?

@leofang
Copy link
Member

leofang commented Mar 10, 2021

Yes, there're two Cython modules for supporting cuTENSOR.

@leofang
Copy link
Member

leofang commented Mar 10, 2021

Yes, there're two Cython modules for supporting cuTENSOR.

Likewise for cuDNN and NCCL.

@jakirkham
Copy link
Member

What happens if a user tries to load those modules if they were linked to those libraries, but the libraries were not present? A library load error? Or does CuPy have some way to insulate the user from that?

@leofang
Copy link
Member

leofang commented Mar 10, 2021

Python raises ImportError if the symbols in a module cannot be fully resolved (undefined references) by the linker. In the case of cuDNN, CuPy has a try-except to catch this:
https://github.com/cupy/cupy/blob/v9.0.0b3/cupy/cuda/cudnn.py#L9-L14
I think this detection can be easily done in both cuTENSOR and NCCL.

@jakirkham
Copy link
Member

Sounds good. I think if we have that in place for cuTENSOR and NCCL, it will be easy to make those optional here in the same way

@jakirkham
Copy link
Member

jakirkham commented Mar 10, 2021

Also ok with dropping CentOS 6 (assuming I understood your suggestion above correctly)

RAPIDS is CentOS 7+ only

@leofang
Copy link
Member

leofang commented Mar 10, 2021

I raised the question on version pinning in the CuPy counterpart issue: cupy/cupy#4850 (comment). CuPy's library metadata pins at different versions from the latest ones we have on CF, so it needs to be addressed too.

@jakirkham
Copy link
Member

Once that is done, I think we can add all of these to ignore_run_exports and run_constrained that way we still build against them, but not ship them by default. Users can then optionally install these extra pieces as they need them

@leofang leofang changed the title Make cuDNN optional Make cuDNN/cuTENSOR/NCCL optional Mar 12, 2021
@leofang
Copy link
Member

leofang commented Mar 12, 2021

In case it wasn't clear, the work plan is as follows:

  1. Get the PR Support optional dependencies from Conda-Forge cupy/cupy#4873 in so that we can give instructions to CF users at runtime for installing missing libraries.
  2. Generate a json file in the recipe here (likely when the recipe is updated next time, say for v8.6.0/v9.0.0rc1) to enable the warning mechanism. Also use the tricks John mentioned above to do the actual opt-out job.

@leofang
Copy link
Member

leofang commented Mar 12, 2021

(btw @pentschev I changed the issue title to reflect the new plan.)

@leofang leofang self-assigned this Mar 13, 2021
This was referenced Mar 23, 2021
@leofang
Copy link
Member

leofang commented Mar 27, 2021

@pentschev this is done starting v9.0.0rc1; v8.6 still has to carry the deps...

@pentschev
Copy link
Author

Thanks so much @leofang , you're awesome! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants