Cuda12 rebuild #181

Closed
wants to merge 6 commits into from

RaulPPelaez
Contributor

@RaulPPelaez RaulPPelaez commented Aug 16, 2023

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

This is my attempt at building for CUDA 12; it's just a continuation of #171.
It builds locally on my machine, but fails during testing because it cannot find the mkl .so:

import: 'torch'
Traceback (most recent call last):
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1692196994653/test_tmp/run_test.py", line 2, in <module>
    import torch
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1692196994653/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.11/site-packages/torch/__init__.py", line 228, in <module>
    _load_global_deps()
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1692196994653/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.11/site-packages/torch/__init__.py", line 187, in _load_global_deps
    raise err
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1692196994653/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.11/site-packages/torch/__init__.py", line 168, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/home/conda/feedstock_root/build_artifacts/pytorch-recipe_1692196994653/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_p/lib/python3.11/ctypes/__init__.py", line 376, in __init__
    self._handle = _dlopen(self._name, mode)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: libmkl_intel_lp64.so.1: cannot open shared object file: No such file or directory
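
The failure above comes from `ctypes.CDLL` being unable to resolve a shared library at import time. As a rough diagnostic sketch (not part of the recipe or of PyTorch), the same check could be reproduced in the test environment without importing `torch`. The helper name `can_load` is hypothetical, and the library name is simply the one from the traceback:

```python
import ctypes
from typing import Tuple


def can_load(lib_name: str) -> Tuple[bool, str]:
    """Try to load a shared library the way the traceback does.

    Returns (True, "") on success, or (False, <loader error message>)
    if the dynamic loader cannot find or open the library.
    """
    try:
        ctypes.CDLL(lib_name, mode=ctypes.RTLD_GLOBAL)
    except OSError as err:
        return False, str(err)
    return True, ""


if __name__ == "__main__":
    # Library name taken from the OSError in the traceback above.
    name = "libmkl_intel_lp64.so.1"
    ok, msg = can_load(name)
    print(f"{name}: {'loadable' if ok else 'NOT loadable: ' + msg}")
```

Running this inside the test environment should show whether the mkl runtime library is actually installed there, independently of how `torch` loads its dependencies.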

regro-cf-autotick-bot and others added 5 commits June 6, 2023 19:46
The transition to CUDA 12 SDK includes new packages for all CUDA libraries and
build tools. Notably, the cudatoolkit package no longer exists, and packages
should depend directly on the specific CUDA libraries (libcublas, libcusolver,
etc) as needed. For an in-depth overview of the changes and to report problems,
see this issue: conda-forge/conda-forge.github.io#1963.
Please feel free to raise any issues encountered there. Thank you! 🙏
Add use_mkldnn
Add CFLAGS to silence gcc-12 false positive error
@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I was trying to look for recipes to lint for you, but it appears we have a merge conflict.
Please try to merge or rebase with the base branch to resolve this conflict.

Please ping the 'conda-forge/core' team (using the @ notation in a comment) if you believe this is a bug.

@jakirkham
Member

@conda-forge-admin, please re-render

@github-actions

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/pytorch-cpu-feedstock/actions/runs/5882478037.

recipe/meta.yaml Outdated
Comment on lines 82 to 84
- mkl==2021.4.0 # [x86]
- mkl-include==2021.4.0 # [x86]
- mkl-devel==2021.4.0 # [x86]
Member

Maybe these pins could be moved to conda_build_config.yaml. Then, when resolving conflicts with upstream, we could simply prefer upstream's changes (and then re-add the -devel part). IIUC, -include comes with -devel.

blas_impl:
- mkl # [x86 or x86_64]
- generic

Member

They're also really old (>2 years). Do we really need to pin mkl that tightly?

Member

Yeah, I am wondering the same. It also deviates from the current global pinnings.

Maybe this was needed to get the build working, in which case there probably needs to be some investigation into how to bump the versions (and fix whatever issue occurs when using more recent versions).

Contributor Author

See this comment for the rationale: #172 (comment)

@h-vetinari
Member

This is my attempt at building for CUDA 12; it's just a continuation of #171.
It builds locally on my machine, but fails during testing because it cannot find the mkl .so:

Thanks for picking this up @RaulPPelaez!

@h-vetinari
Member

The PR will also have to be rebased, or at least merged with main.

@RaulPPelaez
Contributor Author

I am having some success building locally. However, I am having trouble merging this with main: a lot of conflicts arise in files automatically generated by the bot, and I have no idea how to resolve them.
I would like to somehow reopen this PR starting from the current main, but I do not know how to instruct conda-smithy to trigger the CUDA 12 migration there.
Could someone help, please?

@RaulPPelaez RaulPPelaez mentioned this pull request Aug 17, 2023
5 tasks
@RaulPPelaez
Contributor Author

OK, I think I got it. I am closing this and continuing in #182.


4 participants