Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserving libtorch_python in package #246

Closed
wants to merge 10 commits into from

Conversation

jeongseok-meta
Copy link
Contributor

@jeongseok-meta jeongseok-meta commented Jul 9, 2024

Our project requires linking to libtorch_python. Currently, the shared library is not included in the official TorchConfig.cmake, so we use find_library() to locate it. However, this method fails to find libtorch_python, although it successfully locates other symlinked libraries in the same directory (e.g., libtorch_cpu or libtorch_global_deps).

+ ls -l /home/conda/feedstock_root/build_artifacts/momentum_1720498881696/_build_env/lib/python3.12/site-packages/torch/lib
total 22012
lrwxrwxrwx  1 conda conda       21 Jul  9 04:23 libc10.so -> ../../../../libc10.so
lrwxrwxrwx  1 conda conda       21 Jul  9 04:23 libshm.so -> ../../../../libshm.so
lrwxrwxrwx  1 conda conda       27 Jul  9 04:23 libtorch_cpu.so -> ../../../../libtorch_cpu.so
lrwxrwxrwx  1 conda conda       35 Jul  9 04:23 libtorch_global_deps.so -> ../../../../libtorch_global_deps.so
-rwxr-xr-x 49 conda conda 22537640 Jun 17 01:29 libtorch_python.so
lrwxrwxrwx  1 conda conda       23 Jul  9 04:23 libtorch.so -> ../../../../libtorch.so

This PR aims to prevent the removal of libtorch_python by this line. Instead, it moves the library to PREFIX/lib and creates a symbolic link in the torch lib folder, similar to other libraries, hoping this change resolves the issue with find_library() not finding the shared library.

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-webservices
Copy link
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@jeongseok-meta
Copy link
Contributor Author

@conda-forge-admin, please rerender

Copy link

github-actions bot commented Jul 9, 2024

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/pytorch-cpu-feedstock/actions/runs/9851406294.

recipe/build.sh Outdated
@@ -218,7 +218,6 @@ if [[ "$PKG_NAME" == "libtorch" ]]; then
for f in ATen caffe2 tensorpipe torch c10; do
mv torch/include/$f ${PREFIX}/include/$f
done
rm ${PREFIX}/lib/libtorch_python.*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work AFAIU; libtorch_python.so depends on a specific python version, while the conda package libtorch intentionally does not depend on python at all.

If we want to ship libtorch_python.so, we should most likely package it in the pytorch output (or perhaps create a separate output).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing out that libtorch should not depend on Python. Moving it or creating a separate output seems like a viable solution.

I am relatively new to the build script of this feedstock and find it more complex than other packages I've worked with. Could you please provide some guidance on how to implement your suggestion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please provide some guidance on how to implement your suggestion?

As a first approach, it should work to "install" libtorch_python.so in build_pytorch.sh, by copying from the build cache (that's created & populated during the execution of build.sh) into $PREFIX/lib.

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Jul 9, 2024

you are asking for for the python specific library, you should be able to replace torch_python with torch in your setup script.

@h-vetinari
Copy link
Member

you are asking for for the python specific library, you should be able to replace torch_python with torch in your setup script.

We genuinely don't seem to package libtorch_python.so, also not in pytorch itself (which is empty).

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Jul 9, 2024

~/miniforge3/pkgs
07:25 $ find -name libtorch_python.so
./pytorch-2.3.1-cpu_mkl_py312h3b258cc_100/lib/python3.12/site-packages/torch/lib/libtorch_python.so
./pytorch-2.3.1-cpu_mkl_py310h75865b9_100/lib/python3.10/site-packages/torch/lib/libtorch_python.so
./pytorch-2.3.1-cpu_generic_py310ha4c588e_0/lib/python3.10/site-packages/torch/lib/libtorch_python.so

in a typical workflow, people often are not linking to it, so i guess we left it in lib so it could be python specific.

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Jul 9, 2024

And a computer that has cuda120

~/miniforge3/pkgs
$ find -name libtorch_python.so
./pytorch-2.3.1-cuda120_py310h2c91c31_300/lib/python3.10/site-packages/torch/lib/libtorch_python.so

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Jul 9, 2024

We genuinely don't seem to package libtorch_python.so, also not in pytorch itself (which is empty).

I guess that stremlit should be taken with a grain of salt, in this case a 30MB package can't be empty ;)

@h-vetinari
Copy link
Member

I guess that streamlit should be taken with a grain of salt, in this case a 30MB package can't be empty ;)

mr-bean-realization

@h-vetinari
Copy link
Member

h-vetinari commented Jul 10, 2024

OK, my view of the world is restored (thanks @jaimergp 🙏): the files for pytorch now show up in streamlit, and in particular, libtorch_python.so can be found in $SP_DIR/torch/lib.

It would be within the realm of possibility to move them to $PREFIX/lib and then play around with patchelf to correct the path in the binaries. Here's an example where ray does something similar. Not sure if that's worth it though. 🤷

@hmaarrfk
Copy link
Contributor

@jeongseok-meta can you comment as to whether or not you are able to link directly to the non-python libraries?

it also seems that we do include the TorchConfig.cmake file

./libtorch-2.3.1-cuda120_h2b0da52_300/share/cmake/Torch/TorchConfig.cmake

are you using conda-forge cmake or an other?

@jeongseok-meta
Copy link
Contributor Author

@hmaarrfk Yes, I can link to other libraries by using find_package(Torch CONFIG REQUIRED) and I am using cmake from conda-forge:

https://github.com/facebookincubator/momentum/blob/0d60cde3a5654c7b21599eefd1cace99da4edb45/pymomentum/CMakeLists.txt#L11

However, the problem is that I want to link to libtorch_python, which is not included in the TorchConfig.cmake.

By the way, thank you for the other comments. I will be AFK for the next few days, but I will catch up when I return.

@hmaarrfk
Copy link
Contributor

@jeongseok-meta thanks for giving more context.

In my experience there is often little value in linking to the "Python library" when a "c" library exists.
I can't speak to your exact usecase, but my hunch is that linking to python will give little benefit.

Its also unclear what those at pytorch meant to have as public vs private.

I would consider what you need from the python side, and maybe just avoid using it for maintainability.

One thing that would be helpful for us to understand is, does this work from packages from the pytorch channel? While we try to do it the "conda-forge way" we do try to follow upstream "intentions". So if it works with the pytorch + default channel, then it would give us one extra push to try to make it work here (but again, seems like a very niche usecase, so we would need your help to do it well -- some good suggestions came from h-vetinari)

@isuruf
Copy link
Member

isuruf commented Aug 2, 2024

@jeongseok-meta, this is an issue with our split build. I think the best solution is to move $PREFIX/lib/python$PY_VER/site-packages/torch/lib/libtorch_python.so to $PREFIX/lib and make a symlink back at site-packages.

@conda-forge-webservices
Copy link
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

@jeongseok-meta
Copy link
Contributor Author

@hmaarrfk Thank you for sharing your thoughts. Specifically, we need to use python_ivalue.h where the definitions are included in libtorch_python.so for Python binding with PyTorch interop. Unfortunately, we don't have a clear path to avoid depending on the API at the moment, and it's also unclear if PyTorch intended for it to be public or private as well.

One thing that would be helpful for us to understand is, does this work from packages from the pytorch channel?

It works with packages from the PyTorch channel, as we can build and import the Python binding linking to libtorch_python.so without errors locally (pixi.toml of our project)

@jeongseok-meta
Copy link
Contributor Author

Hi @isuruf, your solution looks promising! However, I noticed some build failures in some jobs. Do you have any ideas on how to address them?

@hmaarrfk
Copy link
Contributor

@conda-forge-admin please rerender

@hmaarrfk
Copy link
Contributor

i thought i had access to the GPU runners... maybe they are down.

the package fails to build in docker locally for me.

@hmaarrfk
Copy link
Contributor

i think perhaps a new flag was added in gcc 12.4 or something.
conda-forge/ctng-compiler-activation-feedstock#121

perhaps we can try to set the upper bound to 12.3???

@h-vetinari
Copy link
Member

the package fails to build in docker locally for me.

Can you post an error log?

i think perhaps a new flag was added in gcc 12.4 or something.

The only thing that changed in our packaging was the way how meson sets a release flag, and we specifically didn't apply that change to 12.4, only 13.x and up.

@hmaarrfk
Copy link
Contributor

Here is the log, had to pull it out of my tmux
log.txt

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Aug 21, 2024

I think some compilation issues might stem from libmagma having been updated to 2.8 from 2.7.2: conda-forge/libmagma-feedstock#18

That said, i'm not too sure.

2.4.0 seems to build fine with libmagma 2.8.0 https://github.com/conda-forge/pytorch-cpu-feedstock/actions/runs/10477101071/job/29017474679?pr=250#step:3:734

conda-forge/libmagma-feedstock#21

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants