How to install with GPU support via cuBLAS and CUDA #250
Comments
Closing the issue so it doesn't clog up the issue tracker, but it should still be searchable.
Great work @DavidBurela! We need to document that. Also, the number of threads should be set to the number of physical cores on the system. This is usually half the number of reported logical (hyperthreaded) cores, unless one is running in a VM, where the number of threads needs to be set to the number of virtual cores. @abetlen, should @DavidBurela's docs be put in the …
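A minimal sketch of that thread-count advice, assuming the llama-cpp-python `Llama` API; the model path is a placeholder, and halving `os.cpu_count()` is only a heuristic for physical cores on hyperthreaded machines:

```python
# Sketch: pin n_threads to the physical core count, per the advice above.
# The model path is a placeholder, and cpu_count()//2 is a heuristic
# (os.cpu_count() reports logical cores, so halve it outside a VM).
import os
from llama_cpp import Llama

physical_cores = max(1, (os.cpu_count() or 2) // 2)
llm = Llama(model_path="./models/model.gguf", n_threads=physical_cores)
```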
Just a note, if you are not using …

The Error

```
$ CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir
...
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.1.105")
-- cuBLAS found
-- The CUDA compiler identification is unknown
CMake Error at /tmp/pip-build-env-u5gx6grn/overlay/lib/python3.10/site-packages/cmake/data/share/cmake-3.26/Modules/CMakeDetermineCUDACompiler.cmake:603 (message):
  Failed to detect a default CUDA architecture.
...
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
```

The Fix

Add CUDA's bin folder to PATH:

```
export PATH="/usr/local/cuda/bin:$PATH"
```
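Before rebuilding, a quick way to confirm `nvcc` is actually discoverable; this is just a sketch, assuming the standard /usr/local/cuda install location:

```python
# Sketch: verify the CUDA compiler is on PATH before attempting the
# cuBLAS build (the /usr/local/cuda location is an assumption).
import shutil

nvcc = shutil.which("nvcc")
if nvcc is None:
    print('nvcc not found; try: export PATH="/usr/local/cuda/bin:$PATH"')
else:
    print(f"nvcc found at {nvcc}")
```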
Closing. Please reopen if necessary.
In my case, because I had a non-cuBLAS-enabled wheel hanging around, I had to force pip to rebuild using:

```
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --upgrade --no-cache-dir llama-cpp-python
```
@DavidBurela, this gives me …
For the installation and the solution that produced the result, see user jllllllllll's post: Problem to install llama-cpp-python on Windows 10 with GPU NVidia Support CUBlast, BLAS = 0 #721
I use this way:

```
conda create -n condaexample python=3.11  # enter later python version if needed
conda activate condaexample
# Full list at https://anaconda.org/nvidia/cuda-toolkit
conda install -c "nvidia/label/cuda-12.1.1" cuda-toolkit
```

But why?
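To confirm which toolkit the build will see once that environment is active, a small sketch (it assumes the condaexample env above is activated, so its `nvcc` is on PATH):

```python
# Sketch: print the CUDA toolkit version visible to the build.
# Assumes the "condaexample" env from above is currently activated.
import subprocess

result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
print(result.stdout)  # should report release 12.1 for the install above
```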
Submitting and closing, to help anyone else searching for how to solve this. Including my error message, as that is where I was stuck with no results found on the web.

I have also captured exact step-by-step instructions in this ReadMe: https://github.com/DavidBurela/edgellm#edgellm
Install CUDA toolkit
You need to ensure you have the CUDA toolkit installed, as you need `nvcc` etc. in your PATH to compile correctly when you install via:

```
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
```
Ensure you install the correct version of CUDA toolkit
When I installed with cuBLAS support and tried to run, I would get this error:

```
the provided PTX was compiled with an unsupported toolchain.
```

I was able to pin the root cause down to the installed CUDA Toolkit version being newer than what my GPU drivers supported.
Run `nvidia-smi` and note what version of CUDA is supported in the top right. Here my GPU drivers support 12.0, so I can install CUDA Toolkit 12.0.1.
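If you would rather script that check than eyeball the banner, something like this works (a sketch; it just shells out to `nvidia-smi` and assumes an NVIDIA driver is installed):

```python
# Sketch: print the nvidia-smi banner; the maximum CUDA version the
# driver supports appears in the top-right corner of the output.
import subprocess

banner = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
print(banner)
```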
Download & install the correct version
Direct download and install: https://developer.nvidia.com/cuda-toolkit-archive
Conda: If you are using Conda, you can also download it directly into your environment.
Enable in code
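A minimal sketch of this step, assuming llama-cpp-python's `n_gpu_layers` parameter on the `Llama` constructor; the model path and layer count below are placeholders:

```python
# Sketch: request GPU offload via n_gpu_layers (placeholder path/count;
# -1 offloads all layers in recent llama-cpp-python versions).
from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf", n_gpu_layers=32)
out = llm("Q: What is cuBLAS? A:", max_tokens=32)
print(out["choices"][0]["text"])
```

If the offload worked, the model-load log should mention layers being offloaded to the GPU.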