NVIDIA GPU and numpy #1979
After several hours of troubleshooting I finally managed to solve the issue. First of all, you have to install llama-cpp-python while forcing a numpy version below 2 (numpy<2):
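The original code block appears to have been lost in extraction. Based on the install command quoted later in this thread, a sketch of the fix is to append the numpy pin to the same force-reinstall invocation (treat the exact pin syntax as an assumption):

```shell
# Force-reinstall llama-cpp-python with CUDA (cuBLAS) enabled, while
# pinning numpy below 2.0 so extensions compiled against NumPy 1.x
# are not broken by an automatic upgrade to 2.x.
CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install \
  --force-reinstall --no-cache-dir llama-cpp-python "numpy<2"
```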
Ensure to:
Run Private GPT:
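The run command was also stripped here; assuming the standard Makefile workflow from the PrivateGPT documentation, it would look something like:

```shell
# Launch PrivateGPT with the local profile (standard Makefile target
# from the PrivateGPT docs; adjust the profile to your setup).
PGPT_PROFILES=local make run
```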
If this solves your problem, good, you're done. If you instead stumble upon another error about "CUDA error: out of memory" and "TOKENIZERS_PARALLELISM=(true | false)", make sure to set this variable to true:
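The export itself was lost in extraction; setting the HuggingFace tokenizers environment variable is a one-liner:

```shell
# TOKENIZERS_PARALLELISM controls the HuggingFace tokenizers fork
# warning; the author of this fix sets it to true before rerunning.
export TOKENIZERS_PARALLELISM=true
```

You can also set it inline for a single run instead of exporting it for the whole session.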
Then rerun Private GPT as always:
This solved the issue for me. I also think the first command should be updated in the official documentation. On a side note:
Hey, thank you for that numpy part!
I've just opened a PR to add a CUDA-compatible Dockerfile with fixes for these problems, can you try it?
Sorry, I cannot do it anymore.
Hi,
I'm trying to set up Private GPT on Windows WSL.
I followed the instructions here and here, but I'm not able to correctly run PGPT.
If I follow these instructions:
poetry install --extras "ui llms-llama-cpp embeddings-huggingface vector-stores-qdrant"
I'm able to run PGPT with numpy 1.26.4 but with BLAS=0 (CPU).
If I run this instead:
CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
I get BLAS=1 (GPU) but it automatically upgrades numpy to a 2.x version and PGPT doesn't work because it gives an error like "A module that was compiled using NumPy 1.x cannot be run in NumPy 2.0.0 as it may crash".
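To see whether you have hit this incompatibility, check the major version of the numpy that actually ends up installed. A minimal sketch (the helper name `numpy_major_ok` is hypothetical, not part of PGPT):

```python
# Hypothetical helper: decide whether an installed numpy version is
# still on the 1.x line that PGPT's compiled dependencies expect.
def numpy_major_ok(version: str) -> bool:
    return int(version.split(".")[0]) < 2

print(numpy_major_ok("1.26.4"))  # → True
print(numpy_major_ok("2.0.0"))   # → False
```

In practice you would feed it `numpy.__version__` (e.g. `python -c "import numpy; print(numpy.__version__)"`).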
Is there a way I can downgrade numpy AND use GPU (BLAS=1)?