GPU not being used #7
Comments
I found the somewhat hidden 'Setup visual studio community for llamacpp.odt'. I completed everything in there (Visual Studio with C++ was already set up for some ComfyUI nodes), and the problem persists.
I think I figured out what the problem is. It seems that for some people, setting CMAKE_ARGS has no effect, so I took the advice from abetlen/llama-cpp-python#284 (comment): first, clone the llama-cpp-python repository with the --recurse-submodules option. Then, in vendor/llama.cpp, edit CMakeLists.txt and change LLAMA_CUBLAS to ON. Then create a venv in the llama-cpp-python directory and run the install from there. Only after building it that way was I able to get it to use the GPU.
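Roughly, the steps look like this (a sketch only; the exact install command wasn't quoted above, so `pip install .` is an assumption, as is the in-tree venv layout):

```sh
# The usual route, which reportedly has no effect for some setups:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install --force-reinstall --no-cache-dir llama-cpp-python

# The workaround: hard-code the flag in the vendored llama.cpp and build from source.
git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python
cd llama-cpp-python
# edit vendor/llama.cpp/CMakeLists.txt: set LLAMA_CUBLAS to ON
python -m venv venv
source venv/bin/activate        # on Windows: venv\Scripts\activate
pip install .                   # assumed install command
```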
Interesting, so I'll copy that into the somewhat hidden docs.
I'm unable to use the llmware one-click installer because I'm using a cloud provider, which makes Docker a no-go, so I went with the llama_index one. Everything seems to be working, but extremely slowly. This leads me to believe that the GPU (a Quadro RTX 6000) is not being used. I saw that there is a check_gpu_enabled.py, so I edited the model path in it, and the output contains `BLAS = 0`.
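For reference, a minimal stand-in for that check (not the repo's script; the model path and layer count below are placeholders) is:

```sh
# A CUDA-enabled llama-cpp-python build prints "BLAS = 1" among its
# system-info lines when a model is loaded with verbose output;
# "BLAS = 0" indicates a CPU-only build. Model path is a placeholder.
python -c "from llama_cpp import Llama; Llama(model_path='/path/to/model.bin', n_gpu_layers=32, verbose=True)"
```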
As a side note, the model that is automatically downloaded is different from what is listed in the readme for llama_index.

Activating the environment and running `pip show torch` gives: …

Automatic1111, ComfyUI, Oobabooga, etc. all work fine with the GPU, so I must be missing something here. Any tips to get it to use the GPU?
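A quick, separate sanity check that the environment's torch can see the GPU at all (this doesn't affect llama.cpp, which is compiled independently of torch):

```sh
# Prints whether torch sees CUDA and which CUDA version it was built with.
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
```

Even if this prints True, llama-cpp-python has to be rebuilt with CUDA support for the `BLAS = 0` above to change.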