Description
- I installed `llama-cpp-python` successfully with CUBLAS on my system with the following command:

  ```
  CUDACXX=/usr/local/cuda/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir
  ```
- When I try to start using it, the interpreter crashes immediately on importing the module:

  ```
  $ python
  Python 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0] on linux
  Type "help", "copyright", "credits" or "license" for more information.
  >>> from llama_cpp import Llama
  Illegal instruction (core dumped)
  ```
- This also affects text-generation-webui with CUBLAS enabled, so I cannot load any llama.cpp model there either.
- System: Ubuntu 20.04, RTX 3060 12 GB, 64 GB RAM, CUDA 12.1.105
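For completeness, the workaround I would try next is rebuilding with the optional SIMD extensions disabled and checking whether the import then succeeds. A hedged sketch: it assumes the `LLAMA_AVX2`/`LLAMA_FMA`/`LLAMA_F16C` CMake options exposed by llama.cpp builds of this era, and I have not verified it on this machine:

```
# Rebuild from source with AVX2/FMA/F16C disabled; if the import
# then works, the prior build used instructions this CPU lacks.
CUDACXX=/usr/local/cuda/bin/nvcc \
CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native -DLLAMA_AVX2=off -DLLAMA_FMA=off -DLLAMA_F16C=off" \
FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall
```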