
When compiling with cuBLAS, cmake ignores -DLLAMA_AVX2=OFF and builds a binary that attempts to use AVX2 #284

Closed
jamesqh opened this issue May 27, 2023 · 4 comments
Labels
build · duplicate (This issue or pull request already exists) · hardware (Hardware specific issue) · windows (A Windoze-specific issue)

Comments


jamesqh commented May 27, 2023

Expected Behavior

I have a CUDA-capable card and a CPU that doesn't support AVX2, and I want to build llama-cpp-python for CUDA. I can compile the latest llama.cpp in my (x64!!) Visual Studio environment with cmake, and it works, detecting no AVX2 and CUDA out of the box without any arguments. It gives me a binary that runs perfectly fine and prints the expected system info:

n_threads = 4 / 8 | AVX = 1 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0

So in theory it should be possible.
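For reference, a minimal sketch of the direct llama.cpp build I mean, with the flags spelled out explicitly (from an x64 Native Tools Command Prompt inside a llama.cpp checkout; paths are illustrative, and the two -D flags are the options llama.cpp's own CMakeLists.txt defines):

D:\llamastuff\llama.cpp>mkdir build

D:\llamastuff\llama.cpp>cd build

D:\llamastuff\llama.cpp\build>cmake .. -DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON

D:\llamastuff\llama.cpp\build>cmake --build . --config Release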

With llama-cpp-python I run these commands:


(venv) D:\llamastuff\test>set FORCE_CMAKE=1

(venv) D:\llamastuff\test>set CMAKE_ARGS="-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON"

(venv) D:\llamastuff\test>pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir

(venv) D:\llamastuff\test>python -c "from llama_cpp import *; print(llama_print_system_info())"

And I expect to get the same system info as I do for llama.cpp, AVX2=0 and BLAS=1. I also expect to be able to load models and run them!

Current Behavior

Instead I get this system info:


(venv) D:\llamastuff\test>python -c "from llama_cpp import *; print(llama_print_system_info())"
b'AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 | '

AVX2 = 1, and when I try to run any model it just errors out with Windows error 0xc000001d (illegal instruction), because I don't actually have any AVX2 for it to use.

The same thing happens if I swap the order of the arguments and use set CMAKE_ARGS="-DLLAMA_CUBLAS=ON -DLLAMA_AVX2=OFF".
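A quick way to double-check that the CPU really lacks AVX2, assuming the third-party py-cpuinfo package (not something this build pulls in):

(venv) D:\llamastuff\test>pip install py-cpuinfo

(venv) D:\llamastuff\test>python -c "import cpuinfo; print('avx2' in cpuinfo.get_cpu_info()['flags'])"
False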

Environment and Context

i7-3770, Windows 10 Enterprise 64 bit 10.0.19044, Visual Studio 2022, cl.exe 19.35.32217.1 for x64, cmake version 3.25.1-msvc1, Python 3.10.4, pip 23.1.2

Failure Logs

Here is a verbose compile log:

(venv) D:\llamastuff\test>set FORCE_CMAKE=1

(venv) D:\llamastuff\test>set CMAKE_ARGS="-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON"

(venv) D:\llamastuff\test>pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose
Using pip 23.1.2 from D:\llamastuff\test\venv\lib\site-packages\pip (python 3.10)
Collecting llama-cpp-python
  Downloading llama_cpp_python-0.1.55.tar.gz (1.4 MB)
     ---------------------------------------- 1.4/1.4 MB 397.4 kB/s eta 0:00:00
  Running command pip subprocess to install build dependencies
  Collecting setuptools>=42
    Using cached setuptools-67.8.0-py3-none-any.whl (1.1 MB)
  Collecting scikit-build>=0.13
    Using cached scikit_build-0.17.5-py3-none-any.whl (82 kB)
  Collecting cmake>=3.18
    Using cached cmake-3.26.3-py2.py3-none-win_amd64.whl (33.0 MB)
  Collecting ninja
    Using cached ninja-1.11.1-py2.py3-none-win_amd64.whl (313 kB)
  Collecting distro (from scikit-build>=0.13)
    Using cached distro-1.8.0-py3-none-any.whl (20 kB)
  Collecting packaging (from scikit-build>=0.13)
    Using cached packaging-23.1-py3-none-any.whl (48 kB)
  Collecting tomli (from scikit-build>=0.13)
    Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
  Collecting wheel>=0.32.0 (from scikit-build>=0.13)
    Using cached wheel-0.40.0-py3-none-any.whl (64 kB)
  Installing collected packages: ninja, cmake, wheel, tomli, setuptools, packaging, distro, scikit-build
  Successfully installed cmake-3.26.3 distro-1.8.0 ninja-1.11.1 packaging-23.1 scikit-build-0.17.5 setuptools-67.8.0 tomli-2.0.1 wheel-0.40.0
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  running egg_info
  writing llama_cpp_python.egg-info\PKG-INFO
  writing dependency_links to llama_cpp_python.egg-info\dependency_links.txt
  writing requirements to llama_cpp_python.egg-info\requires.txt
  writing top-level names to llama_cpp_python.egg-info\top_level.txt
  reading manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
  Getting requirements to build wheel ... done
  Running command Preparing metadata (pyproject.toml)
  running dist_info
  creating C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info
  writing C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\PKG-INFO
  writing dependency_links to C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\dependency_links.txt
  writing requirements to C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\requires.txt
  writing top-level names to C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\top_level.txt
  writing manifest file 'C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\SOURCES.txt'
  reading manifest file 'C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\SOURCES.txt'
  creating 'C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python-0.1.55.dist-info'
  Preparing metadata (pyproject.toml) ... done
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
  Downloading typing_extensions-4.6.2-py3-none-any.whl (31 kB)
Building wheels for collected packages: llama-cpp-python
  Running command Building wheel for llama-cpp-python (pyproject.toml)


  --------------------------------------------------------------------------------
  -- Trying 'Ninja (Visual Studio 17 2022 x64 v143)' generator
  --------------------------------
  ---------------------------
  ----------------------
  -----------------
  ------------
  -------
  --
  Not searching for unused variables given on the command line.
  -- The C compiler identification is MSVC 19.35.32217.1
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: D:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- The CXX compiler identification is MSVC 19.35.32217.1
  CMake Warning (dev) at C:/Users/Vardogger/AppData/Local/Temp/pip-build-env-zzt_1t3c/overlay/Lib/site-packages/cmake/data/share/cmake-3.26/Modules/CMakeDetermineCXXCompiler.cmake:168 (if):
    Policy CMP0054 is not set: Only interpret if() arguments as variables or
    keywords when unquoted.  Run "cmake --help-policy CMP0054" for policy
    details.  Use the cmake_policy command to set the policy and suppress this
    warning.

    Quoted variables like "MSVC" will no longer be dereferenced when the policy
    is set to NEW.  Since the policy is not set the OLD behavior will be used.
  Call Stack (most recent call first):
    CMakeLists.txt:4 (ENABLE_LANGUAGE)
  This warning is for project developers.  Use -Wno-dev to suppress it.

  CMake Warning (dev) at C:/Users/Vardogger/AppData/Local/Temp/pip-build-env-zzt_1t3c/overlay/Lib/site-packages/cmake/data/share/cmake-3.26/Modules/CMakeDetermineCXXCompiler.cmake:189 (elseif):
    Policy CMP0054 is not set: Only interpret if() arguments as variables or
    keywords when unquoted.  Run "cmake --help-policy CMP0054" for policy
    details.  Use the cmake_policy command to set the policy and suppress this
    warning.

    Quoted variables like "MSVC" will no longer be dereferenced when the policy
    is set to NEW.  Since the policy is not set the OLD behavior will be used.
  Call Stack (most recent call first):
    CMakeLists.txt:4 (ENABLE_LANGUAGE)
  This warning is for project developers.  Use -Wno-dev to suppress it.

  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: D:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Configuring done (2.7s)
  -- Generating done (0.0s)
  -- Build files have been written to: C:/Users/Vardogger/AppData/Local/Temp/pip-install-nrzfuccz/llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253/_cmake_test_compile/build
  --
  -------
  ------------
  -----------------
  ----------------------
  ---------------------------
  --------------------------------
  -- Trying 'Ninja (Visual Studio 17 2022 x64 v143)' generator - success
  --------------------------------------------------------------------------------

  Configuring Project
    Working directory:
      C:\Users\Vardogger\AppData\Local\Temp\pip-install-nrzfuccz\llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253\_skbuild\win-amd64-3.10\cmake-build
    Command:
      'C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\cmake\data\bin/cmake.exe' 'C:\Users\Vardogger\AppData\Local\Temp\pip-install-nrzfuccz\llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253' -G Ninja '-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\ninja\data\bin\ninja' -D_SKBUILD_FORCE_MSVC=1930 --no-warn-unused-cli '-DCMAKE_INSTALL_PREFIX:PATH=C:\Users\Vardogger\AppData\Local\Temp\pip-install-nrzfuccz\llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253\_skbuild\win-amd64-3.10\cmake-install' -DPYTHON_VERSION_STRING:STRING=3.10.4 -DSKBUILD:INTERNAL=TRUE '-DCMAKE_MODULE_PATH:PATH=C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\skbuild\resources\cmake' '-DPYTHON_EXECUTABLE:PATH=D:\llamastuff\test\venv\Scripts\python.exe' '-DPYTHON_INCLUDE_DIR:PATH=D:\python\python310\Include' '-DPYTHON_LIBRARY:PATH=D:\python\python310\libs\python310.lib' '-DPython_EXECUTABLE:PATH=D:\llamastuff\test\venv\Scripts\python.exe' '-DPython_ROOT_DIR:PATH=D:\llamastuff\test\venv' -DPython_FIND_REGISTRY:STRING=NEVER '-DPython_INCLUDE_DIR:PATH=D:\python\python310\Include' '-DPython_LIBRARY:PATH=D:\python\python310\libs\python310.lib' '-DPython3_EXECUTABLE:PATH=D:\llamastuff\test\venv\Scripts\python.exe' '-DPython3_ROOT_DIR:PATH=D:\llamastuff\test\venv' -DPython3_FIND_REGISTRY:STRING=NEVER '-DPython3_INCLUDE_DIR:PATH=D:\python\python310\Include' '-DPython3_LIBRARY:PATH=D:\python\python310\libs\python310.lib' '-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\ninja\data\bin\ninja' '"-DLLAMA_AVX2=OFF' '-DLLAMA_CUBLAS=ON"' -DCMAKE_BUILD_TYPE:STRING=Release '-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON'

  Not searching for unused variables given on the command line.
  CMake Warning:
    Ignoring extra path from command line:

     ""-DLLAMA_AVX2=OFF"


  -- The C compiler identification is MSVC 19.35.32217.1
  -- The CXX compiler identification is MSVC 19.35.32217.1
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: D:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: D:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: D:/Program Files/Git/cmd/git.exe (found version "2.40.0.windows.1")
  fatal: not a git repository (or any of the parent directories): .git
  fatal: not a git repository (or any of the parent directories): .git
  CMake Warning at vendor/llama.cpp/CMakeLists.txt:109 (message):
    Git repository not found; to enable automatic generation of build info,
    make sure Git is installed and the project is a Git repository.


  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
  -- Looking for pthread_create in pthreads
  -- Looking for pthread_create in pthreads - not found
  -- Looking for pthread_create in pthread
  -- Looking for pthread_create in pthread - not found
  -- Found Threads: TRUE
  -- Found CUDAToolkit: D:/Program Files/CUDA Toolkit/include (found version "11.7.64")
  -- cuBLAS found
  -- The CUDA compiler identification is NVIDIA 11.7.64
  -- Detecting CUDA compiler ABI info
  -- Detecting CUDA compiler ABI info - done
  -- Check for working CUDA compiler: D:/Program Files/CUDA Toolkit/bin/nvcc.exe - skipped
  -- Detecting CUDA compile features
  -- Detecting CUDA compile features - done
  -- CMAKE_SYSTEM_PROCESSOR: AMD64
  -- x86 detected
  -- GGML CUDA sources found, configuring CUDA architecture
  -- Configuring done (19.2s)
  -- Generating done (0.0s)
  -- Build files have been written to: C:/Users/Vardogger/AppData/Local/Temp/pip-install-nrzfuccz/llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253/_skbuild/win-amd64-3.10/cmake-build
  -- Install configuration: "Release"
  -- Installing: C:/Users/Vardogger/AppData/Local/Temp/pip-install-nrzfuccz/llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253/_skbuild/win-amd64-3.10/cmake-install/llama_cpp/llama.dll
  [1/5] Building C object vendor\llama.cpp\CMakeFiles\ggml.dir\ggml.c.obj
  [2/5] Building CXX object vendor\llama.cpp\CMakeFiles\llama.dir\llama.cpp.obj
  [3/5] Building CUDA object vendor\llama.cpp\CMakeFiles\ggml.dir\ggml-cuda.cu.obj
  ggml-cuda.cu

  [4/5] Linking CXX shared library bin\llama.dll
  [4/5] Install the project...

  copying llama_cpp\llama.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.py
  copying llama_cpp\llama_cpp.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_cpp.py
  copying llama_cpp\llama_types.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_types.py
  copying llama_cpp\__init__.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\__init__.py
  creating directory _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server
  copying llama_cpp/server\app.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server\app.py
  copying llama_cpp/server\__init__.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server\__init__.py
  copying llama_cpp/server\__main__.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server\__main__.py

  running bdist_wheel
  running build
  running build_py
  creating _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310
  creating _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_cpp.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_types.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\__init__.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  creating _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\server\app.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\server\__init__.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\server\__main__.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
  copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.dll -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
  copied 7 files
  running build_ext
  installing to _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel
  running install
  running install_lib
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\llama_cpp
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama.dll -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama_cpp.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama_types.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\llama_cpp\server
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server\app.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp\server
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server\__init__.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp\server
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server\__main__.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp\server
  copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\__init__.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
  copied 8 files
  running install_egg_info
  running egg_info
  writing llama_cpp_python.egg-info\PKG-INFO
  writing dependency_links to llama_cpp_python.egg-info\dependency_links.txt
  writing requirements to llama_cpp_python.egg-info\requires.txt
  writing top-level names to llama_cpp_python.egg-info\top_level.txt
  reading manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
  adding license file 'LICENSE.md'
  writing manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
  Copying llama_cpp_python.egg-info to _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp_python-0.1.55-py3.10.egg-info
  running install_scripts
  copied 0 files
  C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\wheel\bdist_wheel.py:100: RuntimeWarning: Config variable 'Py_DEBUG' is unset, Python ABI tag may be incorrect
    if get_flag("Py_DEBUG", hasattr(sys, "gettotalrefcount"), warn=(impl == "cp")):
  creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\llama_cpp_python-0.1.55.dist-info\WHEEL
  creating 'C:\Users\Vardogger\AppData\Local\Temp\pip-wheel-oqcck55p\.tmp-mdlva4hn\llama_cpp_python-0.1.55-cp310-cp310-win_amd64.whl' and adding '_skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel' to it
  adding 'llama_cpp/__init__.py'
  adding 'llama_cpp/llama.dll'
  adding 'llama_cpp/llama.py'
  adding 'llama_cpp/llama_cpp.py'
  adding 'llama_cpp/llama_types.py'
  adding 'llama_cpp/server/__init__.py'
  adding 'llama_cpp/server/__main__.py'
  adding 'llama_cpp/server/app.py'
  adding 'llama_cpp_python-0.1.55.dist-info/LICENSE.md'
  adding 'llama_cpp_python-0.1.55.dist-info/METADATA'
  adding 'llama_cpp_python-0.1.55.dist-info/WHEEL'
  adding 'llama_cpp_python-0.1.55.dist-info/top_level.txt'
  adding 'llama_cpp_python-0.1.55.dist-info/RECORD'
  removing _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.1.55-cp310-cp310-win_amd64.whl size=289667 sha256=af2b25b6a7ee2f16e2e2bd7e6027f0292ec8d14429f27dc9dbc60b8f0cc79d43
  Stored in directory: C:\Users\Vardogger\AppData\Local\Temp\pip-ephem-wheel-cache-bw7hp7lu\wheels\e3\c8\b2\3b99086798b666cdff1000d0995fd164d3eb9db7b7fe4aca09
Successfully built llama-cpp-python
Installing collected packages: typing-extensions, llama-cpp-python
  Attempting uninstall: typing-extensions
    Found existing installation: typing_extensions 4.6.2
    Uninstalling typing_extensions-4.6.2:
      Removing file or directory d:\llamastuff\test\venv\lib\site-packages\__pycache__\typing_extensions.cpython-310.pyc
      Removing file or directory d:\llamastuff\test\venv\lib\site-packages\typing_extensions-4.6.2.dist-info\
      Removing file or directory d:\llamastuff\test\venv\lib\site-packages\typing_extensions.py
      Successfully uninstalled typing_extensions-4.6.2
Successfully installed llama-cpp-python-0.1.55 typing-extensions-4.6.2

(venv) D:\llamastuff\test>python -c "from llama_cpp import *; print(llama_print_system_info())"
b'AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 | '
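Note the Command: line in the configure step above: the flags arrive both mangled, as '"-DLLAMA_AVX2=OFF' and '-DLLAMA_CUBLAS=ON"' with stray quotes (which is what triggers the "Ignoring extra path" warning), and, at the very end, as the single unsplit string '-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON'. One plausible contributing factor, untested: cmd.exe's set stores surrounding quotes as part of the variable's value, so setting it without quotes might avoid the mangling:

(venv) D:\llamastuff\test>set CMAKE_ARGS=-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON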
@gjmulder added the build, hardware (Hardware specific issue), and windows (A Windoze-specific issue) labels on May 27, 2023
@real-limitless

I wonder if this is what is happening on Linux as well in #272. I keep getting ILLEGAL INSTRUCTION on Linux every time I build a new library with cuBLAS support.


real-limitless commented May 28, 2023

Is it possible that this is a bug in llama.cpp itself instead? ggerganov/llama.cpp#809


jamesqh commented May 28, 2023

I can confirm that @chen369's suggestion in #272 lets me compile successfully. Cloning the repo, editing vendor/llama.cpp/CMakeLists.txt to set AVX2 OFF on line 56 and CUBLAS ON on line 70, then doing the pip install and setup from there with FORCE_CMAKE=ON and no other args, gives me a working module with:

AVX = 1 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0

So a workaround exists - thanks @chen369!
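For anyone following along, the edits are to the two option() declarations in vendor/llama.cpp/CMakeLists.txt; at this version they look roughly like the following, with the defaults flipped from ON and OFF:

option(LLAMA_AVX2   "llama: enable AVX2"   OFF)
option(LLAMA_CUBLAS "llama: use cuBLAS"    ON)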

If it turns out to be a llama.cpp bug, though, I'll be quite confused. For me, llama.cpp's cmake automatically detects CUDA and no AVX2 without me telling it anything; it's only when llama-cpp-python is building with -DLLAMA_CUBLAS=ON that it chooses AVX2 and ignores the arg saying not to. It seems more like a problem in the way llama-cpp-python is talking to cmake. But I don't really understand that whole business, so maybe it is!

Okay, I really don't know how I managed it earlier; I don't think the environment variables would have been affecting it even if I accidentally had them active. But yes, after checking again, it turns out that cmake building llama.cpp with no arguments does not intelligently detect CUDA and no AVX2. So that's an upstream issue.

But cmake building llama.cpp with -DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON works fine, as expected; CMAKE_ARGS should be passing that through, and it isn't.

@gjmulder added the duplicate (This issue or pull request already exists) label on May 29, 2023
@gjmulder closed this as not planned (won't fix, can't repro, duplicate, stale) on Jun 14, 2023
@Zuzia-Sweetheart

I have the same issue, but I use Linux. Is there a way for me to make llama not try to use AVX2 with cuBLAS?
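For reference, the bash equivalent of the CMAKE_ARGS invocation used above would be roughly the following — untested here, and with the same caveat that the flags may not survive the trip to cmake:

CMAKE_ARGS="-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir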
