
[feat] Port to windows #22

Open
4 tasks done
chraac opened this issue Feb 6, 2025 · 10 comments · May be fixed by #24
chraac commented Feb 6, 2025

Tasks

  • add an interface for loading dynamic libraries
  • implement the Linux .so loader
  • implement the Windows DLL loader
  • fix compile errors on Windows

For detailed build instructions, please refer to #22 (comment)
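
The loader abstraction described in the first task can be sketched as a thin cross-platform wrapper over the two native APIs (a hedged sketch under assumed naming, not the code from this branch; dl_handle, dl_load, dl_get_sym, and dl_unload are illustrative names):

```cpp
// Cross-platform dynamic-library loading: dlopen/dlsym/dlclose on POSIX,
// LoadLibrary/GetProcAddress/FreeLibrary on Windows.
#include <string>

#ifdef _WIN32
#include <windows.h>

using dl_handle = HMODULE;

inline dl_handle dl_load(const std::string &path) {
    // Like dlopen, LoadLibraryA returns nullptr on failure.
    return LoadLibraryA(path.c_str());
}

inline void *dl_get_sym(dl_handle handle, const std::string &name) {
    return reinterpret_cast<void *>(GetProcAddress(handle, name.c_str()));
}

inline void dl_unload(dl_handle handle) {
    FreeLibrary(handle);
}
#else
#include <dlfcn.h>

using dl_handle = void *;

inline dl_handle dl_load(const std::string &path) {
    return dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
}

inline void *dl_get_sym(dl_handle handle, const std::string &name) {
    return dlsym(handle, name.c_str());
}

inline void dl_unload(dl_handle handle) {
    dlclose(handle);
}
#endif
```

On POSIX the handle is the void * returned by dlopen; on Windows it is an HMODULE, with GetProcAddress filling the role of dlsym.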

@chraac chraac converted this from a draft issue Feb 6, 2025
@chraac chraac self-assigned this Feb 6, 2025
@chraac chraac moved this from Backlog to In progress in qnn backend Feb 6, 2025

chraac commented Feb 8, 2025

@chraac Is it possible to build the llama.cpp QNN backend for a laptop? I have a Snapdragon X Elite laptop chip which has an NPU. I checked the CMakeLists.txt in the ggml-qnn folder and found that the QNN backend build only supports Android devices.

Hi David, currently our QNN backend only supports Android devices. I understand there are Qualcomm devices that run Windows, and after reviewing the source code, I've identified some modifications needed for Windows support:

  1. Windows uses different APIs from dlopen/dlclose for dynamic library loading, so we'll need to create an abstraction layer and implement a Windows-specific version.
  2. I can work on the implementation and verify that it compiles on my machine, but I don't have a Snapdragon laptop for testing, so I'd need your help verifying the functionality. We could handle this work in a separate branch or issue.
  3. I've created a backlog item on the GitHub project to track the work; you can have a look: https://github.com/users/chraac/projects/2/views/1

For sure, I'm willing to help verify the functionality! I'm also deep-diving into the llama.cpp QNN backend, and I'm willing to help support more ops.

Originally posted by @Davidqian123 in #14

I thought we could move our discussion here; I've created a new issue and branch.


chraac commented Feb 11, 2025

Status update:
It now compiles successfully on Windows with VS2022 and the arm64-windows-llvm-debug profile:

...
  [136/186] Linking CXX executable bin\test-tokenizer-0.exe
  [137/186] Linking CXX executable bin\test-log.exe
  [138/186] Linking CXX executable bin\test-gguf.exe
  [139/186] Building CXX object examples/run/CMakeFiles/llama-run.dir/run.cpp.obj
  [140/186] Linking CXX executable bin\test-backend-ops.exe
  [141/186] Linking CXX executable bin\test-quantize-fns.exe
  [142/186] Linking CXX executable bin\test-rope.exe
  [143/186] Linking CXX executable bin\test-barrier.exe
  [144/186] Linking CXX executable bin\test-model-load-cancel.exe
  [145/186] Linking CXX executable bin\test-quantize-perf.exe
  [146/186] Linking CXX executable bin\test-arg-parser.exe
  [147/186] Linking CXX executable bin\llama-lookup-merge.exe
  [148/186] Linking CXX executable bin\test-autorelease.exe
  [149/186] Linking CXX executable bin\llama-infill.exe
  [150/186] Linking CXX executable bin\llama-gguf-split.exe
  [151/186] Linking CXX executable bin\llama-embedding.exe
  [152/186] Linking CXX executable bin\test-chat-template.exe
  [153/186] Linking CXX executable bin\llama-batched-bench.exe
  [154/186] Linking CXX executable bin\llama-gritlm.exe
  [155/186] Linking CXX executable bin\llama-batched.exe
  [156/186] Linking CXX executable bin\llama-lookup-stats.exe
  [157/186] Linking CXX executable bin\llama-lookahead.exe
  [158/186] Linking CXX executable bin\llama-imatrix.exe
  [159/186] Linking CXX executable bin\llama-eval-callback.exe
  [160/186] Linking CXX executable bin\llama-lookup-create.exe
  [161/186] Linking CXX executable bin\llama-quantize.exe
  [162/186] Linking CXX executable bin\llama-perplexity.exe
  [163/186] Linking CXX executable bin\llama-cli.exe
  [164/186] Linking CXX executable bin\llama-passkey.exe
  [165/186] Linking CXX executable bin\llama-bench.exe
  [166/186] Linking CXX executable bin\llama-parallel.exe
  [167/186] Linking CXX executable bin\llama-lookup.exe
  [168/186] Linking CXX executable bin\llama-retrieval.exe
  [169/186] Linking CXX executable bin\llama-save-load-state.exe
  [170/186] Linking CXX executable bin\llama-speculative-simple.exe
  [171/186] Linking CXX executable bin\llama-vdot.exe
  [172/186] Linking CXX executable bin\llama-q8dot.exe
  [173/186] Linking CXX executable bin\llama-speculative.exe
  [174/186] Linking CXX executable bin\llama-convert-llama2c-to-ggml.exe
  [175/186] Linking CXX executable bin\llama-run.exe
  [176/186] Linking CXX executable bin\llama-tokenize.exe
  [177/186] Linking CXX executable bin\llama-export-lora.exe
  [178/186] Linking CXX executable bin\llama-tts.exe
  [179/186] Linking CXX executable bin\llama-cvector-generator.exe
  [180/186] Linking CXX executable bin\llama-gen-docs.exe
  [181/186] Linking CXX executable bin\llama-minicpmv-cli.exe
  [182/186] Linking CXX executable bin\llama-llava-cli.exe
  [183/186] Linking CXX executable bin\llama-qwen2vl-cli.exe
  [184/186] Generating index.html.gz.hpp
  [185/186] Building CXX object examples/server/CMakeFiles/llama-server.dir/server.cpp.obj
  [186/186] Linking CXX executable bin\llama-server.exe

Rebuild succeeded.

@Davidqian123

What command did you use to compile? Did you set -DGGML_QNN=ON?


chraac commented Feb 12, 2025

What command did you use to compile? Did you set -DGGML_QNN=ON?

Yeah, you can try it with the latest VS2022. Make sure the clang toolchain is installed, and add the following variables to CMakePresets.json:

{
    "name": "arm64-windows-llvm", "hidden": true,
    "architecture": { "value": "arm64",    "strategy": "external" },
    "toolset":      { "value": "host=x64", "strategy": "external" },
    "cacheVariables": {
        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/arm64-windows-llvm.cmake",
+        "GGML_QNN": "ON",
+        "GGML_QNN_SDK_PATH": "<path to your Qualcomm AI Engine Direct SDK>",
+        "BUILD_SHARED_LIBS": "OFF"
    }
},


chraac commented Feb 15, 2025

How to build on Windows

  1. Download the Qualcomm AI Engine Direct SDK from here, then extract it into a folder

  2. Install the latest Visual Studio; make sure the clang toolchain and 'cmake' components are installed

  3. Launch VS2022, tap Continue without code, then in the File menu select Open -> CMake; in the file open dialog, navigate to the llama.cpp root directory and select CMakeLists.txt

  4. Edit llama.cpp\CMakePresets.json, adding the following lines to the arm64-windows-llvm config

    {
        "name": "arm64-windows-llvm", "hidden": true,
        "architecture": { "value": "arm64",    "strategy": "external" },
        "toolset":      { "value": "host=x64", "strategy": "external" },
        "cacheVariables": {
    -        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/arm64-windows-llvm.cmake"
    +        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/arm64-windows-llvm.cmake",
    +        "GGML_QNN": "ON",
    +        "GGML_QNN_SDK_PATH": "<path to your Qualcomm AI Engine Direct SDK, like x:/ml/qnn_sdk/qairt/2.31.0.250130/>",
    +        "BUILD_SHARED_LIBS": "OFF"
        }
    },
  5. Select the arm64-windows-llvm-debug config

  6. In the Build menu, tap Build All; output files are located at build-arm64-windows-llvm-debug\bin\

  7. Before running, copy these files from the SDK lib directory (at qairt\2.31.0.250130\lib\aarch64-windows-msvc\) to the output directory:

QnnSystem.dll
QnnCpu.dll
QnnGpu.dll
QnnHtp.dll
QnnHtp*.dll


liurunliang commented Feb 16, 2025

Hi, I gave it a try on my X Elite laptop. It looks like libcdsprpc.so is named libcdsprpc.dll on Windows.
qnn-lib.cpp
The file 2.31.0.250130\examples\Genie\Genie\src\qualla\engines\qnn-api\RpcMem.cpp in the QNN SDK mentions this path:

bool RpcMem::initialize() {
  // On Android, 32-bit and 64-bit libcdsprpc.so can be found at /vendor/lib and /vendor/lib64
  // respectively. On Windows, it's installed into something like this
  //      c:\Windows\System32\DriverStore\FileRepository\qcnspmcdm8380.inf_arm64_30b9cc995571de6a\libcdsprpc.dll
#ifdef _WIN32
  const char* dsprpc_so = "libcdsprpc.dll";
#else
  const char* dsprpc_so = "libcdsprpc.so";
#endif

The driver path mentioned in the documentation does indeed contain this file.
Before running, you need to copy libcdsprpc.dll from that path into the same directory as the binary.
Also, at startup it cannot find QnnSystem.dll:

[ggml_backend_qnn_init_with_device_context, 338]: extend_lib_search_path is nullptr, will use / as default
[ggml_backend_qnn_init_with_device_context, 343]: device qnn-gpu
[ggml_backend_qnn_init_with_device_context, 344]: extend_lib_search_path /
[qnn_init, 49]: enter qnn_init
[load_system, 314]: system_lib_path:/QnnSystem.dll
[load_system, 318]: can not load QNN library /QnnSystem.dll, error: (null)
[qnn_init, 53]: failed to load QNN system lib
[ggml_backend_qnn_init_with_device_context, 380]: failed to init qnn backend qnn-gpu
llama_init_from_model: failed to initialize qnn-gpu backend

It seems the default value "/" of extend_lib_search_path does not work on Windows. If you build it yourself, you need to add an extra QNN_DEFAULT_LIB_SEARCH_PATH entry in CMakePresets.json:

{
    "name": "arm64-windows-llvm", "hidden": true,
    "architecture": { "value": "arm64",    "strategy": "external" },
    "toolset":      { "value": "host=x64", "strategy": "external" },
    "cacheVariables": {
-        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/arm64-windows-llvm.cmake"
+        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/arm64-windows-llvm.cmake",
+        "GGML_QNN": "ON",
+        "GGML_QNN_SDK_PATH": "<path to your Qualcomm AI Engine Direct SDK, like x:/ml/qnn_sdk/qairt/2.31.0.250130/>",
+        "BUILD_SHARED_LIBS": "OFF",
+        "QNN_DEFAULT_LIB_SEARCH_PATH": "<path to your Qualcomm AI Engine Direct SDK's aarch64-windows-msvc path, like x:/ml/qnn_sdk/qairt/2.31.0.250130/lib/aarch64-windows-msvc>"
    }
},


chraac commented Feb 16, 2025

Thank you @liurunliang for testing, it helps a lot!

Hi, I gave it a try on my X Elite laptop. It looks like libcdsprpc.so is named libcdsprpc.dll on Windows.

For the name of libcdsprpc.dll, I've pushed a change to fix it; please update to the latest version.

Also, at startup it cannot find QnnSystem.dll.

Yeah, thank you again for pointing this out. I forgot to mention that the related QNN shared libraries should be copied into the binary folder; I'll update the instructions to list all the DLLs we need.

It seems the default value "/" of extend_lib_search_path does not work on Windows. If you build it yourself, you need to add an extra QNN_DEFAULT_LIB_SEARCH_PATH entry in CMakePresets.json.

That's what I'm going to look into next. Originally, on Windows, QNN_DEFAULT_LIB_SEARCH_PATH was designed to be an empty string, so I'll take a deeper look to figure out the DLL search path here.
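
For context, the failing log above shows the backend producing "/QnnSystem.dll", which suggests the search path is simply prepended to the library name. A hypothetical sketch of that path-building logic, and of why an empty default behaves better on Windows (make_lib_path is illustrative, not the function in this repo):

```cpp
#include <string>

// Join a library search path and a library name the way the log output
// above suggests ("/" + "QnnSystem.dll" -> "/QnnSystem.dll").
std::string make_lib_path(const std::string &search_path, const std::string &lib_name) {
    if (search_path.empty()) {
        // Empty search path: return the bare name and let the OS loader
        // resolve it (LoadLibrary's DLL search order on Windows,
        // the standard dlopen search on Linux).
        return lib_name;
    }
    if (search_path.back() == '/' || search_path.back() == '\\') {
        return search_path + lib_name;
    }
    return search_path + "/" + lib_name;
}
```

With search_path "/", the result "/QnnSystem.dll" is an absolute root path that does not exist on Windows, which matches the "can not load QNN library" error in the log.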


chraac commented Feb 19, 2025

It seems the default value "/" of extend_lib_search_path does not work on Windows. If you build it yourself, you need to add an extra QNN_DEFAULT_LIB_SEARCH_PATH entry in CMakePresets.json.

Hi @liurunliang, I've pushed some fixes to dev-run-on-win; could you give it another try when you have time? Thanks!

@chraac chraac linked a pull request Feb 20, 2025 that will close this issue

liurunliang commented Feb 22, 2025

Hi @chraac It works great! thank you~



chraac commented Feb 22, 2025

Hi @chraac It works great! thank you~

Nice, thanks! I'll merge this into dev-refactoring soon.
