Bug: RPC server doesn't load GPU if I use Vulkan #8536

Closed
metal3d opened this issue Jul 17, 2024 · 5 comments · Fixed by #9714
Assignees: rgerganov
Labels: bug-unconfirmed, low severity (used to report low severity bugs in llama.cpp, e.g. cosmetic issues, non-critical UI glitches)

Comments

@metal3d (Contributor) commented Jul 17, 2024

What happened?

I compiled llama.cpp with the Vulkan backend. The rpc-server binary is linked against libvulkan, but it never uses my GPUs, whereas llama-cli works fine.

Name and Version

version: 3384 (4e24cff)
built with cc (GCC) 14.1.1 20240701 (Red Hat 14.1.1-7) for x86_64-redhat-linux

What operating system are you seeing the problem on?

Linux

Relevant log output

./rpc-server
create_backend: using CPU backend
Starting RPC server on 0.0.0.0:50052, backend memory: 23967 MB


ldd ./rpc-server
        linux-vdso.so.1 (0x00007f18759f2000)
        libllama.so => /home/metal3d/Projects/ML/llama.cpp/build-rpc/src/libllama.so (0x00007f1875879000)
        libggml.so => /home/metal3d/Projects/ML/llama.cpp/build-rpc/ggml/src/libggml.so (0x00007f1875400000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f1875000000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f187531c000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f187582b000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f1874e0f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f18759f4000)
        libvulkan.so.1 => /lib64/libvulkan.so.1 (0x00007f18757af000)
        libgomp.so.1 => /lib64/libgomp.so.1 (0x00007f18752c6000)
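
The log above shows why the report is confusing at first glance: rpc-server links libvulkan only transitively through libggml, but the server still has to initialize one backend at startup. A minimal sketch of that selection pattern, assuming the usual preprocessor-guarded structure (ggml_backend_cuda_init, ggml_backend_vk_init and ggml_backend_cpu_init are real ggml entry points; the surrounding logic is an illustration, not the actual rpc-server source):

// Illustrative sketch only (assumption, not the actual rpc-server code):
// the server initializes exactly one backend at startup and falls back to
// the CPU backend when no GPU backend is compiled in or selected.
#include <cstdio>
#include "ggml-backend.h"
#ifdef GGML_USE_CUDA
#include "ggml-cuda.h"
#endif
#ifdef GGML_USE_VULKAN
#include "ggml-vulkan.h"
#endif

static ggml_backend_t create_backend() {
#ifdef GGML_USE_CUDA
    fprintf(stderr, "%s: using CUDA backend\n", __func__);
    return ggml_backend_cuda_init(0);   // device 0
#elif defined(GGML_USE_VULKAN)
    fprintf(stderr, "%s: using Vulkan backend\n", __func__);
    return ggml_backend_vk_init(0);     // what a Vulkan path could look like once supported
#else
    fprintf(stderr, "%s: using CPU backend\n", __func__);
    return ggml_backend_cpu_init();
#endif
}
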
metal3d added the bug-unconfirmed and low severity labels on Jul 17, 2024
@rgerganov (Collaborator) commented:

The Vulkan backend uses the tensor->extra property, which is not supported by the RPC backend. The same issue exists with the SYCL backend (PR #7682).
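
To make the limitation concrete: the RPC backend marshals tensors over a socket, so only plain, position-independent fields can be sent to the remote server. A rough sketch of what such a wire representation looks like (a simplified illustration, not the actual ggml-rpc serialization format):

#include <cstdint>

// Simplified illustration of an RPC tensor descriptor (the field set is an
// assumption, not the real ggml-rpc struct). Only plain data survives serialization.
struct rpc_tensor_sketch {
    uint64_t id;           // remote handle identifying the tensor
    uint32_t type;         // ggml data type
    uint64_t ne[4];        // number of elements per dimension
    uint64_t nb[4];        // strides in bytes
    uint64_t buffer;       // remote buffer handle
    uint64_t data_offset;  // offset of the tensor data inside that buffer
    // There is no way to carry tensor->extra: it is a host pointer to
    // backend-private state and is meaningless in the remote process.
};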

@xvim commented Sep 4, 2024

Is there any plan to support Vulkan when using the RPC backend?

@rgerganov (Collaborator) commented:

I will try to find a way to avoid using tensor->extra in Vulkan, perhaps by adding a global map from ggml_tensor to ggml_tensor_extra_gpu.
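
A rough sketch of that suggestion (hypothetical; as the next comment explains, the merged fix removed the extras instead): keep the per-tensor GPU state in a map owned by the Vulkan backend rather than hanging it off tensor->extra, so the tensor struct itself stays free of backend-private pointers.

#include <unordered_map>

struct ggml_tensor;            // from ggml.h
struct ggml_tensor_extra_gpu;  // Vulkan-backend-private per-tensor state

// Hypothetical global map replacing tensor->extra (sketch of the idea only).
static std::unordered_map<const ggml_tensor *, ggml_tensor_extra_gpu *> g_vk_extras;

static ggml_tensor_extra_gpu * vk_get_extra(const ggml_tensor * t) {
    auto it = g_vk_extras.find(t);
    return it == g_vk_extras.end() ? nullptr : it->second;
}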

rgerganov self-assigned this on Sep 5, 2024
@slaren (Collaborator) commented Sep 5, 2024

The extras in the Vulkan backend are not really necessary: all the data they contain is already present (directly or indirectly) in other fields of the tensor. At this point I think they are only there for legacy reasons, but they could be removed with a refactor.
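
In other words, whatever the extra stored can be recomputed from fields the tensor already carries. A minimal sketch of the idea, assuming the extra mainly held the device buffer and the byte offset into it (this illustrates the reasoning, not the code merged in #9714):

#include <cstdint>
#include "ggml-backend.h"

// The byte offset of a tensor inside its backend buffer can be derived from
// tensor->data and the buffer's base address, so no backend-private extra is needed.
static size_t vk_tensor_offset(const struct ggml_tensor * t) {
    uint8_t * base = (uint8_t *) ggml_backend_buffer_get_base(t->buffer);
    return (uint8_t *) t->data - base;
}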

rgerganov added a commit to rgerganov/llama.cpp that referenced this issue Sep 10, 2024
This patch allows using the Vulkan backend with the RPC backend as
tensor->extra is no longer used.

Ref: ggerganov#8536
ggerganov pushed a commit that referenced this issue Oct 2, 2024
* vulkan : do not use tensor->extra

This patch allows using the Vulkan backend with the RPC backend as
tensor->extra is no longer used.

Ref: #8536

* Adapt GGML_VULKAN_CHECK_RESULTS to extra removal (#2)

---------

Co-authored-by: 0cc4m <picard12@live.de>
rgerganov added a commit to rgerganov/llama.cpp that referenced this issue Oct 2, 2024
rgerganov mentioned this issue on Oct 2, 2024
@metal3d (Contributor, Author) commented Oct 3, 2024 via email

dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this issue Oct 29, 2024
dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this issue Oct 29, 2024