Hi all, regarding the Metal inference (Apple Silicon only) feature from #1642: I already tried it, but it doesn't seem to be plug & play for the server. How do I modify the server to enable this feature? Thanks @FSSRepo. Sorry to bother you again, brother :)
Hello, you need to configure the project with the Metal option and rebuild:

```sh
cd llama.cpp/build
# Configure CMake to build with the Metal option:
cmake .. -DLLAMA_METAL=ON
cmake --build . --config Release
```

Then run the server with GPU acceleration:

```sh
./server -m modelfile -ngl 1
```

Change the `-ngl` value to offload a number of layers to the GPU.
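Once the server is up, you can sanity-check it with a request to its `/completion` endpoint. A minimal sketch, assuming the default host and port (`127.0.0.1:8080`) of the llama.cpp example server:

```sh
# Send a completion request to the running server
# (host, port, and JSON fields assume the example server's defaults)
curl --request POST \
  --url http://127.0.0.1:8080/completion \
  --header "Content-Type: application/json" \
  --data '{"prompt": "Building a website can be done in 10 simple steps:", "n_predict": 128}'
```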
@FSSRepo thanks, I never thought it would be that simple. I'll try it out.

Edit: Wow, it just works!