
Enabling server for Metal inference (Apple Silicon only) #1722

Closed · Answered by FSSRepo
x4080 asked this question in Q&A

Hello, you need to configure the project with Metal.

cd llama.cpp
mkdir -p build   # create the build directory if it does not exist yet
cd build

Configure CMake to build with the Metal option enabled:

cmake .. -DLLAMA_METAL=ON
cmake --build . --config Release
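
If the build succeeds, the server binary may land either in the build root or in a bin/ subdirectory, depending on the CMake setup; a quick way to locate it (these paths are the usual llama.cpp defaults, not guaranteed):

ls ./server ./bin/server 2>/dev/null   # list whichever of the two paths exists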

Run the server with GPU acceleration:

./server -m modelfile -ngl 1

Change the -ngl value to set how many layers are offloaded to the GPU.
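
As a fuller sketch, assume a hypothetical quantized 7B model at ./models/7B/ggml-model-q4_0.bin (a 7B model has 32 layers, so -ngl 32 offloads all of them) and the server's usual --host and --port flags:

# hypothetical model path; -ngl 32 offloads every layer of a 7B model
./server -m ./models/7B/ggml-model-q4_0.bin -ngl 32 --host 127.0.0.1 --port 8080

Once it is listening, the server can be exercised over HTTP. Assuming the /completion endpoint and the prompt/n_predict JSON fields documented in the server example:

# request 64 tokens of completion for the given prompt
curl http://127.0.0.1:8080/completion -H 'Content-Type: application/json' \
  -d '{"prompt": "Building a website can be done in 10 simple steps:", "n_predict": 64}'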


Answer selected by x4080