Llama.cpp a lot slower when using cmake compared to using w64devkit #594
v4lentin1879 asked this question in Q&A (Unanswered)
Did you make sure you are building in Release mode? (CMAKE_BUILD_TYPE=Release)
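For reference, a minimal sketch of what that looks like from the llama.cpp source directory (the `build` directory name is just a convention):

```shell
# Configure with an optimized Release build. With single-config generators
# (Makefiles, Ninja), leaving CMAKE_BUILD_TYPE unset typically produces an
# unoptimized build, which can be dramatically slower at inference.
cmake -B build -DCMAKE_BUILD_TYPE=Release

# Build; --config Release matters for multi-config generators (e.g. Visual Studio).
cmake --build build --config Release
```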
I'm running mistral-orca on llama.cpp on my Mac. The application I'm using it for runs on macOS and Windows; it's a TypeScript application using the node-llama-cpp library, which requires me to build the llama.cpp binaries with CMake.
My issue is that the llama.cpp binaries are a lot slower when built with CMake than when built with w64devkit on Windows. I'm not even at the step where I wrap the node-llama-cpp library around them yet.
Does anyone know why the CMake build is this slow while the w64devkit build isn't? Is there a flag or similar that could fix this? I'm running CPU-only inference on a 2019 MacBook Pro with a 6-core i7.
w64devkit:
llama_print_timings: load time = 2789.31 ms
llama_print_timings: sample time = 7.55 ms / 18 runs ( 0.42 ms per token, 2383.16 tokens per second)
llama_print_timings: prompt eval time = 1925.06 ms / 20 tokens ( 96.25 ms per token, 10.39 tokens per second)
llama_print_timings: eval time = 8256.93 ms / 18 runs ( 458.72 ms per token, 2.18 tokens per second)
llama_print_timings: total time = 23842.73 ms
cmake:
llama_print_timings: load time = 4133.27 ms
llama_print_timings: sample time = 5.71 ms / 18 runs ( 0.32 ms per token, 3153.47 tokens per second)
llama_print_timings: prompt eval time = 25917.85 ms / 19 tokens ( 1364.10 ms per token, 0.73 tokens per second)
llama_print_timings: eval time = 43493.77 ms / 18 runs ( 2416.32 ms per token, 0.41 tokens per second)
llama_print_timings: total time = 74989.23 ms
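A prompt-eval slowdown of this magnitude (10.39 vs 0.73 tokens per second) is consistent with an unoptimized build. One way to check what the CMake build was actually configured with, assuming the build directory is named `build`:

```shell
# List the cached configuration variables without re-running the configure step;
# look for CMAKE_BUILD_TYPE (empty or "Debug" means an unoptimized build).
cmake -L -N build | grep CMAKE_BUILD_TYPE
```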
Thanks a lot for your help!