when not to use LLAMAFILE CPP flag #10338

freebiesoft · 2024-11-16T16:20:05Z

By default, both the Makefile and the CMake files pass the GGML_USE_LLAMAFILE C preprocessor (CPP) flag to the compiler. Justine Tunney remarks that "the speedup works best for prompts having fewer than 1,000 tokens" (https://justine.lol/matmul/).

Is there a prompt size beyond which performance with the GGML_USE_LLAMAFILE flag becomes worse than without it?

Replies: 0 comments

Select a reply

when not to use LLAMAFILE CPP flag #10338

freebiesoft Nov 16, 2024

Replies: 0 comments

freebiesoft
Nov 16, 2024