when not to use LLAMAFILE CPP flag #10338
Unanswered
freebiesoft
asked this question in
Q&A
Replies: 0 comments
By default, both the Makefile and the CMake files pass the GGML_USE_LLAMAFILE CPP flag to the compiler. Justine Tunney remarks that "the speedup works best for prompts having fewer than 1,000 tokens" (https://justine.lol/matmul/).
Is there a prompt size beyond which performance with the GGML_USE_LLAMAFILE flag becomes worse than without it?