Bug: llama-bench: split-mode flag doesn't recognize argument 'none' #9501

Open
letter-v opened this issue Sep 16, 2024 · 1 comment
Labels: bug-unconfirmed, low severity (used to report low severity bugs in llama.cpp, e.g. cosmetic issues, non-critical UI glitches)

letter-v commented Sep 16, 2024

What happened?

Example command:

llama-bench -m /path/to/model.gguf -p 0 -ctk q4_0 -ctv q4_0 -fa -sm none -mg 0 -ngl 50 -n 128,256,512

Example output:

error: invalid parameter for argument: none
usage: llama-bench [options]

Using -sm=none instead doesn't produce any error output, but llama-bench seems to disregard it and offloads to the integrated GPU anyway. For what it's worth, llama-cli and llama-server work fine with -sm none:

llama-server -c 64000 -fa -ctk q4_0 -ctv q4_0 -ngl 50 -m /path/to/model.gguf --port 8188 -sm none -mg 0

Name and Version

llama-bench doesn't seem to have a --version flag, but llama-cli --version prints out the following:

version: 3761 (6262d13e)
built with cc (GCC) 14.2.1 20240801 (Red Hat 14.2.1-1) for x86_64-redhat-linux

What operating system are you seeing the problem on?

Linux

Relevant log output

No response

ggerganov (Owner) commented:

llama-bench requires -fa to be given a value (e.g. -fa 1), in contrast to llama-server, which uses just -fa as a flag.

Try this:

llama-bench -m /path/to/model.gguf -p 0 -ctk q4_0 -ctv q4_0 -fa 1 -sm none -mg 0 -ngl 50 -n 128,256,512
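If I understand llama-bench's argument handling correctly, most of its options take a value and accept comma-separated lists, with each combination benchmarked as a separate run. So a sketch that covers both flash-attention settings in one invocation (assuming -fa accepts the same list syntax as the other options) would be:

llama-bench -m /path/to/model.gguf -p 0 -ctk q4_0 -ctv q4_0 -fa 0,1 -sm none -mg 0 -ngl 50 -n 128,256,512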
