
Running llama.cpp on android just prints out the question #712


Closed
Shreyas-ITB opened this issue Apr 2, 2023 · 2 comments
Labels
android (Issues specific to Android), stale

Comments

@Shreyas-ITB

I ran llama.cpp on my Android phone, which has 8 threads and 8 GB of RAM, of which around 7.16 GB is available. That should be more than enough to run the 7B Alpaca model. But when I run it, it just repeats the question I give it. I am using the ./examples/chat.sh script. Why does it do that, and how do I fix it?
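As a quick sanity check on the memory claim, free RAM can be compared against the model's footprint before launching. This is a minimal sketch assuming a Linux/Termux environment that exposes /proc/meminfo; the ~4.2 GB figure for a 4-bit-quantized 7B model is a rough approximation, not an exact requirement.

```shell
# Rough pre-flight check: is there enough free RAM for a ~4.2 GB q4_0 7B model?
avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
need_kb=$((4200 * 1024))   # ~4.2 GB, approximate size of a 4-bit 7B model
if [ "$avail_kb" -ge "$need_kb" ]; then
  echo "enough free memory"
else
  echo "not enough free memory"
fi
```

If this reports insufficient memory, the OS may be swapping or the model may fail to load fully, which can also produce degenerate output.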

@cmp-nct
Contributor

cmp-nct commented Apr 2, 2023

Just guessing: after the prompt is processed, there can be a noticeable delay before the completion starts.
Also, the interactive modes wait for Return/Enter before generating.
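One way to tell those two cases apart is to bypass chat.sh and run the binary non-interactively. This is a hedged sketch: the model path is an assumption, and the flags shown (`-m`, `-t`, `-n`, `-p`) are the basic ones llama.cpp's main example accepted at the time; without `-i`/`--interactive-first`, generation starts directly from the prompt instead of waiting for Enter.

```shell
# Non-interactive run: if output appears after a delay, the earlier "repeats
# the question" behavior was likely interactive mode waiting for input.
MODEL=./models/ggml-alpaca-7b-q4.bin   # assumed model path
./main -m "$MODEL" -t 8 -n 64 -p "Why is the sky blue?"
```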

@sw added the android label on Apr 4, 2023
@github-actions bot added the stale label on Mar 25, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.


3 participants