Update llama.cpp.rst
llama.cpp just updated their program names; I've updated the article to use the new names.

quantize -> llama-quantize
main -> llama-cli
simple -> llama-simple

[Check out the PR](ggerganov/llama.cpp#7809)
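The rename above is a simple one-to-one mapping, which can be sketched as a small shell helper; this `new_name` function is purely illustrative and not part of llama.cpp:

```shell
#!/bin/sh
# Map pre-rename llama.cpp binary names to the new llama- prefixed names.
# Illustrative helper only -- llama.cpp itself ships no such script.
new_name() {
  case "$1" in
    quantize) echo "llama-quantize" ;;
    main)     echo "llama-cli" ;;
    simple)   echo "llama-simple" ;;
    *)        echo "$1" ;;  # names not listed in this commit pass through
  esac
}

new_name main   # prints llama-cli
```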
NoumaanAhamed authored Jul 3, 2024
1 parent 0c28a94 commit 3dca3c1
Showing 1 changed file with 3 additions and 3 deletions.
docs/source/run_locally/llama.cpp.rst (3 additions, 3 deletions)
@@ -55,14 +55,14 @@ Then you can run the model with the following command:

.. code:: bash
-   ./main -m qwen2-7b-instruct-q5_k_m.gguf -n 512 --color -i -cml -f prompts/chat-with-qwen.txt
+   ./llama-cli -m qwen2-7b-instruct-q5_k_m.gguf -n 512 --color -i -cml -f prompts/chat-with-qwen.txt
where ``-n`` refers to the maximum number of tokens to generate. There
are other hyperparameters for you to choose and you can run

.. code:: bash
-   ./main -h
+   ./llama-cli -h
to figure them out.

@@ -92,7 +92,7 @@ Then you can run the test with the following command:

.. code:: bash
-   ./perplexity -m models/7B/ggml-model-q4_0.gguf -f wiki.test.raw
+   ./llama-perplexity -m models/7B/ggml-model-q4_0.gguf -f wiki.test.raw
where the output is like

