Update llama.cpp.rst (#739)
* Update llama.cpp.rst

llama.cpp just updated its program names, so I've updated the article to use the new names.

quantize -> llama-quantize
main -> llama-cli
simple -> llama-simple

[Check out the PR](ggerganov/llama.cpp#7809)
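
For example (a hedged sketch, not taken from this PR; the GGUF file names are placeholders), the same steps now read:

```bash
# Old binary names (pre-rename):
./quantize qwen2-7b-instruct-f16.gguf qwen2-7b-instruct-q5_k_m.gguf Q5_K_M
./main -m qwen2-7b-instruct-q5_k_m.gguf -n 512 -p "Hello"

# New binary names (post-rename):
./llama-quantize qwen2-7b-instruct-f16.gguf qwen2-7b-instruct-q5_k_m.gguf Q5_K_M
./llama-cli -m qwen2-7b-instruct-q5_k_m.gguf -n 512 -p "Hello"
```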

* Updated llama.cpp.rst (removed -cml)

* Update llama.cpp.rst

---------

Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
NoumaanAhamed and jklj077 authored Jul 4, 2024
1 parent 0c28a94 commit 3fcad1b
Showing 1 changed file with 7 additions and 3 deletions.
10 changes: 7 additions & 3 deletions docs/source/run_locally/llama.cpp.rst
@@ -55,14 +55,18 @@ Then you can run the model with the following command:

.. code:: bash
-   ./main -m qwen2-7b-instruct-q5_k_m.gguf -n 512 --color -i -cml -f prompts/chat-with-qwen.txt
+   ./llama-cli -m qwen2-7b-instruct-q5_k_m.gguf \
+     -n 512 -co -i -if -f prompts/chat-with-qwen.txt \
+     --in-prefix "<|im_start|>user\n" \
+     --in-suffix "<|im_end|>\n<|im_start|>assistant\n" \
+     -ngl 80 -fa
where ``-n`` refers to the maximum number of tokens to generate. There
are other hyperparameters for you to choose from, and you can run

.. code:: bash
-   ./main -h
+   ./llama-cli -h
to figure them out.
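
As a hedged illustration (not part of this diff), the help output covers standard ``llama-cli`` sampling flags such as ``--temp``, ``--top-p``, and ``--repeat-penalty``, using the same model file as above:

.. code:: bash

   ./llama-cli -m qwen2-7b-instruct-q5_k_m.gguf \
     -n 512 --temp 0.7 --top-p 0.8 --repeat-penalty 1.1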

@@ -92,7 +96,7 @@ Then you can run the test with the following command:

.. code:: bash
-   ./perplexity -m models/7B/ggml-model-q4_0.gguf -f wiki.test.raw
+   ./llama-perplexity -m <gguf_path> -f wiki.test.raw
where the output is like
