
Commit

Update gguf.rst
jklj077 authored Jul 4, 2024
1 parent 3fcad1b commit d439e04
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/source/quantization/gguf.rst
@@ -33,7 +33,7 @@ of quantizing the model to 4 bits is shown below:

.. code:: bash
-   ./quantize models/7B/qwen2-7b-instruct-fp16.gguf models/7B/qwen2-7b-instruct-q4_0.gguf q4_0
+   ./llama-quantize models/7B/qwen2-7b-instruct-fp16.gguf models/7B/qwen2-7b-instruct-q4_0.gguf q4_0
where we use ``q4_0`` for the 4-bit quantization. Until now, you have
finished quantizing a model to 4 bits and putting it into a GGUF file,
@@ -79,7 +79,7 @@ below:

.. code:: bash
-   ./quantize models/7B/qwen2-7b-instruct-fp16.gguf models/7B/qwen2-7b-instruct-q2_k.gguf q2_k
+   ./llama-quantize models/7B/qwen2-7b-instruct-fp16.gguf models/7B/qwen2-7b-instruct-q2_k.gguf q2_k
We now provide GGUF models in the following quantization levels:
``q2_k``, ``q3_k_m``, ``q4_0``, ``q4_k_m``, ``q5_0``, ``q5_k_m``,
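As an aside (not part of the diff itself), the change above reflects llama.cpp renaming its `quantize` binary to `llama-quantize`. A minimal, hypothetical Python sketch of how one might assemble the updated command line for the quantization levels listed in the doc:

```python
# Illustrative sketch only: builds the argv for llama.cpp's llama-quantize
# (renamed from ./quantize). The helper name and level set are assumptions
# drawn from the levels quoted in gguf.rst, not part of the commit.

def quantize_command(fp16_gguf: str, out_gguf: str, level: str) -> list[str]:
    """Return the llama-quantize argv for a given quantization level."""
    levels = {"q2_k", "q3_k_m", "q4_0", "q4_k_m", "q5_0", "q5_k_m"}
    if level not in levels:
        raise ValueError(f"unknown quantization level: {level}")
    return ["./llama-quantize", fp16_gguf, out_gguf, level]

cmd = quantize_command(
    "models/7B/qwen2-7b-instruct-fp16.gguf",
    "models/7B/qwen2-7b-instruct-q4_0.gguf",
    "q4_0",
)
print(" ".join(cmd))
```

Running this prints the exact command shown in the `+` line of the first hunk.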
