
Commit 7fab66d

ikawrakow (Kawrakow) authored and committed
Add ability to use importance matrix for all k-quants (ggml-org#4930)
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
1 parent 2a728c4 commit 7fab66d

File tree: 4 files changed (+462, −16 lines)


examples/quantize/quantize.cpp

+1 −1
@@ -82,7 +82,7 @@ static void usage(const char * executable) {
 printf(" --allow-requantize: Allows requantizing tensors that have already been quantized. Warning: This can severely reduce quality compared to quantizing from 16bit or 32bit\n");
 printf(" --leave-output-tensor: Will leave output.weight un(re)quantized. Increases model size but may also increase quality, especially when requantizing\n");
 printf(" --pure: Disable k-quant mixtures and quantize all tensors to the same type\n");
-printf(" --imatrixfile_name: use data in file_name as importance matrix for quant optimizations\n");
+printf(" --imatrix file_name: use data in file_name as importance matrix for quant optimizations\n");
 printf(" --include-weights tensor_name: use importance matrix for this/these tensor(s)\n");
 printf(" --exclude-weights tensor_name: use importance matrix for this/these tensor(s)\n");
 printf("Note: --include-weights and --exclude-weights cannot be used together\n");

0 commit comments