ktransformers_ext/bench #120

Open
wanwujie1 opened this issue Jan 12, 2025 · 1 comment

Comments

@wanwujie1

I ran all the scripts under the bench directory on the CPU. PyTorch's native fp16 and int8 are faster than ggml_type's fp16, q8_0, q4_k_m, and q2_k, and the performance differences among the ggml_type variants (fp16, q8_0, q4_k_m, q2_k) are also minimal. Is this normal?

@chenht2022
Contributor

Have you ensured that the number of threads in cpuinfer does not exceed the number of physical CPU cores? You can set it in bench_xxx.py.
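
For reference, here is a minimal sketch of capping the thread count at the number of physical cores before running a benchmark. The `psutil` query is standard; the `cpuinfer_ext.CPUInfer(num_threads)` construction is an assumption about how bench_xxx.py builds the CPU backend, so adapt it to the actual variable and call site in the script:

```python
import psutil          # pip install psutil; used to count physical cores
import cpuinfer_ext    # the ktransformers_ext C++ extension (assumed import name)

# Hyper-threaded sibling cores usually do not help the memory-bound kernels,
# so cap the thread count at the number of physical cores.
physical_cores = psutil.cpu_count(logical=False)

# Assumed construction: pass the thread count when building the CPU backend.
cpu_infer = cpuinfer_ext.CPUInfer(physical_cores)
```

If the thread count in the bench script is left higher than the physical core count, the ggml_type kernels can oversubscribe the cores, which would explain both the slowdown relative to PyTorch and the flat results across quantization levels.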
