Will llama.cpp take advantage of FP4 compute of the new chips? #11517

Sehyo · 2025-01-30T16:32:49Z

Will llama.cpp take advantage of the FP4 compute of the new chips, e.g. Blackwell GPUs?

ExtReMLapin · 2025-04-12T07:11:07Z

I could be wrong, but if it uses MMQ, isn't it automatically using the specialized instructions as soon as you have CUDA 12.8 installed?

@JohannesGaessler might be able to enlighten us on that

Hardware FP4 is currently not used anywhere in the llama.cpp/GGML code, so whether or not the feature is available makes no difference.