Commit f5ef5cf

ggml-cuda : perform cublas mat mul of quantized types as f16 (ggml-org#3412)
* ggml-cuda : perform cublas matrix multiplication of quantized types as fp16
* rename CC_TURING to CC_VOLTA
* disable fp16 mat mul completely with multi GPU
1 parent 40e07a6 commit f5ef5cf

1 file changed: +122 -72 lines changed

