bartowski/DeepSeek-Coder-V2-Lite-Instruct-GGUF won't run #976
-
Is there some special way to run bartowski/DeepSeek-Coder-V2-Lite-Instruct-GGUF?
Replies: 2 comments
-
What backend are you trying to use? Right now there is a bug that prevents it from working correctly on any backend except the CUDA/cuBLAS one, see ggerganov#7118 (comment). I have a partial fix for it that will be in the next version. Also, you cannot use any quant of Q3_K_S or below, as they contain IQ4_NL tensors, a type that is not supported on any non-CUDA backend. I actually asked ikawrakow about it but didn't get a reply - @bartowski do you know why there are IQ4_NL tensors in your regular Q2_K and Q3_K_S quants? I do have older Q2_K and Q3_K_S quants that work fine.
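If anyone wants to check what their download actually contains, here is a minimal sketch using the gguf-py package that ships with llama.cpp (the filename is just a placeholder for whichever quant you grabbed); it counts how many tensors of each quant type are stored in a GGUF file:

```python
# Minimal sketch: list the quant types inside a GGUF file using gguf-py.
# The filename below is a placeholder - point it at your own download.
from collections import Counter

from gguf import GGUFReader
from gguf.constants import GGMLQuantizationType

reader = GGUFReader("DeepSeek-Coder-V2-Lite-Instruct-Q3_K_S.gguf")

# reader.tensors is a list of ReaderTensor entries; tensor_type is the
# GGMLQuantizationType of each tensor as stored in the file.
counts = Counter(t.tensor_type for t in reader.tensors)
for qtype, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{GGMLQuantizationType(qtype).name}: {n} tensors")
```

Any IQ4_NL rows showing up in that output for a Q2_K or Q3_K_S file would confirm the mixed-type quants I'm talking about.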
-
Pasting my reply from HF:
I think it has something to do with the shape of the tensor not being divisible by 256 (see the sketch at the end of this reply).
Same thing on other people's quants too:
https://huggingface.co/mradermacher/DeepSeek-Coder-V2-Lite-Instruct-GGUF/tree/main?show_file_info=DeepSeek-Coder-V2-Lite-Instruct.Q2_K.gguf
Here's a comment from Slaren explaining a similar thing that happened with Qwen2 on my P100:
ggerganov#7805 (comment)
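To make the divisibility point concrete, here is a rough sketch (my own illustration, not code from llama.cpp): the k-quants such as Q2_K and Q3_K pack weights into 256-element super-blocks, so a tensor whose row size is not a multiple of 256 can't be stored in those formats and the quantizer has to fall back to another type, which is where the IQ4_NL tensors come from.

```python
# Rough illustration (my own, not llama.cpp code) of the k-quant constraint.
QK_K = 256  # super-block size used by the k-quants (Q2_K, Q3_K, ...)

def fits_k_quant(row_size: int) -> bool:
    """A tensor row can only be stored as a k-quant if its size is a
    multiple of the 256-element super-block; otherwise the quantizer
    falls back to a different type (e.g. IQ4_NL)."""
    return row_size % QK_K == 0

print(fits_k_quant(4096))  # True  -> can be stored as Q2_K / Q3_K
print(fits_k_quant(576))   # False -> a fallback type gets used instead
```

The example sizes above are arbitrary; the point is just that any tensor dimension that doesn't divide evenly by 256 forces the fallback.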