Make IQ1_M work for QK_K = 64 #6327
@ggerganov Perhaps you should disable the Nix build? I don't know about you, but for me a check that runs for 6 hours and eventually gets cancelled on every commit does not make much sense. If nothing else, let's have some mercy on our planet.
@SomeoneSerge Is there something to be done to speed up the builds? AFAICT, with the recent workflow concurrency changes (#6243) all Nix builds are bound to be cancelled since the chance of committing something to
@ggerganov Thanks for the heads-up; I noticed a few cancelled builds but haven't gotten around to investigating this. I opened a tracking issue for now: #6346
If we don't want nix builds to fail on
* iq1_m: make it work for QK_K = 64 (WIP)
* iq1_m: make it work for QK_K = 64 (scalar and AVX2)
* iq1_m: QK_K = 64 seems to work on Metal and ARM_NEON

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
As with all other i-quants, this covers AVX2, ARM_NEON, scalar CPU, and Metal. CUDA will come later.