convert: allow using quantized Mistral weight #17889

ngxson · 2025-12-09T17:05:50Z

target model: https://huggingface.co/mistralai/Devstral-Small-2-24B-Instruct-2512

~~need --mistral-format, otherwise it has problems with the tokenizer~~ If you have problems with tokenizer conversion, install transformers version 5.0.0.

pwilkin · 2025-12-09T17:10:57Z

Traceback (most recent call last):
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 10635, in <module>
    main()
    ~~~~^^
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 10629, in main
    model_instance.write()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 660, in write
    self.prepare_tensors()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 2540, in prepare_tensors
    super().prepare_tensors()
    ~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 531, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
                                 ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/devel/tools/llama.cpp/convert_hf_to_gguf.py", line 2862, in modify_tensors
    assert data_torch.nelement() == 0 # unused by the model
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
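
For readers unfamiliar with the failing check: `nelement()` returns the number of elements in a tensor, so the assertion passes only for empty tensors. The sketch below uses plain shape tuples and two hypothetical helpers (`nelement`, `ndim`) instead of real torch tensors to show the difference between "no elements" and "no dimensions". That a scalar tensor (for example a quantization scale) triggered the assertion here is an assumption; the thread does not state the cause.

```python
from math import prod

def nelement(shape):
    """Element count for a tensor of the given shape (hypothetical helper)."""
    return prod(shape)  # prod of an empty tuple is 1, matching a scalar tensor

def ndim(shape):
    """Number of dimensions for a tensor of the given shape (hypothetical helper)."""
    return len(shape)

empty_shape = (0,)     # an empty placeholder tensor
scalar_shape = ()      # a zero-dimensional tensor holding one value
matrix_shape = (4, 8)  # an ordinary 2-D weight

for name, shape in [("empty", empty_shape), ("scalar", scalar_shape), ("matrix", matrix_shape)]:
    print(name, "nelement =", nelement(shape), "ndim =", ndim(shape))
```

A check on `nelement() == 0` rejects the scalar case, while a check on the number of dimensions would distinguish it from a regular weight.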

ngxson · 2025-12-09T17:26:46Z

It should work now with --mistral-format.

Co-authored-by: compilade <compilade@users.noreply.github.com>
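
As a rough illustration of the invocation discussed above, the following sketch assembles a conversion command with `--mistral-format`. The model directory and output file name are hypothetical placeholders, and the helper `build_convert_command` is introduced only for this example; it builds the argument list without running anything.

```python
import sys

def build_convert_command(model_dir, outfile):
    """Build the argument list for convert_hf_to_gguf.py (hypothetical helper)."""
    return [
        sys.executable,
        "convert_hf_to_gguf.py",
        model_dir,            # placeholder: local directory with the downloaded model
        "--mistral-format",   # flag mentioned in this thread
        "--outfile", outfile, # placeholder output path
    ]

cmd = build_convert_command("path/to/Devstral-Small-2-24B-Instruct-2512", "devstral.gguf")
print(" ".join(cmd))
# To actually run it from the llama.cpp checkout:
#   import subprocess; subprocess.run(cmd, check=True)
```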

bartowski1182 · 2025-12-09T20:22:00Z

confirming this works :)

https://huggingface.co/bartowski/mistralai_Devstral-Small-2-24B-Instruct-2512-GGUF

LGTM!

* convert: allow using quantized Mistral weight
* data_torch.ndim
* update dequant fn

Co-authored-by: compilade <compilade@users.noreply.github.com>

---------

Co-authored-by: compilade <compilade@users.noreply.github.com>

convert: allow using quantized Mistral weight

83f3004

data_torch.ndim

b246a57

loci-dev mentioned this pull request Dec 9, 2025

UPSTREAM PR #17889: convert: allow using quantized Mistral weight auroralabs-loci/llama.cpp#501

Open

update dequant fn

e564478

Co-authored-by: compilade <compilade@users.noreply.github.com>
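
The thread does not show the dequantization function itself. As a general illustration of what dequantizing a weight means, here is a minimal sketch that assumes a single per-tensor scale and uses nested Python lists instead of tensors. The function name `dequantize_per_tensor` and the per-tensor scheme are assumptions for this example, not the converter's actual code.

```python
def dequantize_per_tensor(quantized, scale):
    """Multiply every quantized value by one shared scale (illustrative only)."""
    return [[q * scale for q in row] for row in quantized]

# Hypothetical quantized 2x2 weight and its scale
quantized_weight = [[-2, 0], [1, 4]]
scale = 0.5

weight = dequantize_per_tensor(quantized_weight, scale)
print(weight)
```

Real checkpoints may use per-channel or per-block scales instead, in which case the scale has to be broadcast over the matching slice of the weight.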

github-actions bot added the python python script changes label Dec 9, 2025

ngxson marked this pull request as ready for review December 9, 2025 22:44

ngxson requested a review from CISC as a code owner December 9, 2025 22:44

CISC approved these changes Dec 9, 2025

ngxson merged commit 9e79b01 into ggml-org:master Dec 10, 2025
6 checks passed

gabe-l-hart mentioned this pull request Dec 10, 2025

feat: llama.cpp bump (17f7f4) for SSM performance improvements ollama/ollama#13408

Merged

4 participants