Hi,
Just tried to quantize Mistral-Small-3.1-24B with GPTQ using your example script.
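For reference, the flow I ran looks roughly like this (a minimal sketch of the oneshot GPTQ path; the exact checkpoint, recipe, calibration dataset, and output path are my assumptions and may differ from the actual example script):

```python
from transformers import Mistral3ForConditionalGeneration
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

# Assuming the Instruct-2503 checkpoint
MODEL_ID = "mistralai/Mistral-Small-3.1-24B-Instruct-2503"

model = Mistral3ForConditionalGeneration.from_pretrained(MODEL_ID, torch_dtype="auto")

# Assumed recipe: W4A16 GPTQ on all Linear layers, skipping only lm_head
recipe = GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"])

oneshot(
    model=model,
    dataset="open_platypus",  # assumed calibration dataset
    recipe=recipe,
    output_dir="Mistral-Small-3.1-24B-GPTQ",  # hypothetical output path
    max_seq_length=2048,
    num_calibration_samples=512,
)
```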
Quantization completed without errors, but when I try to load the quantized model with vLLM, I get the following error:
```
ValueError: There is no module or parameter named 'multi_modal_projector.patch_merger.merging_layer.weight_packed' in Mistral3ForConditionalGeneration
```
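The load itself is just the basic vLLM entry point (minimal sketch; the path is the hypothetical output dir from the sketch above):

```python
from vllm import LLM

# Loading the GPTQ checkpoint produced above; this call raises the ValueError
llm = LLM(model="Mistral-Small-3.1-24B-GPTQ")
```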
I'm on recent main-branch builds of both vLLM (0.9.2rc2.dev269+gf29fd8a7f) and llm-compressor (llmcompressor-0.6.1.dev35+g53240c63).
Thanks!