
Fix missing quant_method value #174

Merged: 2 commits from kylesayrs/require-quant_method into main on Oct 1, 2024

Conversation

kylesayrs (Contributor) commented on Sep 30, 2024

Purpose

  • Fix a bug where quantization_config is present in the saved config but quant_method is missing, causing HF transformers to fail to parse the model:
FAILED tests/llmcompressor/transformers/finetune/test_oneshot_then_finetune.py::TestOneshotThenFinetune::test_oneshot_then_finetune - ValueError: The model's quantization config from the arguments has no `quant_method` attribute. Make sure that the model has been correctly quantized
malformed_config.json
{
  "_name_or_path": "/home/ksayers/.cache/huggingface/hub/models--Xenova--llama2.c-stories15M/snapshots/ccdd47c2dc554aeecd2bb4e713e1c988f206a296",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "head_dim": 48,
  "hidden_act": "silu",
  "hidden_size": 288,
  "initializer_range": 0.02,
  "intermediate_size": 768,
  "max_position_embeddings": 256,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 6,
  "num_hidden_layers": 6,
  "num_key_value_heads": 6,
  "pretraining_tp": 1,
  "quantization_config": {
    "version": "0.6.0.20240928"
  },
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float32",
  "transformers_version": "4.45.0",
  "use_cache": true,
  "vocab_size": 32000
}
corrected_config.json
{
  "_name_or_path": "/home/ksayers/.cache/huggingface/hub/models--Xenova--llama2.c-stories15M/snapshots/ccdd47c2dc554aeecd2bb4e713e1c988f206a296",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "head_dim": 48,
  "hidden_act": "silu",
  "hidden_size": 288,
  "initializer_range": 0.02,
  "intermediate_size": 768,
  "max_position_embeddings": 256,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 6,
  "num_hidden_layers": 6,
  "num_key_value_heads": 6,
  "pretraining_tp": 1,
  "quantization_config": {
    "version": "0.6.0.20240928"
    "quant_method": "compressed-tensors"
  },
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "float32",
  "transformers_version": "4.45.0",
  "use_cache": true,
  "vocab_size": 32000
}
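
For context, a minimal sketch (not part of this PR) of patching an already-saved malformed config by hand; CONFIG_PATH is a placeholder, and the quant_method value is the one shown in the corrected config above:

import json

CONFIG_PATH = "config.json"  # placeholder path to the saved model config

with open(CONFIG_PATH) as f:
    config = json.load(f)

quant_config = config.get("quantization_config")
if quant_config is not None and "quant_method" not in quant_config:
    # llm-compressor checkpoints are stored in the compressed-tensors format
    quant_config["quant_method"] = "compressed-tensors"

with open(CONFIG_PATH, "w") as f:
    json.dump(config, f, indent=2)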

Changes

  • Always write the metadata fields (version, quant_method) if either the quantization config or the sparsity config is present
  • Unwrap these metadata fields when parsing the config back (see the sketch after this list)
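
A rough sketch of the save/load behavior described above, assuming hypothetical helper names (add_quantization_metadata and parse_quantization_metadata are illustrative, not the PR's actual functions) and an illustrative version string taken from the configs above:

import copy
from typing import Optional, Tuple

QUANT_METHOD = "compressed-tensors"      # value written by this fix
COMPRESSION_VERSION = "0.6.0.20240928"   # illustrative; the real value comes from the library

def add_quantization_metadata(
    config: dict,
    quantization_config: Optional[dict],
    sparsity_config: Optional[dict],
) -> dict:
    # Always attach version and quant_method when either sub-config is present,
    # so HF transformers can identify the quantization method at load time.
    if quantization_config is None and sparsity_config is None:
        return config
    quantization_config = copy.deepcopy(quantization_config or {})
    quantization_config["version"] = COMPRESSION_VERSION
    quantization_config["quant_method"] = QUANT_METHOD
    if sparsity_config is not None:
        quantization_config["sparsity_config"] = sparsity_config
    config["quantization_config"] = quantization_config
    return config

def parse_quantization_metadata(config: dict) -> Tuple[dict, Optional[dict]]:
    # Unwrap (pop) the metadata fields before handing the remaining dict
    # to the quantization/sparsity config parsers.
    quantization_config = copy.deepcopy(config.get("quantization_config", {}))
    quantization_config.pop("version", None)
    quantization_config.pop("quant_method", None)
    sparsity_config = quantization_config.pop("sparsity_config", None)
    return quantization_config, sparsity_config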

Testing

  • tests/llmcompressor/transformers/finetune/test_oneshot_then_finetune.py no longer raises the missing quant_method ValueError

@mgoin merged commit 4a09744 into main on Oct 1, 2024 (1 check passed).
@mgoin deleted the kylesayrs/require-quant_method branch on Oct 1, 2024.