Unable to quantize PyTorch model using Hugging Face export #161

@ishan-modi

Description

I am trying to quantize a GPT2 (PyTorch) model directly, without any ONNX conversion. Below are my code and the traceback of the error I get.

Code

from transformers import GPT2Tokenizer, GPT2Model
import modelopt.torch.quantization as mtq
from modelopt.torch.export.unified_export_hf import export_hf_checkpoint

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2Model.from_pretrained('gpt2').to('cuda')

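# 8-bit weight-only config: enable all weight quantizers, skip lm_head,
# and disable everything else by default.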
config = {
    "quant_cfg": {
        "*weight_quantizer": {"num_bits": 8, "enable": True},
        "*lm_head*": {"enable": False},
        "default": {"enable": False},
    },
    "algorithm": None,
}

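# Apply the config; with "algorithm": None, no calibration algorithm is selected.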
mq = mtq.quantize(model, config)

export_hf_checkpoint(mq, save_modelopt_state=True)
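For reference, the quantize step itself completes without error; a quick sanity check (illustrative, assuming the "*weight_quantizer" pattern above matches modelopt's quantizer module names) confirms that quantizers were inserted:

# Illustrative check: modelopt inserts quantizer submodules whose names
# the "*weight_quantizer" wildcard in the config is meant to match.
print(any('weight_quantizer' in name for name, _ in mq.named_modules()))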

Error

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-9-d1815c270de0> in <cell line: 3>()
      1 from modelopt.torch.export.unified_export_hf import export_hf_checkpoint
      2 
----> 3 export_hf_checkpoint(mq, save_modelopt_state=True)

/usr/local/lib/python3.10/dist-packages/modelopt/torch/export/unified_export_hf.py in export_hf_checkpoint(model, dtype, export_dir, save_modelopt_state)
    379             " torch.save for further inspection."
    380         )
--> 381         raise e

/usr/local/lib/python3.10/dist-packages/modelopt/torch/export/unified_export_hf.py in export_hf_checkpoint(model, dtype, export_dir, save_modelopt_state)
    342     export_dir.mkdir(parents=True, exist_ok=True)
    343     try:
--> 344         post_state_dict, hf_quant_config, per_layer_quantization = _export_hf_checkpoint(
    345             model, dtype
    346         )

/usr/local/lib/python3.10/dist-packages/modelopt/torch/export/unified_export_hf.py in _export_hf_checkpoint(model, dtype)
    148     layer_pool = {
    149         f"model.layers.{name}": sub_module
--> 150         for name, sub_module in model.model.layers.named_modules()
    151     }
    152     # NOTE: Speculative decoding models have extra modules that may be quantized

/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1929             if name in modules:
   1930                 return modules[name]
-> 1931         raise AttributeError(
   1932             f"'{type(self).__name__}' object has no attribute '{name}'"
   1933         )

AttributeError: 'GPT2Model' object has no attribute 'model'
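
From the traceback, _export_hf_checkpoint looks up the decoder layers via model.model.layers, which assumes a Llama-style module layout. GPT2Model has no .model wrapper; its transformer blocks live in model.h. A minimal check (assuming the stock transformers GPT2Model):

from transformers import GPT2Model

model = GPT2Model.from_pretrained('gpt2')
print(hasattr(model, 'model'))  # False: no `.model` wrapper on GPT2Model
print(type(model.h))            # torch.nn.ModuleList of GPT2Block layers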

Please let me know if I am missing something here, and whether this can work without an ONNX conversion. @jingyu-ml @kevalmorabia97
