
ModelHubMixin config support throws error #2379

Closed
joelburget opened this issue Jul 7, 2024 · 4 comments
Labels
bug Something isn't working

Comments


joelburget commented Jul 7, 2024

Describe the bug

I created a notebook which tries to use PyTorchModelHubMixin in a way very similar to that described in the docs and #2001. As you can see, when I try to instantiate it with MyModel.from_pretrained I get AttributeError: 'dict' object has no attribute 'hidden_size'. AutoModel.from_pretrained fails with AttributeError: 'NoneType' object has no attribute 'get'. Neither error makes the root cause clear.

Reproduction

https://gist.github.com/joelburget/623a13c71129044c661009a56b2cf46d is self-contained

Logs

No response

System info

- huggingface_hub version: 0.22.2
- Platform: macOS-14.5-x86_64-i386-64bit
- Python version: 3.11.9
- Running in iPython ?: Yes
- iPython shell: ZMQInteractiveShell
- Running in notebook ?: Yes
- Running in Google Colab ?: No
- Token path ?: /Users/joel/.cache/huggingface/token
- Has saved token ?: True
- Who am I ?: joelb
- Configured git credential helpers: osxkeychain, store
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.1.2
- Jinja2: 3.1.3
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: 10.3.0
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: N/A
- aiohttp: 3.9.3
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /Users/joel/.cache/huggingface/hub
- HF_ASSETS_CACHE: /Users/joel/.cache/huggingface/assets
- HF_TOKEN_PATH: /Users/joel/.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
joelburget added the bug label Jul 7, 2024

Wauplin commented Jul 8, 2024

Hi @joelburget, the problem in your example is that you are serializing the config object into a dictionary (config=config.to_dict()), so when you reload it you get a dictionary back. Your class forwards that config dictionary to GPTNeoBlock, which expects a transformers.configuration_utils.PretrainedConfig instance. This is why you get AttributeError: 'dict' object has no attribute 'hidden_size'. From what I can see, your notebook is not actually very similar to the docs you've linked. In the docs, we showcase that having proper parameters with type annotations, like hidden_size: int (instead of a config object), works great. You can have a look at this guide for more details.
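The failure mode described above can be sketched with plain Python and the standard json module. FakeConfig and FakeBlock below are purely illustrative stand-ins for GPTNeoConfig and GPTNeoBlock:

```python
import json

class FakeConfig:
    """Stand-in for transformers' GPTNeoConfig (illustrative only)."""
    def __init__(self, hidden_size):
        self.hidden_size = hidden_size

    def to_dict(self):
        return {"hidden_size": self.hidden_size}

class FakeBlock:
    """Stand-in for GPTNeoBlock: expects a config *object*, not a dict."""
    def __init__(self, config):
        self.hidden_size = config.hidden_size  # attribute access on the config

config = FakeConfig(hidden_size=768)

# Serializing to a dict (what config=config.to_dict() does) round-trips
# through JSON as a plain dict...
reloaded = json.loads(json.dumps(config.to_dict()))

# ...so passing it where an object is expected fails:
try:
    FakeBlock(reloaded)
except AttributeError as e:
    print(e)  # 'dict' object has no attribute 'hidden_size'

# Rebuilding a config object from the dict first works:
block = FakeBlock(FakeConfig(**reloaded))
print(block.hidden_size)  # 768
```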

In general, I'm not sure I understand what you are trying to achieve. PyTorchModelHubMixin is a class that facilitates exporting/importing torch models. It has no link to transformers. If you want to customize/adapt a transformers model, it's better to check there how to do it :)

joelburget (Author) commented

Hi @Wauplin, thanks for looking into this.

the problem in your example is that you are serializing the config object into a dictionary

I first tried model.push_to_hub("joelb/my-awesome-model", config=config), but this fails with TypeError: Object of type GPTNeoConfig is not JSON serializable.

In the docs, we showcase that having proper parameters with type annotations like hidden_size: int (instead of config) works great... If you want to customize/adapt a transformers model, it's better to check there how to do it

You're right. I was basing this on the fact that all transformers models take a config object rather than only ints (which is all the linked docs show). I can check over at the transformers repo.
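For context, the typed-kwargs pattern the docs recommend can be illustrated without torch or huggingface_hub. ToyHubMixin below is a hypothetical sketch (not the real huggingface_hub code) of the mechanism: capture JSON-serializable __init__ kwargs like hidden_size: int, store them, and replay them on reload:

```python
import inspect
import json

class ToyHubMixin:
    """Illustrative sketch of how a mixin can persist simple,
    JSON-serializable __init__ kwargs and replay them on reload.
    Not the actual PyTorchModelHubMixin implementation."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        original_init = cls.__init__

        def init_and_record(self, *args, **kw):
            # Bind the call to the signature so defaults are captured too
            bound = inspect.signature(original_init).bind(self, *args, **kw)
            bound.apply_defaults()
            self._hub_config = {k: v for k, v in bound.arguments.items() if k != "self"}
            original_init(self, *args, **kw)

        cls.__init__ = init_and_record

    def save_config(self):
        # This is why kwargs must be JSON-serializable (ints, strs, dicts...)
        return json.dumps(self._hub_config)

    @classmethod
    def from_config(cls, payload):
        return cls(**json.loads(payload))

class MyModel(ToyHubMixin):
    def __init__(self, hidden_size: int = 64, num_layers: int = 2):
        self.hidden_size = hidden_size
        self.num_layers = num_layers

m = MyModel(hidden_size=128)
reloaded = MyModel.from_config(m.save_config())
print(reloaded.hidden_size, reloaded.num_layers)  # 128 2
```

This also shows why a full GPTNeoConfig as an __init__ argument trips the mixin up: it is not JSON-serializable out of the box.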


joelburget commented Jul 9, 2024

For anyone else trying to do something similar:

import torch.nn as nn
from transformers import AutoConfig, AutoModel
from transformers.models.gpt_neo.modeling_gpt_neo import GPTNeoBlock, GPTNeoModel

hf_model = AutoModel.from_pretrained("EleutherAI/gpt-neo-125M")

class MyModel(GPTNeoModel):
    def __init__(self, config):
        super().__init__(config)
        # Replace the full stack of blocks with a single GPTNeoBlock
        self.h = nn.ModuleList([GPTNeoBlock(config, 0)])

# Shrink the config to match the single-layer model
config = AutoConfig.from_pretrained("EleutherAI/gpt-neo-125M")
config.num_layers = 1
config.attention_layers = config.attention_layers[:1]
config.attention_types = [[['global'], 1]]

model = MyModel(config)
model.push_to_hub("joelb/my-awesome-model", config=config)


Wauplin commented Jul 9, 2024

I first tried model.push_to_hub("joelb/my-awesome-model", config=config), but this fails with TypeError: Object of type GPTNeoConfig is not JSON serializable.

Glad you've found a workaround for your use case @joelburget :) Just for your info, this error tells you that the mixin doesn't know how to serialize your GPTNeoConfig object as JSON. What you can do is provide encoder and decoder methods when defining your class, as explained in this section of the guide. However, your solution, which does not involve PyTorchModelHubMixin, is much better, as it relies only on the transformers library, which is better suited to handle transformers objects.
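The encoder/decoder idea from the guide can be illustrated with plain Python. FakeGPTNeoConfig is a hypothetical stand-in; the encode/decode pair plays the role of the custom coders the guide describes:

```python
import json

class FakeGPTNeoConfig:
    """Stand-in for GPTNeoConfig (illustrative only)."""
    def __init__(self, hidden_size=768, num_layers=12):
        self.hidden_size = hidden_size
        self.num_layers = num_layers

# An encoder/decoder pair like the ones the mixin guide describes:
def encode_config(cfg):
    return {"hidden_size": cfg.hidden_size, "num_layers": cfg.num_layers}

def decode_config(data):
    return FakeGPTNeoConfig(**data)

cfg = FakeGPTNeoConfig(num_layers=1)

# json.dumps alone fails, which is the reported TypeError:
try:
    json.dumps(cfg)
except TypeError as e:
    print(e)  # Object of type FakeGPTNeoConfig is not JSON serializable

# With an explicit encoder/decoder, the round trip works:
restored = decode_config(json.loads(json.dumps(encode_config(cfg))))
print(restored.num_layers)  # 1
```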
