
Conversation


@0xjeffro 0xjeffro commented Sep 16, 2025

This is a quick fix for issue #40874. The `T5GemmaConfig` class was missing the `num_hidden_layers` attribute that cache initialization expects. Added a `num_hidden_layers` property to expose the decoder's `num_hidden_layers` value.
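The property-based fix described above can be sketched with standalone toy classes (illustrative names only, not the actual transformers source): the top-level encoder-decoder config delegates `num_hidden_layers` to its decoder sub-config so that cache setup can read it.

```python
# Toy sketch of the property fix: the hypothetical ToyT5GemmaConfig exposes
# the decoder sub-config's num_hidden_layers as a read-only property.

class ToyDecoderConfig:
    def __init__(self, num_hidden_layers=12):
        self.num_hidden_layers = num_hidden_layers


class ToyT5GemmaConfig:
    def __init__(self, decoder=None):
        self.decoder = decoder or ToyDecoderConfig()

    @property
    def num_hidden_layers(self):
        # Delegate to the decoder sub-config, which is where the real
        # layer count lives in an encoder-decoder config.
        return self.decoder.num_hidden_layers


config = ToyT5GemmaConfig(decoder=ToyDecoderConfig(num_hidden_layers=8))
print(config.num_hidden_layers)  # -> 8
```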

What does this PR do?

Fixes #40874

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

This is a quick fix for issue huggingface#40874. The `T5GemmaConfig` class was missing the `num_hidden_layers` attribute that cache initialization expects. Added `attribute_map` and `num_layers` property following the same pattern used by `T5Config` to expose the decoder's `num_hidden_layers` value.
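The `attribute_map` pattern mentioned above can be illustrated with a toy class (this is a simplified stand-in; the real `PretrainedConfig` machinery in transformers is more involved): lookups of a mapped name are redirected to the canonical attribute, so `num_hidden_layers` resolves to `num_layers`.

```python
# Toy illustration of the attribute_map remapping idea. ToyConfig and its
# __getattribute__ hook are illustrative, not the real transformers code.

class ToyConfig:
    attribute_map = {"num_hidden_layers": "num_layers"}

    def __init__(self, num_layers=6):
        self.num_layers = num_layers

    def __getattribute__(self, name):
        # Redirect mapped attribute names to their canonical counterpart.
        amap = object.__getattribute__(self, "attribute_map")
        if name in amap:
            name = amap[name]
        return object.__getattribute__(self, name)


cfg = ToyConfig(num_layers=4)
print(cfg.num_hidden_layers)  # -> 4
```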
Collaborator

@ArthurZucker ArthurZucker left a comment


forgot that it was an encoder decoder hahah there might be a better way to fix this @gante I don't remember what we do in that case

This is a quick fix for issue huggingface#40874. The `T5GemmaConfig.get_text_config()` method was always returning `self` instead of the appropriate sub-configuration. This caused cache initialization failures because the main config lacks `num_hidden_layers` while the decoder/encoder sub-configs have this attribute.

Now the method correctly returns the decoder config when `decoder=True` and encoder config when `encoder=True`, allowing `cache_utils.py` to obtain the proper configuration with `num_hidden_layers` for cache initialization.
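The corrected dispatch described above can be sketched with toy classes (illustrative names; not the real transformers implementation): `get_text_config` returns the matching sub-config instead of always returning the top-level config.

```python
# Toy sketch of the get_text_config fix: return the decoder or encoder
# sub-config on request, so callers can read num_hidden_layers from it.

class SubConfig:
    def __init__(self, num_hidden_layers):
        self.num_hidden_layers = num_hidden_layers


class ToyEncoderDecoderConfig:
    def __init__(self):
        self.encoder = SubConfig(num_hidden_layers=12)
        self.decoder = SubConfig(num_hidden_layers=6)

    def get_text_config(self, decoder=False, encoder=False):
        if decoder:
            return self.decoder
        if encoder:
            return self.encoder
        # Previous buggy behavior was equivalent to always landing here.
        return self


cfg = ToyEncoderDecoderConfig()
print(cfg.get_text_config(decoder=True).num_hidden_layers)  # -> 6
print(cfg.get_text_config(encoder=True).num_hidden_layers)  # -> 12
```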
…_T5GemmaConfig' into fix_missing_num_hidden_layers_in_T5GemmaConfig

# Conflicts:
#	src/transformers/models/t5gemma/configuration_t5gemma.py
This is a quick fix for issue huggingface#40874. The `T5GemmaConfig` class was missing the `num_hidden_layers` attribute that cache initialization expects. Added `num_hidden_layers` property to expose the decoder's `num_hidden_layers` value.
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: t5gemma

@0xjeffro 0xjeffro marked this pull request as ready for review September 16, 2025 18:29
Contributor

@gante gante left a comment


uhmmm this is not wanted, it means config.get_text_config is not fetching the decoder for this model correctly. Having a look at the original issue

Contributor

gante commented Sep 17, 2025

#40939 is the right fix 🤗

In any case, thank you for being proactive and trying to fix the issue 🫶

Contributor

gante commented Sep 17, 2025

(closing the PR so we don't mistakenly merge it)



Development

Successfully merging this pull request may close these issues.

Missing num_hidden_layers in T5GemmaConfig

3 participants