
Conversation

@gante (Contributor) commented Aug 29, 2025

What does this PR do?

This PR (🔴 = BC-breaking):

  1. Adds config.get_sub_config(). This function is an upgrade to get_text_config:
    a. Allows model-agnostic extraction of different modalities in a single function, future-proofing its uses;
    b. Uses heuristics to determine the correct encoder/decoder and modality matches (as opposed to relying on a hardcoded list of sub-config names). These heuristics save us from manually updating the matching names as new models come out, and they nudge future models towards the same patterns (👉 standardization);
    c. Related to #40277 ([generate] handle support for cache classes when num enc layers != num dec layers): when pulling encoder/decoder attributes from older encoder-decoder configs, make sure to consider config.attribute_map;
    d. Sometimes we want any text config (e.g. resizing embeddings), other times we want any decoder modality (e.g. setting up the KV cache)... and sometimes we want specifically the text decoder. get_text_config didn't support this level of control, but get_sub_config does 🤗 (see the usage sketch after this description);
    e. (b.) + (c.) + (d.) = several bug fixes + allow us to remove a few overwrites.
  2. 🔴 Deprecates get_text_config, replacing its calls with get_sub_config(modality="text", ...). This replacement is BC-breaking because of 1.b. However, in most cases, it fixes subtle bugs.
  3. Enables SlidingWindowLayer (the static one) in cross-attention cases (where no cache_position is provided).
  4. Corrects the is_encoder_decoder check in the generate tests.
  5. Fixes a few related corner cases in test_generate_continue_from_inputs_embeds and test_past_key_values_format (which allows us to delete skips/overwrites).

This PR is a follow-up to #40454 (an incomplete fix for the underlying issues, which turned out to be quite complex).
Fixes #40644 (BlenderbotForConditionalGeneration errors out with list index out of range)
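
For illustration, the proposed API could be used like this (a minimal sketch: the `modality` and `decoder` keyword arguments come from the calls visible in this PR's diff, everything else is an assumption):

from transformers import AutoConfig

# BLIP2 is a composite model with vision + text sub-configs (it comes up later
# in this conversation), so it makes a convenient example.
config = AutoConfig.from_pretrained("Salesforce/blip2-opt-2.7b")

# Any text sub-config, e.g. for resizing embeddings:
text_config = config.get_sub_config(modality="text")

# Any decoder sub-config, whatever its modality, e.g. for setting up the KV cache:
decoder_config = config.get_sub_config(decoder=True)

# Specifically the text decoder:
text_decoder_config = config.get_sub_config(modality="text", decoder=True)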

output_generate = self._greedy_generate(model=model, inputs_dict=inputs_dict)

- if model.config.get_text_config(decoder=True).is_encoder_decoder:
+ if model.config.is_encoder_decoder:
@gante (Contributor, Author) commented:

The original model.config.get_text_config(decoder=True).is_encoder_decoder was added because of BLIP2, which was not setting its outer-level is_encoder_decoder attribute correctly. This PR fixes the root issue there.

In practice, this line was not right: if we want to check whether a model is an encoder-decoder model, we can't look at its decoder config only :)
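
A minimal sketch of the distinction, using the two expressions from the diff above (assuming any composite model, e.g. BLIP2):

# The decoder sub-config describes the decoder component alone; its
# `is_encoder_decoder` flag says nothing about the composite model.
decoder_view = model.config.get_text_config(decoder=True).is_encoder_decoder  # the wrong question
composite_view = model.config.is_encoder_decoder  # what the test actually needs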

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante gante changed the title from "🔴🔴 [config] Upgrade get_text_config and fix related bugs" to "🔴🔴 [config] get_text_config no longer errors out if it finds multiple matches (and fix related bugs)" on Sep 1, 2025
@gante gante force-pushed the upgrade_get_text_config branch 2 times, most recently from 2d50d7f to 2b7f8dc on September 1, 2025 at 12:49
"""
for model_class in self.all_generative_model_classes:
if any(model_name in model_class.__name__.lower() for model_name in ["imagegpt"]):
# To be more precise: technically we can run this test on all models that have `inputs_embeds` or
@gante (Contributor, Author) commented:

tl;dr: moved the model-level overwrites here, so we can better track odd models, and updated the skip conditions

@gante gante changed the title from "🔴🔴 [config] get_text_config no longer errors out if it finds multiple matches (and fix related bugs)" to "🔴🔴 [config] Introduce get_sub_config (in place of get_text_config) and fix related bugs" on Sep 3, 2025
@gante gante changed the title from "🔴🔴 [config] Introduce get_sub_config (in place of get_text_config) and fix related bugs" to "🔴🔴 [config] Add get_sub_config (in place of get_text_config) and fix related bugs" on Sep 3, 2025
- config = config.get_text_config()
+ # We pull the decoder sub-config here to allow composite models to easily initialize the cache as
+ # `DynamicCache(config=model.config)`
+ config = config.get_sub_config(decoder=True)
@gante (Contributor, Author) commented Sep 4, 2025:
e.g. in text-to-speech models, the decoder is not a text decoder :)

(it was working before because all TTS decoders had a common name in their decoder config)
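
A minimal sketch of the pattern this diff enables (the `DynamicCache(config=model.config)` usage is the one named in the diff comment above; `model` is assumed to be any composite model, e.g. a text-to-speech model whose decoder is an audio decoder):

from transformers import DynamicCache

# `get_text_config()` would look for a *text* decoder and miss the audio one;
# `get_sub_config(decoder=True)` matches the decoder by role, whatever its modality.
decoder_config = model.config.get_sub_config(decoder=True)
cache = DynamicCache(config=model.config)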

@github-actions github-actions bot commented Sep 4, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, blip_2, clvp, colqwen2, dia, gemma3, gemma3n, idefics2, idefics3, kyutai_speech_to_text, llava_next, llava_next_video, llava_onevision, mllama, moshi, paligemma

@gante (Contributor, Author) commented Sep 19, 2025

Closing: the root issue is solved in #40939 (the remaining fixes here were targeting old models).

@gante gante closed this Sep 19, 2025
