🔴🔴 [config] Add get_sub_config (in place of get_text_config) and fix related bugs
#40553
Conversation
```diff
  output_generate = self._greedy_generate(model=model, inputs_dict=inputs_dict)
  ...
- if model.config.get_text_config(decoder=True).is_encoder_decoder:
+ if model.config.is_encoder_decoder:
```
The original `model.config.get_text_config(decoder=True).is_encoder_decoder` was added because of BLIP2, which was not setting its outer-level `is_encoder_decoder` attribute correctly. This PR fixes the root issue there.
In practice, this line was not right: if we want to check whether a model is an encoder-decoder model, we can't look at its decoder config only :)
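To make the distinction concrete, here is a minimal sketch of why the two checks measure different things on a composite model (the BLIP-2 checkpoint name is only an illustration, and the call requires network access):

```python
# Minimal sketch, assuming the public BLIP-2 checkpoint name.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Salesforce/blip2-opt-2.7b")

# Outer (composite) config: describes the model as a whole, which is what the
# test above actually wants to know.
print(config.is_encoder_decoder)

# Decoder sub-config: only describes the language-model component (OPT here),
# so its flag cannot tell you whether the composite model is encoder-decoder.
print(config.get_text_config(decoder=True).is_encoder_decoder)
```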
Changed the title: …get_text_config and fix related bugs → …get_text_config no longer errors out if it finds multiple matches (and fix related bugs)
Force-pushed from 2d50d7f to 2b7f8dc
| """ | ||
| for model_class in self.all_generative_model_classes: | ||
| if any(model_name in model_class.__name__.lower() for model_name in ["imagegpt"]): | ||
| # To be more precise: technically we can run this test on all models that have `inputs_embeds` or |
tl;dr: moved the model-level overwrites here, so we can better track odd models, and updated the skip conditions
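A hedged sketch of the centralization pattern described here (the class and test names are illustrative, not the library's actual test mixin):

```python
# Hypothetical sketch of the pattern above: skip odd models by name inside the
# common test, instead of overriding the whole test in each model's test file.
import unittest


class GenerationTesterSketch(unittest.TestCase):
    all_generative_model_classes = ()  # each model's test class fills this in

    def test_generate_continue_from_inputs_embeds(self):
        for model_class in self.all_generative_model_classes:
            # `imagegpt` comes from the diff above; other odd models get appended
            # to this list, keeping all skips visible in one place.
            if any(name in model_class.__name__.lower() for name in ["imagegpt"]):
                continue  # skip models whose input embeddings don't follow the common pattern
            # ... actual test body would go here ...
```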
Changed the title: …get_text_config no longer errors out if it finds multiple matches (and fix related bugs) → …get_sub_config (in place of get_text_config) and fix related bugs
src/transformers/cache_utils.py (Outdated)

```diff
- config = config.get_text_config()
+ # We pull the decoder sub-config here to allow composite models to easily initialize the cache as
+ # `DynamicCache(config=model.config)`
+ config = config.get_sub_config(decoder=True)
```
e.g. in text-to-speech models, the decoder is not a text decoder :)
(it was working before because all TTS decoders had a common name in their decoder config)
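A hedged sketch of the lookup this comment is about; `get_sub_config` is the API proposed in this PR (it never shipped as-is), and the helper name and fallback branch are assumptions for illustration only:

```python
# Sketch only: `get_sub_config` is the API proposed in this PR, not a released
# transformers method; the fallback to `get_text_config` mirrors the pre-PR code.
def resolve_decoder_config(config):
    if hasattr(config, "get_sub_config"):
        # Ask for "the decoder sub-config, whatever its modality", so that e.g.
        # text-to-speech models hand back their audio decoder config here.
        return config.get_sub_config(decoder=True)
    # Pre-PR behavior: assumes the decoder is a text decoder, which only worked
    # because TTS decoder configs happened to match the hardcoded text names.
    return config.get_text_config(decoder=True)


# Usage (hypothetical): decoder_config = resolve_decoder_config(model.config)
```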
[For maintainers] Suggested jobs to run (before merge): run-slow: auto, blip_2, clvp, colqwen2, dia, gemma3, gemma3n, idefics2, idefics3, kyutai_speech_to_text, llava_next, llava_next_video, llava_onevision, mllama, moshi, paligemma
Closing, the root issue is solved in #40939 (the remaining fixes here were targeting old models).
What does this PR do?
This PR (🔴 = BC breaking):

1. Adds `config.get_sub_config()`. This function is an upgrade to `get_text_config`:
   a. Allows model-agnostic extraction of different modalities in a single function, future-proofing its uses;
   b. Uses heuristics to determine correct encoder/decoder and modality matches (as opposed to relying on hardcoded sub-config names that could match). These heuristics prevent us from having to manually update matching names as new models come out. They also nudge future models towards the same patterns (👉 standardization);
   c. Related to #40277 ("[generate] handle support for cache classes when num enc layers != num dec layers"): when pulling `encoder`/`decoder` attributes from older encoder-decoder configs, make sure to consider `config.attribute_map`;
   d. Sometimes we want any text config (e.g. resizing embeddings), other times we want any decoder modality (e.g. setting up the kv cache)... and sometimes we want specifically the text decoder. `get_text_config` didn't support this level of control, but `get_sub_config` does 🤗 (see the usage sketch below);
   e. (b.) + (c.) + (d.) = several bug fixes + allow us to remove a few overwrites.
2. Deprecates `get_text_config`, replacing its calls by `get_sub_config(modality="text", ...)`. This replacement is BC-breaking because of 1.b. However, in most cases, it fixes subtle bugs.
3. Fixes `SlidingWindowLayer`, the static one, on cross-attention cases (where no `cache_position` is provided).
4. Fixes the `is_encoder_decoder` check in `generate` tests.
5. Updates `test_generate_continue_from_inputs_embeds` and `test_past_key_values_format` (which allows us to delete skips/overwrites).

This PR is a follow-up to #40454 (an incomplete fix for the underlying issues, which ended up being quite complex).
Fixes #40644
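For concreteness, a usage sketch of the three levels of control described in 1.d; `get_sub_config` and its `modality`/`decoder` parameters are the API proposed by this (closed) PR, not the released library, and the checkpoint name is only an illustration:

```python
# Sketch of the proposed API only; `get_sub_config` was never merged, so this
# will not run against released transformers. The checkpoint name is illustrative.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("llava-hf/llava-1.5-7b-hf")  # a composite (vision + text) model

# Any text sub-config, e.g. when resizing token embeddings:
text_config = config.get_sub_config(modality="text")

# Any decoder sub-config, whatever its modality, e.g. when setting up the kv cache:
decoder_config = config.get_sub_config(decoder=True)

# Specifically the text decoder:
text_decoder_config = config.get_sub_config(modality="text", decoder=True)
```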