Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backbone kwargs in config #28784

Merged
merged 10 commits into from
Feb 14, 2024

Conversation

amyeroberts
Copy link
Collaborator

@amyeroberts amyeroberts commented Jan 30, 2024

What does this PR do?

This enables configuring the backbones through the config directly e.g. passing in out_indices to the backbone. This enables configuring a model's backbone when it's loaded from a pretrained checkpoint. At the moment, this is only possible when loading from a backbone_config.

Example:

model = MaskFormer.from_pretrained(
    "facebook/maskformer-swin-base-ade",
    backbone="facebook/maskformer-swin-large-ade",
    backbone_kwargs={"out_indices": (-2, -1)}
)

This is necessary to replace th timm code currently there for models like DETR e.g. here, which is often hard coded.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@amyeroberts amyeroberts force-pushed the backbone-kwargs-in-config branch from 04f3fb2 to cdc6e93 Compare January 31, 2024 19:22
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts amyeroberts force-pushed the backbone-kwargs-in-config branch 2 times, most recently from c55d529 to f3d272c Compare February 1, 2024 14:41
@amyeroberts amyeroberts marked this pull request as ready for review February 1, 2024 15:42
Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain the motivations behind this? IMO would should push users to use a config for the backbone, as we do for the quantization_config for example.
I feel like `backbone = AutoBackbone.from_config(config.backbone_config, arg1 =x, arg2=y) could be used in the potential refactor?

Would help if you have a linked PR that has the expected behaviour to see how this is gonna be called 🤗

@amyeroberts
Copy link
Collaborator Author

@ArthurZucker Sure!

We want to remove the hard-coded conditional logic inside of models like DETR and be able to fully configure the backbone's behaviour. The use case is: image I want to create a new model to train. Instead of the default architecture of DETR, I want to use a different timm backbone, and I want the backbone to return different feature maps from the default.

At the moment this isn't possible because:

  • use_timm_backbone behaviour is hard coded, so we can't load different timm backbones easily e.g. here i.e. I can't pass timm specific arguments to configure the backbone e.g. output_stride.
  • I can't configure the backbone by passing in out_indices to set the feature maps.

I completely agree that it would be better to have this all as one argument! For context: when the backbones were first added, there were four arguments:

  • backbone which specifies a checkpoint e.g. "facebook/detr-resnet-50"
  • use_timm_backbone - whether or not to load the backbone from timm.
  • use_pretrained_backbone which was ill-defined, but controlled the behaviour JUST for the timm backbones
  • backbone_config a model config which defines a backbone e.g. ResNet.

The backbone_config is just a model config, and used to load backbone models from config:

backbone = AutoBackbone.from_config(backbone_config)

and backbone is a checkpoint used to load from pretrained:

backbone = AutoBackbone.from_pretrained(backbone)

So backbone and backbone_config are mutually exclusive. And backbone_config is of type PretrainedConfig so can only configure transformers models.

Adding backbone_kwargs isn't ideal, but I think it's the simplest solution. The alternative is creating a new backbone config which contains everything. Because the old values are used in 100s of configs, we wouldn't be able to immediately deprecate, so it'd be a case of having code which ingested old config arguments.

If you want, I'm happy to code something up and we can decide which is best from the two :)

For more context, here's an example of the final step for removing timm in the modeling files: https://github.com/amyeroberts/transformers/pull/114/files#diff-bdc82cb85f491576a99a341adabaf42260eac9cd797d70d0a2c564b0d4ee2930

You can see how all we need is a single call backbone = load_backbone(config) in a modules init.

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, looks good to me then!

tests/utils/test_backbone_utils.py Outdated Show resolved Hide resolved
@amyeroberts amyeroberts force-pushed the backbone-kwargs-in-config branch from bba8b2c to 9fd1233 Compare February 13, 2024 19:56
@amyeroberts amyeroberts force-pushed the backbone-kwargs-in-config branch from 9fd1233 to 4877d32 Compare February 14, 2024 17:02
@amyeroberts amyeroberts merged commit 0199a48 into huggingface:main Feb 14, 2024
19 checks passed
sbucaille pushed a commit to sbucaille/transformers that referenced this pull request Feb 14, 2024
* Enable instantiating model with pretrained backbone weights

* Clarify pretrained import

* Use load_backbone instead

* Add backbone_kwargs to config

* Pass kwargs to constructors

* Fix up

* Input verification

* Add tests

* Tidy up

* Update tests/utils/test_backbone_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
itazap pushed a commit that referenced this pull request May 14, 2024
* Enable instantiating model with pretrained backbone weights

* Clarify pretrained import

* Use load_backbone instead

* Add backbone_kwargs to config

* Pass kwargs to constructors

* Fix up

* Input verification

* Add tests

* Tidy up

* Update tests/utils/test_backbone_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants