Add `FeaturePyramidBackbone` and port weights from `timm` for `ResNetBackbone` #1769

james77777778 · 2024-08-13T03:12:07Z

This PR introduces FeaturePyramidBackbone, a wrapper for Backbone that adds a pyramid_outputs property.
Any vision backbone that supports feature pyramids should subclass FeaturePyramidBackbone.

I modified ResNetBackbone by subclassing FeaturePyramidBackbone to include feature pyramid information.

Also, head_dtype is added to ResNetImageClassifier.

@divyashreepathihalli @mattdangerw @SamanehSaadat

EDITED:
See #1769 (comment) for updates

mattdangerw · 2024-08-13T21:29:38Z

@james77777778 I was thinking we would make FeaturePyramidBackbone a simple subclass of Backbone for most CV backbones.

So...

Backbone is basically just a functional model with from_preset.
FeaturePyramidBackbone extends Backbone with extra pyramid_outputs.
ResNetBackbone extends FeaturePyramidBackbone directly.

The goal is to keep the Backbone base class as clean as we can, now that we are venturing into multimodal models with a lot of different overall patterns. dir(bert_backbone) shouldn't have feature pyramid stuff in it. If we could move the token_embedding off the backbone and into a TextBackbone or similar without breaking compat, we probably would too.

Most CV models like ResNet, DenseNet, EfficientNet, etc, will be FeaturePyramidBackbones. But something like ViT can just subclass Backbone directly without needing the feature pyramid outputs.

WDYT?
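The hierarchy proposed above can be sketched in plain Python. This is only an illustration, not the library code: the classes are stand-ins without Keras, the level names `P2` to `P5` are hypothetical, and strings stand in for symbolic tensors.

```python
class Backbone:
    """Stand-in for the shared base class (a functional model plus from_preset)."""

    def __init__(self, name):
        self.name = name


class FeaturePyramidBackbone(Backbone):
    """Adds `pyramid_outputs`, a mapping from pyramid level to feature output."""

    @property
    def pyramid_outputs(self):
        return self._pyramid_outputs

    @pyramid_outputs.setter
    def pyramid_outputs(self, value):
        if not isinstance(value, dict):
            raise TypeError("`pyramid_outputs` must be a dict.")
        self._pyramid_outputs = value


class ResNetBackbone(FeaturePyramidBackbone):
    """A CV backbone that exposes one output per stage."""

    def __init__(self, num_stages=4):
        super().__init__(name="resnet")
        # Hypothetical level names; strings stand in for symbolic tensors.
        self.pyramid_outputs = {
            f"P{i + 2}": f"stage{i}_output" for i in range(num_stages)
        }


class BertBackbone(Backbone):
    """A text backbone that subclasses Backbone directly."""

    def __init__(self):
        super().__init__(name="bert")


resnet = ResNetBackbone()
bert = BertBackbone()
print(sorted(resnet.pyramid_outputs))  # ['P2', 'P3', 'P4', 'P5']
print("pyramid_outputs" in dir(bert))  # False
```

Because the property lives on the intermediate class, text backbones never see it, which is the point about keeping `dir(bert_backbone)` clean.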

james77777778 · 2024-08-14T06:39:23Z

@mattdangerw @divyashreepathihalli
Got it. That makes sense. Updated!

I have updated the PR to make ResNetBackbone compatible with timm. Additionally, the conversion logic has been added, similar to how it’s done for transformers.

Please refer to this colab for the numerical check:
https://colab.research.google.com/drive/1QnmNDiFYd56fsYoaUM46QRT4gF9G06fH?usp=sharing

Supported:

V1: resnet18.a1_in1k, resnet26.bt_in1k, resnet34.a1_in1k, resnet50.a1_in1k, resnet101.a1h_in1k, resnet152.a1h_in1k
V2: resnetv2_50.a1h_in1k, resnetv2_101.a1h_in1k
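As a rough sketch, the supported checkpoints can be turned into preset handles by joining them with the `hf://timm` prefix that this PR adds to `preset_utils.py`. The helper `timm_handle` and the `SUPPORTED` dict are hypothetical names used only for illustration, and the assumption is that a handle is simply the prefix, a slash, and the timm model name.

```python
# Prefix added to preset_utils.py in this PR.
TIMM_PREFIX = "hf://timm"

# Checkpoints listed as supported in the comment above.
SUPPORTED = {
    "v1": [
        "resnet18.a1_in1k",
        "resnet26.bt_in1k",
        "resnet34.a1_in1k",
        "resnet50.a1_in1k",
        "resnet101.a1h_in1k",
        "resnet152.a1h_in1k",
    ],
    "v2": [
        "resnetv2_50.a1h_in1k",
        "resnetv2_101.a1h_in1k",
    ],
}


def timm_handle(name):
    """Build a preset handle for a checkpoint in the timm org (hypothetical helper)."""
    return f"{TIMM_PREFIX}/{name}"


handles = [timm_handle(n) for names in SUPPORTED.values() for n in names]
for handle in handles:
    print(handle)
```

Assuming `from_preset` accepts such handles, loading would look like `ResNetBackbone.from_preset("hf://timm/resnet18.a1_in1k")`; the colab linked above is the reference for the numerical check.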

mattdangerw

Awesome, great work! Still a few comments to resolve, but it's very cool that we are building timm conversion into the library.

keras_nlp/src/models/feature_pyramid_backbone.py

keras_nlp/src/models/resnet/resnet_feature_pyramid_backbone.py

keras_nlp/src/models/resnet/resnet_feature_pyramid_backbone_test.py

keras_nlp/src/models/resnet/resnet_image_classifier.py

mattdangerw · 2024-08-14T18:59:23Z

keras_nlp/src/utils/timm/safetensor_utils.py

@@ -0,0 +1,68 @@
+# Copyright 2024 The KerasNLP Authors


Is this the same as the one for transformers? Can we use the same file? Can just leave it where it is for now and import it.

I have added a preset argument to make it compatible with timm.
Now we use the same file.

mattdangerw · 2024-08-14T19:03:47Z

keras_nlp/src/utils/preset_utils.py

 KAGGLE_PREFIX = "kaggle://"
 GS_PREFIX = "gs://"
 HF_PREFIX = "hf://"
+TIMM_PREFIX = "hf://timm"


I think we can merge like this for now to keep things moving, but we should probably allow converting timm models outside of the timm "org" on huggingface. We might need to start parsing the architecture/architectures values in the transformers/timm config.json. At least that seems like the best place to infer where things are loading from.

I updated the code to parse these values in order to determine the format.
It's a bit fragile. We might need better parsing in the future.

Let's merge for now and update later!
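The difference between the two detection strategies discussed in this thread can be sketched as follows. This is not the merged code: the function names are hypothetical, and it assumes that a timm `config.json` carries a string `architecture` key while a transformers `config.json` carries a list `architectures` key.

```python
import json

TIMM_PREFIX = "hf://timm"


def format_from_prefix(preset):
    """Prefix check: only recognizes checkpoints hosted under the timm org."""
    return "timm" if preset.startswith(TIMM_PREFIX) else "unknown"


def format_from_config(config):
    """Guess the checkpoint format from a parsed config.json (assumed keys)."""
    if "architectures" in config:  # Assumed transformers convention.
        return "transformers"
    if "architecture" in config:  # Assumed timm convention.
        return "timm"
    return "unknown"


# Minimal stand-ins for downloaded config.json files.
timm_config = json.loads('{"architecture": "resnet18", "num_classes": 1000}')
hf_config = json.loads('{"architectures": ["BertModel"], "model_type": "bert"}')

# A timm checkpoint hosted outside the timm org (hypothetical handle).
preset = "hf://some-user/resnet18-finetuned"
print(format_from_prefix(preset))       # unknown
print(format_from_config(timm_config))  # timm
print(format_from_config(hf_config))    # transformers
```

Key-based detection is still fragile, as noted above: a config that defines neither key, or both, needs an explicit fallback.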

divyashreepathihalli · 2024-08-14T21:05:55Z

Looks really good! It's awesome to have built-in timm conversion! LGTM!

mattdangerw · 2024-08-15T20:30:02Z

Thanks!

…Backbone` (keras-team#1769) * Add FeaturePyramidBackbone and update ResNetBackbone * Simplify the implementation * Fix CI * Make ResNetBackbone compatible with timm and add FeaturePyramidBackbone * Add conversion implementation * Update docstrings * Address comments



james77777778 added 3 commits August 13, 2024 11:02

Add FeaturePyramidBackbone and update ResNetBackbone

4a705a1

Simplify the implementation

a9c31cb

Fix CI

fbf47c3

james77777778 force-pushed the add-feature-pyramid branch from aae0efc to fbf47c3 Compare August 13, 2024 05:00

james77777778 requested review from SamanehSaadat, divyashreepathihalli and mattdangerw August 13, 2024 05:23

james77777778 added 2 commits August 14, 2024 09:08

Make ResNetBackbone compatible with timm and add FeaturePyramidBackbone

c6159fa

Add conversion implementation

e4d508f

james77777778 force-pushed the add-feature-pyramid branch from 9568ba6 to e4d508f Compare August 14, 2024 06:33

james77777778 changed the title ~~Add FeaturePyramidBackbone and ResNetFeaturePyramidBackbone~~ Add FeaturePyramidBackbone and port weights from timm for ResNetBackbone Aug 14, 2024

Update docstrings

0dab201

james77777778 force-pushed the add-feature-pyramid branch from f44a0e2 to 0dab201 Compare August 14, 2024 09:21

james77777778 mentioned this pull request Aug 14, 2024

Add the ResNet_vd backbone #1766

Merged

mattdangerw reviewed Aug 14, 2024


divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Aug 14, 2024

kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 14, 2024

Address comments

075f9cb

james77777778 force-pushed the add-feature-pyramid branch from ea08ec6 to 075f9cb Compare August 15, 2024 01:56

divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Aug 15, 2024

kokoro-team removed the kokoro:force-run Runs Tests on GPU label Aug 15, 2024

divyashreepathihalli approved these changes Aug 15, 2024


james77777778 requested a review from mattdangerw August 15, 2024 02:17

mattdangerw merged commit 00ab4d5 into keras-team:keras-hub Aug 15, 2024

james77777778 deleted the add-feature-pyramid branch August 16, 2024 01:45

mattdangerw mentioned this pull request Aug 16, 2024

Pkgoogle/efficient net migration #1778

Merged
