Add Seaformer model #21819
Hi @inderpreetsingh01, thank you! You can ping me once the PR is ready to be reviewed. You can follow the official guidelines to learn how to prepare the configuration, image processor and modeling files to replicate the original work, so that forward propagating an image through the HF and original implementations yields the same results.
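As a rough illustration of the "same results" check mentioned above, here is a minimal, framework-free sketch. The helper names `max_abs_diff` and `outputs_match`, the sample values, and the 1e-3 tolerance are all assumptions made for this example; with real model outputs one would more likely compare tensors directly, for instance with `torch.allclose`.

```python
from typing import Sequence


def max_abs_diff(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Return the largest element-wise absolute difference between two flat outputs."""
    if len(reference) != len(candidate):
        raise ValueError(f"Size mismatch: {len(reference)} vs {len(candidate)} values")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(
    reference: Sequence[float], candidate: Sequence[float], atol: float = 1e-3
) -> bool:
    """True when every value of the converted model is within `atol` of the original."""
    return max_abs_diff(reference, candidate) <= atol


# Hypothetical flattened logits from the original and the converted model.
original_logits = [0.1234, -1.5000, 2.7182]
converted_logits = [0.1235, -1.5001, 2.7180]

print(outputs_match(original_logits, converted_logits))
```

Checking the maximum absolute difference, rather than exact equality, tolerates the small floating-point drift that usually appears when weights are moved between implementations.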
The PR is currently just initialized from SegFormer; I can do a review once the SeaFormer model is implemented.
Hi @alaradirik, I have added the Seaformer implementation in the modeling file and updated the conversion and configuration scripts. I ran a forward pass in a notebook and the output is the same as that of the original Seaformer model. Can you please review it and let me know of any changes required? I have yet to do the testing part.
Thank you for adding this model!
I think the PR is in good shape overall, but there are some issues that need to be addressed. To summarize the main points:
- SeaformerForImageClassification is not implemented and this is causing modeling test errors. Could you implement this class or remove all references to it?
- We use self-descriptive variable and layer names, as well as type casting for all model classes and functions
Once you're done, you can run the following commands to ensure your PR passes all CI tests:
make style
make quality
make repo-consistency
pytest tests/models/seaformer/test_image_processing_seaformer.py
RUN_SLOW=True pytest tests/models/seaformer/test_modeling_seaformer.py
I can also double check the model output and the post-processing results if you upload a converted model to the hub and let me know.
Thanks again :)
layers = []
if expand_ratio != 1:
    # pw
Could be expanded to be more descriptive
Hi @alaradirik, thanks for the detailed review :) I have uploaded the converted model to the hub at Inderpreet01/seaformer-semantic-segmentation-large. I will work on your comments and update the PR.
Thank you! Feel free to ping me when you'd like me to do the final review |
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Hi @alaradirik, thanks for your response, removing I have looked at maskformer and segformer but not able to figure this out. |
Actually, this test is skipped for the SegFormer model, which also initializes weights normally.
Hi @inderpreetsingh01, sorry for my late reply, I was off due to moving. You can overwrite the test by creating a test with the same name - |
Hi @alaradirik, thanks for the reply. Where should I create this test with the same name?
Hi @alaradirik, can you please do the final review? Thanks.
Hi @inderpreetsingh01, thanks for working on this, the PR looks great overall.
There are just a few issues that need to be addressed before merging the PR. Most of these issues are minor (commented out / left-over code, non-descriptive variable names, etc.). Other than these, we favor accessing parameters used across multiple model subclasses via the config class attributes rather than passing each parameter as a separate argument.
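To illustrate the config-attribute convention described above, here is a small sketch. `ToyConfig`, `ToyAttention`, `ToyLayer` and their fields are hypothetical names invented for this example, not the actual Seaformer classes; the point is only that submodules receive the config object and read shared parameters from it instead of taking each value as a separate argument.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical configuration holding parameters shared across submodules."""

    hidden_size: int = 64
    num_attention_heads: int = 4
    drop_path_rate: float = 0.1


class ToyAttention:
    def __init__(self, config: ToyConfig) -> None:
        if config.hidden_size % config.num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")
        # Shared parameters are read from the config, not passed one by one.
        self.num_attention_heads = config.num_attention_heads
        self.head_dim = config.hidden_size // config.num_attention_heads


class ToyLayer:
    def __init__(self, config: ToyConfig) -> None:
        # The same config object is handed down to every submodule.
        self.attention = ToyAttention(config)
        self.drop_path_rate = config.drop_path_rate


layer = ToyLayer(ToyConfig(hidden_size=128, num_attention_heads=8))
print(layer.attention.head_dim)
```

With this pattern, adding a new shared hyperparameter only touches the config class and the module that uses it, rather than every constructor signature in between.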
Additionally, if a model cannot pass a common modeling test because the test does not cover its specific approach, you'd need to overwrite it by creating a method of the same name within the test_modeling_seaformer.py script.
I'm adding a core maintainer for the final review, looking forward to merging this :)
# # verify the first attentions (first block, first layer)
# expected_seq_len = (self.model_tester.image_size // 4) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
# self.assertListEqual(
#     list(attentions[0].shape[-3:]),
#     [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
# )

# # verify the last attentions (last block, last layer)
# expected_seq_len = (self.model_tester.image_size // 32) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (32 * self.model_tester.sr_ratios[-1])) ** 2
# self.assertListEqual(
#     list(attentions[-1].shape[-3:]),
#     [self.model_tester.num_attention_heads[-1], expected_seq_len, expected_reduced_seq_len],
Suggested change: delete this commented-out block.
# # verify the first attentions (first block, first layer)
# expected_seq_len = (self.model_tester.image_size // 4) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
# self.assertListEqual(
#     list(self_attentions[0].shape[-3:]),
#     [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
# )
Suggested change: delete this commented-out block.
    image_scale=(512, 512), keep_ratio=False, align=False, do_random_crop=False
)
model = SeaformerForSemanticSegmentation.from_pretrained(
    "Inderpreet01/seaformer-semantic-segmentation-large"
This should be updated to the organization repo. It might take a while for the university to create an organization on the hub, so we will most likely put the checkpoints under the huggingface org for a while:
- "Inderpreet01/seaformer-semantic-segmentation-large"
+ "huggingface/seaformer-semantic-segmentation-large"
@@ -508,18 +508,21 @@ class CopyClass(base_class):
         self.assertLessEqual(max_diff, 1e-3, msg=f"{key} not identical")

     def test_initialization(self):
Please do not edit the common test and modeling scripts; these changes should be reverted.
If there is a special case for the added model, you can overwrite this test by creating a test_initialization method within test_modeling_seaformer.py.
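A minimal sketch of what such an override could look like, using only the standard `unittest` module. `CommonModelTesterMixin`, `ToySeaformerModelTest`, the parameter names and the accepted value ranges are stand-ins invented for this example; the real common test mixin in the repository is more elaborate, and the real check would inspect actual model parameters.

```python
import unittest


class CommonModelTesterMixin:
    """Stand-in for the shared test mixin, which is left untouched."""

    def get_initialized_parameter_means(self):
        raise NotImplementedError

    def test_initialization(self):
        # Generic check: every parameter mean must be exactly 0.0 or 1.0.
        for name, mean in self.get_initialized_parameter_means().items():
            self.assertIn(mean, (0.0, 1.0), msg=f"{name} not properly initialized")


class ToySeaformerModelTest(CommonModelTesterMixin, unittest.TestCase):
    """Model-specific test class, as it might live in the model's own test file."""

    def get_initialized_parameter_means(self):
        # Hypothetical values; the generic check above would reject 0.02.
        return {"conv.weight": 0.02, "norm.weight": 1.0}

    # Same name as the mixin method, so this version is the one that runs.
    def test_initialization(self):
        for name, mean in self.get_initialized_parameter_means().items():
            self.assertTrue(-1.0 <= mean <= 1.0, msg=f"{name} not properly initialized")


if __name__ == "__main__":
    unittest.main()
```

Because the subclass defines a method with the same name, Python's method resolution order picks the model-specific version, and the shared mixin stays unchanged for every other model.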
@inderpreetsingh01 Thanks for adding this model! Ping me when the PR is ready for review (once all of @alaradirik's comments have been addressed and tests are passing).
@alaradirik, thanks for the review. @amyeroberts, sure, I will ping you once the model is ready.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What does this PR do?
Fixes #21668
Seaformer is a two-branch architecture with a Squeeze-enhanced Axial Transformer for semantic segmentation on mobile devices.
Supersedes #21774
Who can review?
@alaradirik, thanks for offering to help with this PR. Please let me know about any changes required.