Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Seaformer model #21819

Closed

Conversation

inderpreetsingh01
Copy link

@inderpreetsingh01 inderpreetsingh01 commented Feb 27, 2023

What does this PR do?

Fixes #21668
Seaformer is a two-branch architecture with Squeeze enhanced Axial Transformer for semantic segmentation on mobile devices.


Supersedes #21774

Before submitting

Who can review?

@alaradirik thanks for offering help with this PR, please let me know about any changes required.

@inderpreetsingh01 inderpreetsingh01 mentioned this pull request Feb 27, 2023
5 tasks
@alaradirik
Copy link
Contributor

Hi @inderpreetsingh01, thank you! You can ping me once the PR is ready is to be reviewed.

You can follow the official guidelines to learn how to prepare the configuration, image processor and modeling files to replicate the original work such that forward propagating an image through the HF and original implementation yields the same results.

@alaradirik
Copy link
Contributor

What does this PR do?

Fixes #21668 Seaformer is a two-branch architecture with Squeeze enhanced Axial Transformer for semantic segmentation on mobile devices. Supersedes #21774

Before submitting

Who can review?

@alaradirik thanks for offering help with this PR, please let me know about any changes required.

The PR is just initialized using SegFormer, I can do a review once the SeaFormer model is implemented.

@inderpreetsingh01
Copy link
Author

Hi @alaradirik, I have added seaformer implementation in modeling file and updated the conversion and configuration scripts, I have ran a forward pass in notebook and output is same as the original seaformer model. Can you please review it and let me know of any changes required? I am yet to do the testing part.

Copy link
Contributor

@alaradirik alaradirik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding this model!

I think the PR is in good shape overall but there are some issues that needs to be addressed. To summarize the main points:

  • SeaformerForImageClassification is not implemented and this is causing modeling test errors. Could you implement this class or remove all references to it?
  • We use self-descriptive variable and layer names, as well as type casting for all model classes and functions

Once you're done, you can run the following commands to ensure your PR passes all CI tests:

make style
make quality
make repo-consistency

pytest tests/models/seaformer/test_image_processing_seaformer.py
RUN_SLOW=True pytest tests/models/seaformer/test_modeling_seaformer.py

I can also double check the model output and the post-processing results if you upload a converted model to the hub and let me know.

Thanks again :)

README.md Outdated Show resolved Hide resolved
README_es.md Outdated Show resolved Hide resolved
README_hd.md Outdated Show resolved Hide resolved
README_ja.md Outdated Show resolved Hide resolved
README_ko.md Outdated Show resolved Hide resolved

layers = []
if expand_ratio != 1:
# pw
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be expanded to be more descriptive

src/transformers/models/seaformer/modeling_seaformer.py Outdated Show resolved Hide resolved
src/transformers/models/seaformer/modeling_seaformer.py Outdated Show resolved Hide resolved
src/transformers/models/seaformer/modeling_seaformer.py Outdated Show resolved Hide resolved
src/transformers/models/seaformer/modeling_seaformer.py Outdated Show resolved Hide resolved
@inderpreetsingh01
Copy link
Author

Hi @alaradirik thanks for the detailed review :) I have uploaded the converted model to the hub here Inderpreet01/seaformer-semantic-segmentation-large, will work on your comments and update the pr.
Thanks

@alaradirik
Copy link
Contributor

Hi @alaradirik thanks for the detailed review :) I have uploaded the converted model to the hub here Inderpreet01/seaformer-semantic-segmentation-large, will work on your comments and update the pr. Thanks

Thank you! Feel free to ping me when you'd like me to do the final review

@inderpreetsingh01
Copy link
Author

Hi @alaradirik, thanks for your response, removing num_labels from config has resolved that testcase, can you please help with this test case as well
SeaformerModelTest::test_initialization - AssertionError: -6.169999778649071e-06 not found in [0.0, 1.0]
I have normally initialized the parameters so negative values are expected.

I have looked at maskformer and segformer but not able to figure this out.

@inderpreetsingh01
Copy link
Author

inderpreetsingh01 commented Apr 11, 2023

actually this test is getting skipped in segformer model which also initializes weights normally.

@alaradirik
Copy link
Contributor

actually this test is getting skipped in segformer model which also initializes weights normally.

Hi @inderpreetsingh01, sorry for my late reply, I was off due to moving. You can overwrite the test by creating a test with the same name - test_initialization - as the weight initialization is inline with the original model. You can take a look at common test functions defined over here to see what this test does.

@inderpreetsingh01
Copy link
Author

Hi @alaradirik thanks for reply, where should i create this test with the same name?

@inderpreetsingh01
Copy link
Author

Hi @alaradirik can you please do the final review? thanks

Copy link
Contributor

@alaradirik alaradirik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @inderpreetsingh01, thanks for working on this, the PR looks great overall

There are just a few issues that need to be addressed before merging the PR. Most of these issues are minor (commented out / left-over code, non-descriptive variable names, etc.). Other than these, we favor accessing parameters used across multiple model subclasses via the config class attributes rather than passing each parameter as a separate argument.
Additionally, if a model can not pass a common modeling test due to not covering the specific approach, you'd need to overwrite it by creating a method of the same name within the test_modeling_seaformer.py script.

I'm adding a core maintainer for the final review, looking forward to merging this :)

docs/source/en/_toctree.yml Show resolved Hide resolved
docs/source/en/_toctree.yml Outdated Show resolved Hide resolved
docs/source/en/model_doc/seaformer.mdx Outdated Show resolved Hide resolved
src/transformers/activations.py Outdated Show resolved Hide resolved
src/transformers/activations.py Outdated Show resolved Hide resolved
src/transformers/models/seaformer/modeling_seaformer.py Outdated Show resolved Hide resolved
Comment on lines +266 to +279
# # verify the first attentions (first block, first layer)
# expected_seq_len = (self.model_tester.image_size // 4) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
# self.assertListEqual(
# list(attentions[0].shape[-3:]),
# [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
# )

# # verify the last attentions (last block, last layer)
# expected_seq_len = (self.model_tester.image_size // 32) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (32 * self.model_tester.sr_ratios[-1])) ** 2
# self.assertListEqual(
# list(attentions[-1].shape[-3:]),
# [self.model_tester.num_attention_heads[-1], expected_seq_len, expected_reduced_seq_len],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# # verify the first attentions (first block, first layer)
# expected_seq_len = (self.model_tester.image_size // 4) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
# self.assertListEqual(
# list(attentions[0].shape[-3:]),
# [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
# )
# # verify the last attentions (last block, last layer)
# expected_seq_len = (self.model_tester.image_size // 32) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (32 * self.model_tester.sr_ratios[-1])) ** 2
# self.assertListEqual(
# list(attentions[-1].shape[-3:]),
# [self.model_tester.num_attention_heads[-1], expected_seq_len, expected_reduced_seq_len],

Comment on lines +297 to +303
# # verify the first attentions (first block, first layer)
# expected_seq_len = (self.model_tester.image_size // 4) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
# self.assertListEqual(
# list(self_attentions[0].shape[-3:]),
# [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
# )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# # verify the first attentions (first block, first layer)
# expected_seq_len = (self.model_tester.image_size // 4) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
# self.assertListEqual(
# list(self_attentions[0].shape[-3:]),
# [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
# )

image_scale=(512, 512), keep_ratio=False, align=False, do_random_crop=False
)
model = SeaformerForSemanticSegmentation.from_pretrained(
"Inderpreet01/seaformer-semantic-segmentation-large"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be updated to the organization repo, it might take a while for the university to create an organization on the hub so we will most likely put the checkpoints under huggingface org for a while:

Suggested change
"Inderpreet01/seaformer-semantic-segmentation-large"
"huggingface/seaformer-semantic-segmentation-large"

@@ -508,18 +508,21 @@ class CopyClass(base_class):
self.assertLessEqual(max_diff, 1e-3, msg=f"{key} not identical")

def test_initialization(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not edit common test and modeling scripts, the changes should be reverted.

If there is a special case for the added model, you can overwrite this test by creating a test_initialization method within test_modeling_seaformer.py

@alaradirik alaradirik changed the title [WIP] Add Seaformer model Add Seaformer model Apr 19, 2023
@alaradirik alaradirik requested a review from amyeroberts April 19, 2023 15:05
@amyeroberts
Copy link
Collaborator

@inderpreetsingh01 Thanks for adding this model! Ping me when the PR is ready for review (once all of @alaradirik's comments have been addressed and tests are passing).

@inderpreetsingh01
Copy link
Author

@alaradirik thanks for the review, @amyeroberts sure will ping you once model is ready

inderpreetsingh01 and others added 12 commits April 22, 2023 10:45
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this May 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add SeaFormer model
4 participants