Add Seaformer model #21819

inderpreetsingh01 · 2023-02-27T11:01:29Z

What does this PR do?

Fixes #21668
Seaformer is a two-branch architecture with Squeeze enhanced Axial Transformer for semantic segmentation on mobile devices.

Supersedes #21774

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Add SeaFormer model #21668
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@alaradirik thanks for offering help with this PR, please let me know about any changes required.

alaradirik · 2023-02-27T11:34:43Z

Hi @inderpreetsingh01, thank you! You can ping me once the PR is ready to be reviewed.

You can follow the official guidelines to learn how to prepare the configuration, image processor and modeling files to replicate the original work such that forward propagating an image through the HF and original implementation yields the same results.
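A minimal sketch of the kind of parity check described above, written in plain Python so it runs without dependencies. The two lists stand in for flattened logits produced by the original and the ported model for the same image; the helper names `max_abs_diff` and `outputs_match` and the `1e-3` tolerance are illustrative assumptions, not part of the guidelines. With real tensors, `torch.allclose(hf_logits, original_logits, atol=1e-3)` plays the same role.

```python
from typing import Sequence


def max_abs_diff(reference: Sequence[float], ported: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flattened outputs."""
    if len(reference) != len(ported):
        raise ValueError(f"shape mismatch: {len(reference)} vs {len(ported)} values")
    return max((abs(r - p) for r, p in zip(reference, ported)), default=0.0)


def outputs_match(reference: Sequence[float], ported: Sequence[float], atol: float = 1e-3) -> bool:
    """True when every element of the ported output is within atol of the reference."""
    return max_abs_diff(reference, ported) <= atol


# Placeholder values (assumption): flattened logits for the same input image.
original_logits = [0.1250, -1.5000, 3.2000, 0.0000]
ported_logits = [0.1251, -1.5002, 3.1999, 0.0001]

print("max abs diff:", max_abs_diff(original_logits, ported_logits))
print("outputs match:", outputs_match(original_logits, ported_logits))
```

Reporting the maximum difference rather than only a pass/fail flag makes it easier to tell a small numerical drift from a genuinely mis-converted weight.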

alaradirik · 2023-02-27T11:36:11Z

The PR is just initialized using SegFormer, I can do a review once the SeaFormer model is implemented.

inderpreetsingh01 · 2023-03-18T07:49:04Z

Hi @alaradirik, I have added the SeaFormer implementation to the modeling file and updated the conversion and configuration scripts. I ran a forward pass in a notebook, and the output is the same as that of the original SeaFormer model. Could you please review it and let me know of any changes required? I have yet to do the testing part.

alaradirik

Thank you for adding this model!

I think the PR is in good shape overall, but there are some issues that need to be addressed. To summarize the main points:

SeaformerForImageClassification is not implemented and this is causing modeling test errors. Could you implement this class or remove all references to it?
We use self-descriptive variable and layer names, as well as type casting for all model classes and functions

Once you're done, you can run the following commands to ensure your PR passes all CI tests:

make style
make quality
make repo-consistency

pytest tests/models/seaformer/test_image_processing_seaformer.py
RUN_SLOW=True pytest tests/models/seaformer/test_modeling_seaformer.py

I can also double check the model output and the post-processing results if you upload a converted model to the hub and let me know.

Thanks again :)

README.md

README_es.md

README_hd.md

README_ja.md

README_ko.md

alaradirik · 2023-03-22T09:37:29Z

src/transformers/models/seaformer/modeling_seaformer.py

+
+        layers = []
+        if expand_ratio != 1:
+            # pw


Could be expanded to be more descriptive

src/transformers/models/seaformer/modeling_seaformer.py

inderpreetsingh01 · 2023-03-23T09:07:51Z

Hi @alaradirik, thanks for the detailed review :) I have uploaded the converted model to the hub as Inderpreet01/seaformer-semantic-segmentation-large. I will work on your comments and update the PR.
Thanks

alaradirik · 2023-03-23T09:16:50Z

Thank you! Feel free to ping me when you'd like me to do the final review

inderpreetsingh01 · 2023-04-10T11:05:21Z

Hi @alaradirik, thanks for your response. Removing num_labels from the config resolved that test case. Could you please help with this test case as well?
SeaformerModelTest::test_initialization - AssertionError: -6.169999778649071e-06 not found in [0.0, 1.0]
I initialize the parameters from a normal distribution, so negative values are expected.

I have looked at MaskFormer and SegFormer but have not been able to figure this out.

inderpreetsingh01 · 2023-04-11T07:31:16Z

Actually, this test is skipped for the SegFormer model, which also initializes its weights from a normal distribution.

alaradirik · 2023-04-17T11:30:56Z

Hi @inderpreetsingh01, sorry for my late reply, I was off due to moving. You can overwrite the test by creating a test with the same name, test_initialization, since the weight initialization is in line with the original model. You can take a look at the common test functions (tests/test_modeling_common.py) to see what this test does.
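As a rough illustration of that override mechanism, here is a self-contained sketch using only the standard library. `CommonTestsSketch` is a hypothetical stand-in for the shared test mixin, not the real class from the common test file, and the parameter values are generated with `random.gauss` rather than taken from a real model. The tolerance used in the override is likewise an assumption.

```python
import random
import unittest


class CommonTestsSketch:
    """Hypothetical stand-in for the shared test mixin; not the real transformers class."""

    def parameter_means(self) -> dict:
        raise NotImplementedError

    def test_initialization(self):
        # Strict check: every parameter mean must be exactly 0.0 or 1.0 after rounding.
        for name, mean in self.parameter_means().items():
            self.assertIn(round(mean, 9), [0.0, 1.0], msg=f"{name} seems not properly initialized")


class SeaformerLikeModelTest(CommonTestsSketch, unittest.TestCase):
    def parameter_means(self) -> dict:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
        conv_weight = [rng.gauss(0.0, 0.02) for _ in range(1000)]
        return {
            "conv.weight": sum(conv_weight) / len(conv_weight),  # small but nonzero mean
            "norm.weight": 1.0,
            "norm.bias": 0.0,
        }

    # Same method name as in the mixin, so this version replaces the strict check.
    def test_initialization(self):
        for name, mean in self.parameter_means().items():
            close_to_zero_or_one = abs(mean) < 1e-2 or abs(mean - 1.0) < 1e-2
            self.assertTrue(close_to_zero_or_one, msg=f"{name} mean {mean} is out of range")


# Run only the model-specific test case; the overridden method is the one collected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SeaformerLikeModelTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the override lives in the model's own test class, the shared mixin stays untouched and other models keep the strict check.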

inderpreetsingh01 · 2023-04-17T12:31:30Z

Hi @alaradirik, thanks for the reply. Where should I create this test with the same name?

inderpreetsingh01 · 2023-04-18T05:23:39Z

Hi @alaradirik can you please do the final review? thanks

alaradirik

Hi @inderpreetsingh01, thanks for working on this, the PR looks great overall

There are just a few issues that need to be addressed before merging the PR. Most of these issues are minor (commented out / left-over code, non-descriptive variable names, etc.). Other than these, we favor accessing parameters used across multiple model subclasses via the config class attributes rather than passing each parameter as a separate argument.
Additionally, if a model cannot pass a common modeling test because the test does not cover the model's specific approach, you'd need to overwrite it by creating a method of the same name within the test_modeling_seaformer.py script.
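To illustrate the point about config attributes, a small dependency-free sketch follows. `ToyConfig`, `ToyAttention`, `ToyMlp`, and `ToyLayer` are hypothetical names, and the attributes are made up for the example rather than taken from the Seaformer configuration; real model code would subclass `PretrainedConfig` and `torch.nn.Module`.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical config holding hyperparameters shared by several submodules."""

    hidden_size: int = 64
    num_attention_heads: int = 4
    mlp_ratio: int = 2


class ToyAttention:
    def __init__(self, config: ToyConfig) -> None:
        # Each submodule reads what it needs from the config.
        self.num_heads = config.num_attention_heads
        self.head_dim = config.hidden_size // config.num_attention_heads


class ToyMlp:
    def __init__(self, config: ToyConfig) -> None:
        self.intermediate_size = config.hidden_size * config.mlp_ratio


class ToyLayer:
    def __init__(self, config: ToyConfig) -> None:
        # The config is forwarded as a single argument instead of
        # hidden_size, num_attention_heads, mlp_ratio, ... one by one.
        self.attention = ToyAttention(config)
        self.mlp = ToyMlp(config)


layer = ToyLayer(ToyConfig(hidden_size=128))
print(layer.attention.head_dim, layer.mlp.intermediate_size)
```

With this layout, adding a hyperparameter only touches the config and the submodule that reads it, not every constructor in between.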

I'm adding a core maintainer for the final review, looking forward to merging this :)

docs/source/en/_toctree.yml

docs/source/en/model_doc/seaformer.mdx

src/transformers/activations.py

src/transformers/models/seaformer/modeling_seaformer.py

alaradirik · 2023-04-19T14:46:59Z

tests/models/seaformer/test_modeling_seaformer.py

+            # # verify the first attentions (first block, first layer)
+            # expected_seq_len = (self.model_tester.image_size // 4) ** 2
+            # expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
+            # self.assertListEqual(
+            #     list(attentions[0].shape[-3:]),
+            #     [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
+            # )
+
+            # # verify the last attentions (last block, last layer)
+            # expected_seq_len = (self.model_tester.image_size // 32) ** 2
+            # expected_reduced_seq_len = (self.model_tester.image_size // (32 * self.model_tester.sr_ratios[-1])) ** 2
+            # self.assertListEqual(
+            #     list(attentions[-1].shape[-3:]),
+            #     [self.model_tester.num_attention_heads[-1], expected_seq_len, expected_reduced_seq_len],


Suggested change

# # verify the first attentions (first block, first layer)
# expected_seq_len = (self.model_tester.image_size // 4) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
# self.assertListEqual(
#     list(attentions[0].shape[-3:]),
#     [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
# )

# # verify the last attentions (last block, last layer)
# expected_seq_len = (self.model_tester.image_size // 32) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (32 * self.model_tester.sr_ratios[-1])) ** 2
# self.assertListEqual(
#     list(attentions[-1].shape[-3:]),
#     [self.model_tester.num_attention_heads[-1], expected_seq_len, expected_reduced_seq_len],

alaradirik · 2023-04-19T14:47:12Z

tests/models/seaformer/test_modeling_seaformer.py

+            # # verify the first attentions (first block, first layer)
+            # expected_seq_len = (self.model_tester.image_size // 4) ** 2
+            # expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
+            # self.assertListEqual(
+            #     list(self_attentions[0].shape[-3:]),
+            #     [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
+            # )


Suggested change

# # verify the first attentions (first block, first layer)
# expected_seq_len = (self.model_tester.image_size // 4) ** 2
# expected_reduced_seq_len = (self.model_tester.image_size // (4 * self.model_tester.sr_ratios[0])) ** 2
# self.assertListEqual(
#     list(self_attentions[0].shape[-3:]),
#     [self.model_tester.num_attention_heads[0], expected_seq_len, expected_reduced_seq_len],
# )

alaradirik · 2023-04-19T14:48:41Z

tests/models/seaformer/test_modeling_seaformer.py

+            image_scale=(512, 512), keep_ratio=False, align=False, do_random_crop=False
+        )
+        model = SeaformerForSemanticSegmentation.from_pretrained(
+            "Inderpreet01/seaformer-semantic-segmentation-large"


This should be updated to the organization repo. It might take a while for the university to create an organization on the hub, so we will most likely put the checkpoints under the huggingface org for a while:

Suggested change

- "Inderpreet01/seaformer-semantic-segmentation-large"
+ "huggingface/seaformer-semantic-segmentation-large"

alaradirik · 2023-04-19T14:49:33Z

tests/test_modeling_common.py

@@ -508,18 +508,21 @@ class CopyClass(base_class):
                    self.assertLessEqual(max_diff, 1e-3, msg=f"{key} not identical")

    def test_initialization(self):


Please do not edit common test and modeling scripts, the changes should be reverted.

If there is a special case for the added model, you can overwrite this test by creating a test_initialization method within test_modeling_seaformer.py

amyeroberts · 2023-04-21T14:52:25Z

@inderpreetsingh01 Thanks for adding this model! Ping me when the PR is ready for review (once all of @alaradirik's comments have been addressed and tests are passing).

inderpreetsingh01 · 2023-04-22T05:08:21Z

@alaradirik thanks for the review. @amyeroberts sure, I will ping you once the model is ready.

github-actions · 2023-05-16T15:02:49Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

inderpreetsingh01 added 2 commits February 27, 2023 15:47

Add model

5240521

doc and tests

b2f4b63

inderpreetsingh01 mentioned this pull request Feb 27, 2023

[WIP] Add Seaformer #21774

Closed

5 tasks

inderpreetsingh01 added 2 commits March 18, 2023 13:02

Merge remote-tracking branch 'upstream/main' into add_seaformer_model

8701ff1

updated modeling, configuration and conversion script

712d5b3

alaradirik reviewed Mar 22, 2023

inderpreetsingh01 and others added 19 commits March 23, 2023 15:56

updated readme files

6e36028

Merge remote-tracking branch 'upstream/main' into add_seaformer_model

0bfcffa

Merge remote-tracking branch 'upstream/main' into add_seaformer_model

8f6a8df

updated docs and __init__ file

77a78f3

Merge remote-tracking branch 'upstream/main' into add_seaformer_model

c031a6a

updated configuration removed feature_extraction file

8d1a3ed

removed feature extraction references and added push_to_hub

634d648

updated image_processing_seaformer to remove backward compatibility p…

dc511a8

…arts

Update src/transformers/models/seaformer/modeling_seaformer.py

6fe19fb

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

test

ac6ef94

Merge branch 'add_seaformer_model' of https://github.com/inderpreetsi…

b8b01ae

…ngh01/transformers into add_seaformer_model

added type casting

848f774

updated layer names, added function to activations and other changes

061b1eb

doc and tests

bb71f87

updated modeling, configuration and conversion script

4906083

updated readme files

ff2d337

updated configuration removed feature_extraction file

5d764e9

code formatting and tests

b2c9c78

changes

6b06aea

inderpreetsingh01 added 2 commits April 10, 2023 16:28

test changes

e76039f

Merge branch 'add_seaformer_model' of https://github.com/inderpreetsi…

e2250e4

…ngh01/transformers into add_seaformer_model

inderpreetsingh01 and others added 4 commits April 17, 2023 21:50

Merge branch 'huggingface:main' into add_seaformer_model

ec7726d

overwritten test_initialization

45073dd

quality

753f68b

Merge branch 'huggingface:main' into add_seaformer_model

54a099c

alaradirik reviewed Apr 19, 2023

alaradirik changed the title ~~[WIP] Add Seaformer model~~ Add Seaformer model Apr 19, 2023

alaradirik requested a review from amyeroberts April 19, 2023 15:05

inderpreetsingh01 and others added 12 commits April 22, 2023 10:45

Update docs/source/en/_toctree.yml

e42c756

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update docs/source/en/model_doc/seaformer.mdx

f9b7beb

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update src/transformers/activations.py

7b87663

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update src/transformers/activations.py

982f08d

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update src/transformers/activations.py

0795036

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update src/transformers/models/seaformer/__init__.py

14d32d6

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update src/transformers/models/seaformer/__init__.py

c5f0d39

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update src/transformers/models/seaformer/__init__.py

d5cbdb6

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update src/transformers/models/seaformer/modeling_seaformer.py

a734419

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update src/transformers/models/seaformer/__init__.py

4a8c677

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update utils/check_repo.py

af283e5

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

Update utils/check_repo.py

0a4344e

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

github-actions bot closed this May 24, 2023
