Adjust vision transformer backbone architectures #524

clownrat6 · 2021-04-28T11:34:32Z

Add DropPath and trunc_normal_ to the VisionTransformer implementation;
Add an argument to the VisionTransformer backbone that controls whether the class token is fed into the transformer;
The class token is removed at the output stage;

* Add DropPath, trunc_normal_ for VisionTransformer implementation;
* Add class token during intermediate period and remove it during final period;
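As a rough illustration of the output stage described above, the sketch below drops the class token and reshapes the remaining tokens from NLC to NCHW. The function name `tokens_to_feature_map` and the `hw_shape` argument are hypothetical, and the class token is assumed to sit at index 0 of the token axis, as the `x[:, 1:]` slice in the diff suggests. This is not the code from vit.py.

```python
import torch


def tokens_to_feature_map(x, hw_shape, with_cls_token=True):
    """Drop the class token (if present) and reshape NLC tokens to NCHW.

    x: tensor of shape (N, L, C), where L = H * W (+ 1 with a class token).
    hw_shape: (H, W) patch grid size; a hypothetical argument for this sketch.
    """
    if with_cls_token:
        # Assumption: the class token is the first entry on the token axis.
        x = x[:, 1:]
    n, l, c = x.shape
    h, w = hw_shape
    assert l == h * w, 'token count must match the patch grid'
    # (N, L, C) -> (N, H, W, C) -> (N, C, H, W)
    return x.reshape(n, h, w, c).permute(0, 3, 1, 2).contiguous()


tokens = torch.randn(2, 1 + 14 * 14, 768)
feat = tokens_to_feature_map(tokens, (14, 14))
print(feat.shape)  # torch.Size([2, 768, 14, 14])
```

Passing `with_cls_token=False` skips the slice, which corresponds to the case where no class token was concatenated in the first place.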

codecov · 2021-04-28T13:20:05Z

Codecov Report

Merging #524 (ae116f7) into master (d568d06) will increase coverage by 0.07%.
The diff coverage is 89.41%.

@@            Coverage Diff             @@
##           master     #524      +/-   ##
==========================================
+ Coverage   86.60%   86.68%   +0.07%     
==========================================
  Files          99      101       +2     
  Lines        5160     5234      +74     
  Branches      834      848      +14     
==========================================
+ Hits         4469     4537      +68     
- Misses        533      535       +2     
- Partials      158      162       +4

| Flag | Coverage Δ | |
| --- | --- | --- |
| unittests | `86.68% <89.41%> (+0.07%)` | ⬆️ |

| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| mmseg/models/backbones/vit.py | `87.76% <85.71%> (-0.21%)` | ⬇️ |
| mmseg/models/utils/weight_init.py | `89.47% <89.47%> (ø)` | |
| mmseg/models/utils/__init__.py | `100.00% <100.00%> (ø)` | |
| mmseg/models/utils/drop.py | `100.00% <100.00%> (ø)` | |
| mmseg/models/losses/utils.py | `81.57% <0.00%> (+4.91%)` | ⬆️ |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update d568d06...ae116f7.

xvjiarui · 2021-04-30T07:50:49Z

mmseg/models/backbones/vit.py

+            final feature map. Default: False.
+        interpolate_mode (str): Select the interpolate mode for position
+            embeding vector resize. Default: bilinear.
+        input_cls_token (bool): If concatenating class token into image tokens


Suggested change

-        input_cls_token (bool): If concatenating class token into image tokens
+        with_cls_token (bool): If concatenating class token into image tokens

xvjiarui · 2021-04-30T07:52:37Z

mmseg/models/backbones/vit.py

        self.pos_drop = nn.Dropout(p=drop_rate)

-        self.blocks = nn.Sequential(*[
+        self.num_stages = depth
+        self.out_indices = tuple(range(self.num_stages))


We could make out_indices an argument, defaulting to (12, ).
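A minimal sketch of what an out_indices argument could look like, written as a standalone function so it runs without a model. The name `forward_blocks` is hypothetical. The sketch uses 0-based indices, so the last of 12 blocks is index 11; whether the (12, ) in the comment above is meant as 1-based is not stated in this thread.

```python
def forward_blocks(x, blocks, out_indices=(11, )):
    """Run x through blocks, keeping the outputs whose index is in out_indices."""
    outs = []
    for i, blk in enumerate(blocks):
        x = blk(x)
        if i in out_indices:
            outs.append(x)
    return tuple(outs)


# Stand-in "blocks": each one adds 1, so block i outputs i + 1.
blocks = [lambda v: v + 1 for _ in range(12)]
print(forward_blocks(0, blocks))                 # (12,)
print(forward_blocks(0, blocks, (2, 5, 8, 11)))  # (3, 6, 9, 12)
```

Compared with hard-coding `tuple(range(self.num_stages))`, an argument lets a config choose between a single final feature map and several intermediate ones.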

xvjiarui · 2021-04-30T07:55:10Z

mmseg/models/utils/drop.py

+        self.drop_prob = drop_prob
+        self.keep_prob = 1 - drop_prob
+
+    def forward(self, x):


Missing unit tests.

tests/test_models/test_utils/test_drop.py
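A minimal sketch of what such a test might look like, assuming a DropPath module that keeps the `drop_prob` and `keep_prob` attributes visible in the diff. The `forward` body is an assumption (the usual per-sample stochastic depth formulation), not the code under review, and `test_drop_path` is a hypothetical test name.

```python
import torch
import torch.nn as nn


class DropPath(nn.Module):
    """Per-sample stochastic depth (sketch; forward body is assumed)."""

    def __init__(self, drop_prob=0.):
        super().__init__()
        self.drop_prob = drop_prob
        self.keep_prob = 1 - drop_prob

    def forward(self, x):
        if self.drop_prob == 0. or not self.training:
            return x
        # One Bernoulli draw per sample, broadcast over the remaining dims.
        shape = (x.shape[0], ) + (1, ) * (x.ndim - 1)
        mask = torch.rand(shape, device=x.device) < self.keep_prob
        # Rescale kept samples so the expected value is unchanged.
        return x / self.keep_prob * mask.to(x.dtype)


def test_drop_path():
    x = torch.ones(4, 3, 2)
    layer = DropPath(drop_prob=0.5)

    # In eval mode the layer is expected to be the identity.
    layer.eval()
    assert torch.equal(layer(x), x)

    # In train mode each sample is either zeroed or scaled by 1 / keep_prob.
    layer.train()
    out = layer(x)
    assert out.shape == x.shape
    for sample in out:
        dropped = bool((sample == 0).all())
        kept = bool((sample == 1 / layer.keep_prob).all())
        assert dropped or kept


test_drop_path()
```

The train-mode check asserts a property of every sample rather than exact values, so it does not depend on the random seed.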

xvjiarui · 2021-04-30T08:00:30Z

mmseg/models/backbones/vit.py

+            x = x[:, 1:]
+
+        outs = []
+        block_len = len(self.blocks)


This variable is not needed.

xvjiarui · 2021-04-30T08:07:25Z

mmseg/models/backbones/vit.py

+        block_len = len(self.blocks)
+        for i, blk in enumerate(self.blocks):
+            x = blk(x)
+            if i == block_len - 1:


Suggested change

-            if i == block_len - 1:
+            if i == len(self.blocks) - 1:

* Adjust vision transformer backbone architectures;
* Add DropPath, trunc_normal_ for VisionTransformer implementation;
* Add class token during intermediate period and remove it during final period;
* Fix some parameters loss bug;
* Store intermediate token features and impose no processes on them;
* Remove class token and reshape entire token feature from NLC to NCHW;
* Fix some doc error
* Add an arg for VisionTransformer backbone to control if input class token into transformer;
* Add stochastic depth decay rule for DropPath;
* Fix output bug when input_cls_token=False;
* Add related unit test;
* Add arg: out_indices to control model output;
* Add unit test for DropPath;
* Apply suggestions from code review

Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>
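The commit message above mentions a stochastic depth decay rule for DropPath without spelling it out. A common choice, assumed here and not confirmed by this thread, is a linear ramp of the drop rate from 0 at the first block to `drop_path_rate` at the last. The function name `stochastic_depth_rates` is hypothetical.

```python
def stochastic_depth_rates(drop_path_rate, depth):
    """Linearly spaced per-block drop rates from 0 to drop_path_rate.

    Assumption: a linear decay rule; the rule used in the PR is not shown here.
    """
    if depth == 1:
        return [0.0]
    return [drop_path_rate * i / (depth - 1) for i in range(depth)]


rates = stochastic_depth_rates(0.1, 12)
print(len(rates), rates[0])  # 12 0.0
```

Each block would then build its own DropPath with the rate at its index, so deeper blocks are dropped more often than shallow ones.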

sennnnn added 3 commits April 28, 2021 19:31

Adjust vision transformer backbone architectures;

da9573b

* Add DropPath, trunc_normal_ for VisionTransformer implementation;
* Add class token during intermediate period and remove it during final period;

Merge Master

f9c8420

Fix some parameters loss bug;

2f580e9

clownrat6 force-pushed the vit_adjust branch from 49722de to 06b6c3d Compare April 29, 2021 03:45

* Store intermediate token features and impose no processes on them;

e1d59cd

* Remove class token and reshape entire token feature from NLC to NCHW;

clownrat6 force-pushed the vit_adjust branch from 06b6c3d to e1d59cd Compare April 29, 2021 04:46

Fix some doc error

e4a11e7

xvjiarui mentioned this pull request Apr 29, 2021

add configs for vit backbone plus decode_heads #520

Merged

sennnnn added 2 commits April 29, 2021 14:56

Add a arg for VisionTransformer backbone to control if input class to…

d5644c1

…ken into transformer;

Add stochastic depth decay rule for DropPath;

70cde52

clownrat6 force-pushed the vit_adjust branch from fd0bc5d to 70cde52 Compare April 29, 2021 08:30

* Fix output bug when input_cls_token=False;

97af059

* Add related unit test;

clownrat6 mentioned this pull request Apr 29, 2021

Re-implement of SETR #526

Closed

3 tasks

clownrat6 force-pushed the vit_adjust branch from 5402718 to 97af059 Compare April 29, 2021 12:42

xvjiarui reviewed Apr 30, 2021

xvjiarui reviewed Apr 30, 2021

* Add arg: out_indices to control model output;

1abb86c

* Add unit test for DropPath;

xvjiarui reviewed Apr 30, 2021

xvjiarui reviewed Apr 30, 2021

Apply suggestions from code review

ae116f7

xvjiarui approved these changes Apr 30, 2021

xvjiarui merged commit cf2cb54 into open-mmlab:master Apr 30, 2021
