[Feature] Support ViTPose #1876
Have you tried training a base ViT model to check the accuracy?
[Image: result of the current implementation]
[Image: result of the original ViTPose implementation]
Codecov Report
Patch coverage:
Additional details and impacted files
@@            Coverage Diff             @@
##           dev-1.x    #1876     +/-   ##
===========================================
- Coverage    82.22%   81.77%   -0.46%
===========================================
  Files          225      227      +2
  Lines        13375    13438     +63
  Branches      2269     2285     +16
===========================================
- Hits         10998    10989      -9
- Misses        1864     1933     +69
- Partials       513      516      +3
extra (dict, optional): Extra configurations.
    Defaults to ``None``
The argument `extra` is convenient for extending the head class but may confuse users. We should keep a well-defined interface where every argument has a clear meaning and a detailed usage description in the docstring.
I think we could use the existing arguments `conv_out_channels`, `conv_kernel_sizes`, and `has_final_layer` to configure the final conv layers.
To keep the code clear and simple, would it be better to split the dictionary into two parameters, e.g. `input_upsample` (defaults to 0) and `final_kernel_size` (defaults to 1)?
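The split suggested above can be sketched in plain Python. This is only an illustration, not the code from this PR: `HeadInputOptions` and `split_extra` are hypothetical names, the legacy keys `upsample` and `final_conv_kernel` are assumed to be what the `extra` dict carries, and the odd-kernel check is an added assumption so that "same" padding is well defined.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HeadInputOptions:
    """Hypothetical container for the two explicit arguments."""
    input_upsample: int = 0     # 0 means "leave the input features as they are"
    final_kernel_size: int = 1  # kernel size of the final conv layer

    def __post_init__(self):
        if self.input_upsample < 0:
            raise ValueError('input_upsample must be >= 0')
        # Assumption: only odd kernels, so padding can preserve spatial size.
        if self.final_kernel_size < 1 or self.final_kernel_size % 2 == 0:
            raise ValueError('final_kernel_size must be a positive odd number')

    @property
    def final_padding(self) -> int:
        # "Same" padding for an odd kernel with stride 1.
        return (self.final_kernel_size - 1) // 2


# Assumed key names of the legacy ``extra`` dict, mapped to explicit arguments.
_LEGACY_KEYS = {
    'upsample': 'input_upsample',
    'final_conv_kernel': 'final_kernel_size',
}


def split_extra(extra: Optional[dict] = None) -> HeadInputOptions:
    """Translate a legacy ``extra`` dict into the explicit arguments."""
    if extra is None:
        return HeadInputOptions()
    unknown = set(extra) - set(_LEGACY_KEYS)
    if unknown:
        raise KeyError(f'unsupported keys in extra: {sorted(unknown)}')
    return HeadInputOptions(**{_LEGACY_KEYS[k]: v for k, v in extra.items()})
```

With explicit arguments, a misspelled option fails loudly instead of being silently ignored, and each argument can get its own docstring entry.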
@@ -101,6 +104,21 @@ def __init__(self,
            self.decoder = KEYPOINT_CODECS.build(decoder)
        else:
            self.decoder = None
        self.upsample = 0
I suggest adding a new argument, e.g. `input_upsample` or `input_rescale`.
- include testing results from original repo
- update training results
Motivation
Add implementation of ViTPose on MMPose 1.0
Modification
`HeatmapHead`: add parameter `upsample`, defaulting to zero (no effect on existing code); `resize` in `BaseHead`.
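To illustrate why a default of zero leaves existing behaviour untouched, here is a minimal pure-Python sketch. It is not the PR's implementation: `maybe_upsample` is a hypothetical helper, it works on nested lists rather than tensors, and it uses nearest-neighbour replication as a stand-in for whatever interpolation `resize` actually performs.

```python
def maybe_upsample(feature_map, upsample=0):
    """Upsample a 2D feature map (list of rows) by an integer factor.

    ``upsample=0`` is treated as "disabled" and returns the input object
    unchanged, so callers that never set the argument are unaffected.
    Nearest-neighbour replication is used here purely for illustration.
    """
    if upsample < 0:
        raise ValueError('upsample must be >= 0')
    if upsample == 0:
        return feature_map  # backward-compatible no-op

    out = []
    for row in feature_map:
        # Repeat every value `upsample` times along the width ...
        wide = [value for value in row for _ in range(upsample)]
        # ... then repeat the widened row `upsample` times along the height.
        out.extend(list(wide) for _ in range(upsample))
    return out


features = [[1, 2],
            [3, 4]]
unchanged = maybe_upsample(features)       # default: same object comes back
doubled = maybe_upsample(features, 2)      # 2x2 input becomes 4x4
```

Here `doubled` is `[[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]`, while `unchanged` is the original list.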
BC-breaking (Optional)
Use cases (Optional)
Checklist
Before PR:
After PR: