Rearrange the blocks of the Focus layer into row-major order
to be compatible with TensorFlow SpaceToDepth
#413
Comments
Hello @ausk, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments. If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you. If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:
For more information please visit https://www.ultralytics.com.
@ausk modifying the Focus() module will invalidate all YOLOv5 pretrained models, so I would highly advise against it.
@glenn-jocher Modifying the Focus() module will bring the benefit of improved versatility, because many frameworks/libraries store data in row-major order. Yes, it will hurt the accuracy of current pretrained models. But if training from scratch, I still recommend modifying it. It's a tradeoff.
Sure. I volunteer you to retrain all of the pretrained models to their current accuracy with your proposed architecture changes then. Once this is done please submit a PR and we are all set :)
Thank you for your work, anyway. I realize that the space2depth part of Focus (the slice and concat ops) is the 0th layer of the model, so at inference time we can just remove it and keep only the conv. The input then becomes NCHW (nb, 12, nh, nw). Finally, I have translated the small model (v2) into Keras (TensorFlow) with NHWC (1, nh, nw, nc) input, and inference succeeded. Closing, since you rejected this.
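The workaround described here, removing the slice/concat layer and feeding the conv a pre-rearranged tensor, can be sketched as a host-side preprocessing step in NumPy (the helper name below is mine, not from the thread):

```python
import numpy as np

def space_to_depth_nhwc(x, block=2):
    """NumPy re-implementation of tf.nn.space_to_depth for NHWC input.

    Moves each block x block spatial patch into the channel dimension,
    so a (1, nh, nw, 3) image becomes (1, nh//2, nw//2, 12), which can
    be fed directly to the conv that follows the removed slice layer.
    Output channel index follows TF's row-major block order:
    (dy * block + dx) * C + c.
    """
    n, h, w, c = x.shape
    x = x.reshape(n, h // block, block, w // block, block, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)  # (n, h', w', dy, dx, c)
    return x.reshape(n, h // block, w // block, block * block * c)
```

This way the exported graph starts with an ordinary convolution, and the rearrangement cost is paid once on the host.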
@ausk ok sounds good! But no, I didn't reject the idea. If you can retrain the 4 models with your changes to >= current performance and submit a PR, then we are good to go.
@ausk better late than never, I've reopened this issue and will examine this option more closely to better align PyTorch and TF YOLOv5 versions to possibly improve TFLite export (google-coral/edgetpu#272). EDIT: I don't see a problem here, seems like a simple change that brings exportability benefits. I'll try my best to include this update in the next release that includes fully retrained models (i.e. 4.1 or 5.0 possibly).
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
TODO removed following release v6.0 architecture updates. |
🚀 Feature
Modify the Focus layer into row-major order, to be compatible with tf.space_to_depth. Just change the order of the blocks:
from:
torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)
to:
torch.cat([x[..., ::2, ::2], x[..., ::2, 1::2], x[..., 1::2, ::2], x[..., 1::2, 1::2]], 1)
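A quick NumPy check (a sketch; the function names are mine) confirms that the proposed row-major block order is bit-identical to a space-to-depth rearrangement, while the current order is not:

```python
import numpy as np

def focus_original(x):
    # current YOLOv5 order: offsets (0,0), (1,0), (0,1), (1,1)
    # i.e. column-major within each 2x2 block
    return np.concatenate([x[..., ::2, ::2], x[..., 1::2, ::2],
                           x[..., ::2, 1::2], x[..., 1::2, 1::2]], axis=1)

def focus_row_major(x):
    # proposed order: offsets (0,0), (0,1), (1,0), (1,1)
    # i.e. row-major within each 2x2 block
    return np.concatenate([x[..., ::2, ::2], x[..., ::2, 1::2],
                           x[..., 1::2, ::2], x[..., 1::2, 1::2]], axis=1)

def space_to_depth_nchw(x, block=2):
    # NumPy equivalent of tf.nn.space_to_depth, written for NCHW layout:
    # output channel index = (dy * block + dx) * C + c
    n, c, h, w = x.shape
    x = x.reshape(n, c, h // block, block, w // block, block)
    x = x.transpose(0, 3, 5, 1, 2, 4)  # (n, dy, dx, c, h', w')
    return x.reshape(n, block * block * c, h // block, w // block)
```

Running all three on the same tensor shows the row-major variant matches the space-to-depth reference exactly, which is what makes the export mapping trivial.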
Motivation
In models/common.py, the Focus layer is defined in PyTorch as follows:
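(The snippet itself did not survive extraction; below is a simplified sketch of the module. Note the real YOLOv5 module wraps the convolution in its own Conv block, i.e. conv + BN + activation, which is elided here in favor of a plain nn.Conv2d.)

```python
import torch
import torch.nn as nn

class Focus(nn.Module):
    """Focus wh information into c-space: slice the input into four
    pixel-offset sub-images, stack them along the channel axis, then
    apply a convolution (simplified to a plain nn.Conv2d here)."""

    def __init__(self, c1, c2, k=1):
        super().__init__()
        self.conv = nn.Conv2d(c1 * 4, c2, k)

    def forward(self, x):
        # x: (b, c1, h, w) -> (b, 4*c1, h/2, w/2) -> (b, c2, h/2, w/2)
        return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                                    x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
```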
And @bonlime posted a brief answer to "What's the Focus layer?" #207: check the TResNet paper, p. 2, they call it SpaceToDepth.
In the TResNet paper, p. 2 (section 2.1): "We wanted to create a fast, seamless stem layer, with as little information loss as possible, and let the simple well designed residual blocks do all the actual processing work. The stem sole functionality should be to downscale the input resolution to match the rest of the architecture, e.g., by a factor of 4. We met these goals by using a dedicated SpaceToDepth transformation layer [32], that rearranges blocks of spatial data into depth. The SpaceToDepth transformation layer is followed by simple 1x1 convolution to match the number of wanted channels."
That is to say, the Focus layer quickly downscales the input resolution by rearranging blocks of spatial data into depth, and changes the number of feature channels, generally via a 1x1 conv.
And there is an op, SpaceToDepth (tf.space_to_depth, or tf.nn.space_to_depth in TF2), that rearranges blocks of spatial data. The Focus layer uses:
torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)
Then we compare:
(0) the input
(1) the output of Focus: torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)
(2) the output of TensorFlow's SpaceToDepth
(3) the output of the modified Focus: torch.cat([x[..., ::2, ::2], x[..., ::2, 1::2], x[..., 1::2, ::2], x[..., 1::2, 1::2]], 1)
So, just by modifying the order of the blocks, we can make it compatible with the TensorFlow SpaceToDepth op. This will make the model easier to port to TensorFlow.