YOLOv5 P6 Models 😃 #2110
Hello Glenn, I'm currently working with high-resolution images (8000 x 4000), and I don't want to downsample them too far, since the model might then fail to detect distant objects. So it would be super nice if we could use a model pretrained at high resolution, for example the 1280 models you're talking about. Is it possible to download the weights of these models? I was also curious whether you have models at the higher resolution without the additional output layer; I don't think I necessarily need that. Thanks!
Hello Glenn, if the targets in the dataset are small, can I output only P2/P3/P4? How should I modify the YAML file? Thank you very much!
@laurenssam yes, the P6 models can be downloaded manually from the latest release assets. Any model in the latest release assets can also be auto-downloaded simply by asking for it in a command, i.e.:

```bash
python detect.py --weights yolov5s6.pt  # auto-download YOLOv5s P6 model
```
@WANGCHAO1996 yes, the YAML files are easy to modify. You can remove (and add) outputs simply by changing the Detect() module inputs here. Inputs 17, 20 and 23 correspond to the P3, P4 and P5 grids. Remember that if you modify the number of output layers you should also modify your anchors correspondingly, or simply delete the anchors and replace them with the line referenced here (Line 47 in 73a0669); see the sketch below.
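For illustration, a minimal sketch of the relevant YAML lines. This is my reading, not verbatim repo code: the layer indices follow the stock yolov5s.yaml of that era, and the assumption that the referenced line is the `anchors: 3` auto-anchor replacement is mine.

```yaml
# Assumed meaning of the referenced line: replacing an explicit anchor
# list with a per-layer count lets AutoAnchor evolve anchors at
# training time.
anchors: 3  # anchors per output layer

# Final head line of the stock model: layers 17, 20 and 23 feed
# Detect() as the P3, P4 and P5 outputs.
[ [ 17, 20, 23 ], 1, Detect, [ nc, anchors ] ]  # Detect(P3, P4, P5)

# Dropping an output, e.g. P5, means removing its layer index:
# [ [ 17, 20 ], 1, Detect, [ nc, anchors ] ]    # Detect(P3, P4)
```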
Thanks, but how do I know the resolution the model was trained on?
@laurenssam the P6 models were trained at 1280 by default, except for the ones denoted by -640 (to get an apples-to-apples comparison with the current models). For training on large images, you can see our xview repo here. The general concept is to train on smaller 'chips' at native resolution, and then run inference either at native resolution if you can, or else use a sliding window whose results are stitched together later.
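As a rough illustration of the sliding-window idea (a generic sketch, not the xview repo's code; `run_model`, the tile size, and the overlap are placeholder assumptions):

```python
import numpy as np

def sliding_window_inference(image, run_model, tile=1280, overlap=128):
    """Tile a large image, detect per tile, and map boxes back into
    full-image coordinates. `run_model(crop)` is assumed to return an
    iterable of (x1, y1, x2, y2, conf, cls) detections per tile."""
    h, w = image.shape[:2]
    step = tile - overlap
    detections = []
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            crop = image[y:y + tile, x:x + tile]
            for x1, y1, x2, y2, conf, cls in run_model(crop):
                # Offset tile-local boxes by the tile origin
                detections.append([x1 + x, y1 + y, x2 + x, y2 + y, conf, cls])
    # A final cross-tile NMS pass would normally de-duplicate boxes
    # found twice in the overlapping regions.
    return np.array(detections)
```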
```yaml
# parameters
nc: 1  # number of classes

# anchors
anchors: 3

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [ [ -1, 1, Focus, [ 64, 3 ] ],  # 0-P1/2
  ]

# YOLOv5 head
head:
]
```
@WANGCHAO1996 I don't know what you are asking, and your post is poorly formatted. Use ``` for code sections.
Hi Glenn -- two questions for you: You mention that the x6 model adds an additional layer "for extra-large objects," but also that it seems to perform better than the normal x model "under all conditions." Of course, I'll have to experiment, but just to understand your underlying intentions while creating the model -- would x6 still work for large images (1000px+) with "small" objects to detect (~100px+)? Thanks!
@Shaotran all models contain their yaml files as attributes:

```python
import torch

# 'yolov5x6.pt' is an example path; YOLOv5 checkpoints store the
# module itself under the 'model' key
model = torch.load('yolov5x6.pt', map_location='cpu')['model']
print(model.yaml)  # architecture definition used to build the model
```

The P6 models perform better on COCO than their P5 counterparts.
@glenn-jocher Hello, in the case that I want to convert to TensorRT, what input size (640 or 1280) should be used for detection with the yolov5s6 model?
@ThuyHoang9001 P6 models will get best results at --img 1280, and can also be used all the way down to --img 64 with decreasing levels of accuracy.
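For example, the inference size can be set per call through the PyTorch Hub interface (a sketch; `'image.jpg'` is a placeholder path):

```python
import torch

# Auto-downloads the P6 checkpoint on first use
model = torch.hub.load('ultralytics/yolov5', 'yolov5s6')

results = model('image.jpg', size=1280)  # P6 models do best at 1280
results.print()
```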
We've done a bit of experimentation with adding an additional large object output layer P6, following the EfficientDet example of increasing output layers for larger models, except in our case applying it to all models. The current models have P3 (stride 8, small) to P5 (stride 32, large) outputs. The P6 output layer is stride 64 and intended for extra-large objects. It seems to help normal COCO training at --img 640, and we've also gotten good results training at --img 1280.
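As a quick sanity check of what the extra layer means spatially, each output grid is simply the image size divided by the layer's stride (e.g. at --img 1280 the stride-64 P6 layer predicts on a 20x20 grid):

```python
# Grid sizes implied by the strides quoted above (P3=8 ... P6=64)
for img_size in (640, 1280):
    for layer, stride in (('P3', 8), ('P4', 16), ('P5', 32), ('P6', 64)):
        cells = img_size // stride
        print(f'--img {img_size}: {layer} grid {cells}x{cells}')
```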
The architecture changes we've made to add a P6 layer are here. The backbone is extended down to P6, and the PANet head now goes down to P3 (like usual) and back up to P6 instead of stopping at P5. New anchors were also added, evolved at --img 1280.
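Anchor evolution like this can be reproduced with the repo's autoanchor utility, roughly as below (a sketch; it assumes a YOLOv5 checkout on PYTHONPATH, and the exact function signature may differ between versions):

```python
from utils.autoanchor import kmean_anchors

# 4 output layers (P3-P6) x 3 anchors each = 12 anchors, evolved at --img 1280
anchors = kmean_anchors('data/coco.yaml', n=12, img_size=1280, thr=4.0, gen=1000)
print(anchors)
```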
The chart below shows the current YOLOv5l 4.0 model in blue, and the new YOLOv5l6 architecture in green and orange. The green and orange lines correspond to the same architecture but trained at --img 640 (green) or --img 1280 (orange). The points plotted are the evaluations of each model along an image-size vector from 384 to 1536 in steps of 128, with results plotted up to the max mAP point. Code to reproduce is:

```bash
python test.py --task study
```
following the changes from PR #2099.

The conclusion is that the P6 models increase performance on COCO under all scenarios we tested, though they are also a bit slower (maybe 10% slower) and larger (50% larger), with only slightly more FLOPS, so training time and CUDA memory requirements are almost the same as for the P5 models. We are doing some more studies to see if these might be suitable replacements for the current models. Let me know if you have any thoughts. For the time being, these models are available for auto-download just like the regular models, i.e.:

```bash
python detect.py --weights yolov5s6.pt
```