Inference with pretrained and custom weights: FPS is much lower than given in README.md #2706
👋 Hello @research-boy, thank you for your interest in 🚀 YOLOv5! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://www.ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.

**Requirements**

Python 3.8 or later with all requirements.txt dependencies installed, including `$ pip install -r requirements.txt`.

**Environments**

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
**Status**

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit.
@research-boy instructions for reproducing table results are directly below the table.
@glenn-jocher Does that mean we can reproduce the same FPS on the COCO dataset with batch size 32?
@research-boy yes
@glenn-jocher In the case of deployment, what would be the preferred parameter settings to get optimal FPS?
@research-boy the default settings for best inference under most common use cases are already in place in detect.py and PyTorch Hub model inference. You may want to adapt these as necessary to your custom requirements. (See detect.py, lines 149 to 168 at commit ec8979f.)
For PyTorch Hub inference see the PyTorch Hub tutorial: YOLOv5 Tutorials
@glenn-jocher Is there any way to use batch processing in the LoadStreams class? This is for trying batch processing on webcam input.
@research-boy The LoadStreams dataloader automatically runs batched inference. If you supply 32 streams then your batch size is 32.
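A minimal sketch of the idea behind multi-stream batching (an assumption for illustration, not the actual LoadStreams code): one frame from each of N sources is stacked along the batch dimension, so the batch size equals the stream count. Dummy tensors stand in for real webcam frames.

```python
import torch

# Hypothetical stand-ins for frames captured from 4 webcam/RTSP streams,
# each a 3-channel 640x640 image tensor.
num_streams = 4
frames = [torch.rand(3, 640, 640) for _ in range(num_streams)]

# Stack one frame per stream into a single batch for one forward pass.
batch = torch.stack(frames)  # shape: (num_streams, 3, 640, 640)
print(batch.shape)
```

With 32 streams, the same stacking yields a batch of 32, which matches the behavior described above.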
@glenn-jocher What's the difference between using detect.py and the PyTorch Hub model? I'm surprised that the Hub version seems to process a whole list of images as one batch, as opposed to dividing the images into batches of a fixed size. The point of using a fixed batch size is to find the sweet spot between inference time (which increases sub-linearly with batch size) and the time to load each batch.
@vtyw detect.py is a fully managed command-line YOLOv5 inference solution. YOLOv5 PyTorch Hub models with autoShape() wrappers are Python inference solutions suitable for integration into custom projects. For batch-size 1 inference with these models you can simply pass one image at a time.
@glenn-jocher Here's what I collected as an example, using yolov5s inference on 640x640 images on a single RTX 2080:
From this I could form a conclusion such as: using a larger batch size improves FPS up to n = 8, and the increase in GPU memory is insignificant. Many object detection libraries offer this batch processing as a built-in feature, which is why I was confused when batch processing is mentioned in the README and in some code comments but isn't actually a built-in feature as such.
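A table like the one above can be collected with a rough timing sketch such as the following. This is an assumption-laden illustration: a tiny dummy CNN stands in for YOLOv5s, so the absolute numbers are meaningless; only the shape of the per-image latency curve across batch sizes is the point.

```python
import time
import torch
import torch.nn as nn

# Dummy model standing in for yolov5s (hypothetical, for timing illustration only).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1),
).eval()

def per_image_latency(batch_size, img_size=640, repeats=3):
    """Average seconds per image at a given batch size."""
    x = torch.rand(batch_size, 3, img_size, img_size)
    with torch.no_grad():
        model(x)  # warm-up pass, excluded from timing
        t0 = time.perf_counter()
        for _ in range(repeats):
            model(x)
    return (time.perf_counter() - t0) / (repeats * batch_size)

# Per-image latency typically falls as batch size grows, sub-linearly, until
# the device saturates -- the "sweet spot" discussed above.
timings = {n: per_image_latency(n) for n in (1, 2, 4, 8)}
```

On a GPU you would additionally call `torch.cuda.synchronize()` before reading the clock, since CUDA kernels launch asynchronously.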
@vtyw nice table! Batched inference is the automatic default when more than 1 image is passed for inference in our YOLOv5 PyTorch Hub solution:

```python
import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# Images
dir = 'https://github.com/ultralytics/yolov5/raw/master/data/images/'
imgs = [dir + f for f in ('zidane.jpg', 'bus.jpg')]  # batch of images

# Inference
results = model(imgs)
results.print()  # or .show(), .save()
```

See PyTorch Hub tutorial for details: YOLOv5 Tutorials
Why is batched processing (multiple images at once) faster? Is there a way to achieve that speed when processing images one at a time (stream processing)?
@zeyad-mansour If you're looking specifically to deploy the model on an edge device, try TensorRT with different precisions; this should give better FPS.
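One of the precision options mentioned above can be sketched in plain PyTorch: casting weights and inputs to FP16 roughly halves memory traffic and is often much faster on GPUs with tensor cores. This is a hedged illustration with a hypothetical dummy layer, not the TensorRT workflow itself (TensorRT requires a separate export/engine-build step not shown here).

```python
import torch
import torch.nn as nn

# Hypothetical dummy layer standing in for a detection model.
model = nn.Conv2d(3, 16, 3).eval()

if torch.cuda.is_available():
    # FP16 weights and inputs on GPU; this is where the speedup appears.
    model = model.half().cuda()
    x = torch.rand(1, 3, 640, 640).half().cuda()
    with torch.no_grad():
        y = model(x)  # FP16 inference
else:
    # The dtype conversion itself also works on CPU, but FP16 forward
    # passes are only reliably fast on GPU, so no inference is run here.
    model = model.half()

print(next(model.parameters()).dtype)  # torch.float16
```

INT8 goes further still but needs calibration, which is one reason TensorRT is the usual tool for edge deployment.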
@research-boy That's what I figured. Thanks for the explanation! |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Tested models - YOLOv5s, YOLOv5m
Tested GPUs - NVIDIA 2070, NVIDIA Quadro 6000
Tested OS - Ubuntu 20.04, Windows 10
Inference batch size: 1
Image size: 640
As per the table provided:
I would like to know the reason for this, and please let me know if there is anything I can change to get the same FPS as yours.