YOLOv5 conversion and inference using TensorRT (FP16), with no complicated installation setup and zero precision loss!
Tested on Linux-based systems (Colab T4/P4/K80, Jetson Nano (with JetPack installed), Ubuntu with GTX 1650).
First, clone this repo and install the requirements:
$ git clone https://github.com/BlueMirrors/Yolov5-TensorRT.git
$ cd Yolov5-TensorRT
$ pip install -r requirements.txt
Now run inference on video or image file (with pretrained weights).
python detect.py --input $PATH_TO_INPUT_FILE --output $OUTPUT_FILE_NAME
You can also pass --weights to use your own custom ONNX weight file (a TensorRT engine file is generated from it internally) or a TensorRT engine file (generated by convert.py). You can also pass --classes for your custom-trained weights and/or to filter classes for COCO.
For the pretrained default weights (--weights yolov5s), the scripts download the weights and internally generate a new engine file for each unseen input shape. If you are using custom weights, remember to rename or remove the existing engine file when you want to generate an engine for a different shape.
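One way to avoid renaming engine files by hand is to encode the input shape in the engine filename. The sketch below is only an illustration under that assumption; it is not how this repo names its engines, and `engine_path_for` and `needs_rebuild` are hypothetical helpers.

```python
from pathlib import Path


def engine_path_for(weights, img_size):
    """Return a hypothetical engine path that encodes the input shape.

    The naming scheme is an assumption made for this example only.
    """
    weights = Path(weights)
    shape_tag = "x".join(str(dim) for dim in img_size)
    return weights.with_name(f"{weights.stem}_{shape_tag}.engine")


def needs_rebuild(weights, img_size):
    """True when no engine file exists yet for this weights/shape pair."""
    return not engine_path_for(weights, img_size).exists()
```

With a scheme like this, each input shape maps to its own engine file, so an engine built for one shape is never picked up for another.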
(Only supported for NVIDIA-GPUs, Tested on Linux Devices, Partial Dynamic Support)
You can convert ONNX weights to TensorRT by using the convert.py file. Simply run the following command:
python convert.py --weights yolov5s.onnx --img-size 720 1080
- By default, the ONNX model is converted to a TensorRT engine with FP16 precision. To convert to a TensorRT engine with FP32 precision, add --fp32 when running the above command.
- If using default weights, you do not need to download the ONNX model, as the script will download it.
- If you want to build the engine with a custom image size, pass --img-size custom_img_size to convert.py.
- If you want to build the engine for your custom weights, do the following:
Make sure you use the --dynamic flag while exporting your custom weights to ONNX:
python export.py --weights $PATH_TO_PYTORCH_WEIGHTS --dynamic --include onnx
Then run
python convert.py --weights path_to_custom_weights.onnx
and you will have a converted TensorRT engine. Also add --nc (number of classes) if your custom model has a different number of classes than COCO (i.e. 80 classes).
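If you drive the conversion from another Python script, the options described above can be assembled into a command line. This is a minimal sketch: `build_convert_cmd` is a hypothetical helper, and it uses only the flags mentioned in this README (--weights, --img-size, --fp32, --nc).

```python
import sys


def build_convert_cmd(weights, img_size=None, fp32=False, nc=None):
    """Assemble an argument list for convert.py (hypothetical helper)."""
    cmd = [sys.executable, "convert.py", "--weights", str(weights)]
    if img_size is not None:
        # --img-size takes the dimensions as separate values, e.g. 720 1080
        cmd += ["--img-size", *[str(dim) for dim in img_size]]
    if fp32:
        # FP16 is the default; --fp32 switches to FP32 precision
        cmd.append("--fp32")
    if nc is not None:
        # Only needed when the class count differs from COCO's 80
        cmd += ["--nc", str(nc)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, so that a failed conversion raises an error instead of passing silently.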
- Set up Docker and the NVIDIA Container Toolkit.
- Pull the TensorRT 8.2.2 Docker image. Read more about this container release.
docker pull nvcr.io/nvidia/tensorrt:22.01-py3
- Run the container image, mounting the Yolov5-TensorRT directory at /home inside the container.
docker run --gpus all -it -v /path/to/Yolov5-TensorRT:/home nvcr.io/nvidia/tensorrt:22.01-py3
If this is successful, you should be inside the docker container.
- Install requirements inside docker container.
cd /home
pip install -r requirements.txt
- Install OpenCV dependencies.
apt-get update
apt-get install ffmpeg libsm6 libxext6 -y
- Usual Yolov5-TensorRT execution. Run inference on an image/video.
python detect.py --input $PATH_TO_INPUT_FILE --output $OUTPUT_FILE_NAME
- Convert a custom ONNX YOLOv5 model to TensorRT.
cd /home
python convert.py --weights yolov5s.onnx --img-size 720 1080
In our tests, the TensorRT engine produced outputs identical to those of the original PyTorch weights.
The FPS figures are based on 5000 inference iterations after 100 warmup iterations. The measured time covers only image preprocessing (letterboxing etc.), model inference, and output postprocessing (NMS, Scale-Coords, etc.).
Hardware | FPS |
---|---|
T4 | 157-165 |
GTX 1650 | 138-145 |
P4 | 82-86 |
K80 | 49-55 |
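The measurement protocol described above (warmup iterations followed by timed iterations) can be sketched as follows. This is not the benchmark script used for the table; `measure_fps` and `run_once` are hypothetical names, where `run_once` stands for one full preprocess, inference, and postprocess pass.

```python
import time


def measure_fps(run_once, warmup=100, iterations=5000):
    """Time `iterations` calls of run_once after `warmup` untimed calls."""
    for _ in range(warmup):
        run_once()  # warmup calls are excluded from the timing

    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start

    # Guard against a zero reading on very coarse timers
    return iterations / elapsed if elapsed > 0 else float("inf")
```

Excluding the warmup calls keeps one-time costs, such as engine building or lazy initialization, out of the reported FPS.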
- Letterboxing is supported (significantly better accuracy!).
- The TensorRT model is not fully dynamic (for optimization reasons). You can run inference on an image of any shape, and the engine is set up with the first input's shape. To run inference on a different image shape, you'll have to build a new engine.
- Building the TensorRT engine and running the first inference can take some time to complete (especially if it also has to install all the dependencies for the first time).
- A new engine is built for each unseen input shape. Once built, the engine file is serialized and can be reused for future inference.
- Delete or rename the previously serialized engine on disk before building a new engine (required only when the image shape is different).
- Batch support will be added next week (after 15th August).
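To illustrate the letterboxing mentioned in the notes above, here is a simplified sketch of the geometry: the image is resized with its aspect ratio preserved, then padded to the target shape, and detected boxes are mapped back to the original image afterwards. The helpers `letterbox_params` and `scale_box_back` are hypothetical and may differ from this repo's implementation (for example in how padding is rounded).

```python
def letterbox_params(src_hw, dst_hw):
    """Compute scale, resized size, and top/left padding for letterboxing."""
    src_h, src_w = src_hw
    dst_h, dst_w = dst_hw
    # Use the smaller ratio so the whole image fits inside the target
    scale = min(dst_h / src_h, dst_w / src_w)
    new_h, new_w = round(src_h * scale), round(src_w * scale)
    pad_top = (dst_h - new_h) // 2
    pad_left = (dst_w - new_w) // 2
    return scale, (new_h, new_w), (pad_top, pad_left)


def scale_box_back(box, scale, pad):
    """Map an (x1, y1, x2, y2) box from letterboxed to original coordinates."""
    pad_top, pad_left = pad
    x1, y1, x2, y2 = box
    return (
        (x1 - pad_left) / scale,
        (y1 - pad_top) / scale,
        (x2 - pad_left) / scale,
        (y2 - pad_top) / scale,
    )
```

For a 720x1280 (height x width) frame and a 640x640 target, the scale is 0.5, the resized image is 360x640, and 140 rows of padding are added above and below.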