NVIDIA DeepStream SDK 6.1 / 6.0.1 / 6.0 configuration for YOLO models
- New documentation for multiple models
- DeepStream tutorials
- Native YOLOX support
- Native PP-YOLO support
- Dynamic batch-size
- Darknet CFG params parser (no need to edit nvdsparsebbox_Yolo.cpp or any other file)
- Support for new_coords, beta_nms and scale_x_y params
- Support for new models
- Support for new layers
- Support for new activations
- Support for convolutional groups
- Support for INT8 calibration
- Support for non-square models
- Support for reorg, implicit and channel layers (YOLOR)
- YOLOv5 4.0, 5.0, 6.0 and 6.1 native support
- YOLOR native support
- Models benchmarks (outdated)
- GPU YOLO Decoder (moved from CPU to GPU to get better performance) #138
- Improved NMS #142
- Requirements
- Tested models
- Benchmarks
- dGPU installation
- Basic usage
- YOLOv5 usage
- YOLOR usage
- INT8 calibration
- Using your custom model
- Ubuntu 20.04
- CUDA 11.6 Update 1
- TensorRT 8.2 GA Update 4 (8.2.5.1)
- NVIDIA Driver 510.47.03
- NVIDIA DeepStream SDK 6.1
- GStreamer 1.16.2
- DeepStream-Yolo
- Ubuntu 18.04
- CUDA 11.4 Update 1
- TensorRT 8.0 GA (8.0.1)
- NVIDIA Driver >= 470.63.01
- NVIDIA DeepStream SDK 6.0.1 / 6.0
- GStreamer 1.14.5
- DeepStream-Yolo
nms-iou-threshold = 0.6
pre-cluster-threshold = 0.001 (mAP eval) / 0.25 (FPS measurement)
batch-size = 1
valid = val2017 (COCO) - 1000 random images for INT8 calibration
sample = 1920x1080 video
NOTE: maintain-aspect-ratio=1 was set in the config_infer file for the YOLOv4 (with letter_box=1), YOLOv5 and YOLOR models.
Metric | DeepStream | PyTorch |
---|---|---|
FPS (without display) | 13.32 | 10.07 |
FPS (with display) | 12.63 | 9.41 |
Metric | DeepStream | TensorRTx | Ultralytics |
---|---|---|---|
FPS (without display) | 110.25 | 87.42 | 97.19 |
FPS (with display) | 105.62 | 73.07 | 50.37 |
DeepStream | Precision | Resolution | IoU=0.5:0.95 | IoU=0.5 | IoU=0.75 | FPS (without display) |
---|---|---|---|---|---|---|
YOLOR-P6 | FP32 | 1280 | 0.478 | 0.663 | 0.519 | 5.53 |
YOLOR-CSP-X* | FP32 | 640 | 0.473 | 0.664 | 0.513 | 7.59 |
YOLOR-CSP-X | FP32 | 640 | 0.470 | 0.661 | 0.507 | 7.52 |
YOLOR-CSP* | FP32 | 640 | 0.459 | 0.652 | 0.496 | 13.28 |
YOLOR-CSP | FP32 | 640 | 0.449 | 0.639 | 0.483 | 13.32 |
YOLOv5x6 6.0 | FP32 | 1280 | 0.504 | 0.681 | 0.547 | 2.22 |
YOLOv5l6 6.0 | FP32 | 1280 | 0.492 | 0.670 | 0.535 | 4.05 |
YOLOv5m6 6.0 | FP32 | 1280 | 0.463 | 0.642 | 0.504 | 7.54 |
YOLOv5s6 6.0 | FP32 | 1280 | 0.394 | 0.572 | 0.424 | 18.64 |
YOLOv5n6 6.0 | FP32 | 1280 | 0.294 | 0.452 | 0.314 | 26.94 |
YOLOv5x 6.0 | FP32 | 640 | 0.469 | 0.654 | 0.509 | 8.24 |
YOLOv5l 6.0 | FP32 | 640 | 0.450 | 0.634 | 0.487 | 14.96 |
YOLOv5m 6.0 | FP32 | 640 | 0.415 | 0.601 | 0.448 | 28.30 |
YOLOv5s 6.0 | FP32 | 640 | 0.334 | 0.516 | 0.355 | 63.55 |
YOLOv5n 6.0 | FP32 | 640 | 0.250 | 0.417 | 0.260 | 110.25 |
YOLOv4-P6 | FP32 | 1280 | 0.499 | 0.685 | 0.542 | 2.57 |
YOLOv4-P5 | FP32 | 896 | 0.472 | 0.659 | 0.513 | 5.48 |
YOLOv4-CSP-X-SWISH | FP32 | 640 | 0.473 | 0.664 | 0.513 | 7.51 |
YOLOv4-CSP-SWISH | FP32 | 640 | 0.459 | 0.652 | 0.496 | 13.13 |
YOLOv4x-MISH | FP32 | 640 | 0.459 | 0.650 | 0.495 | 7.53 |
YOLOv4-CSP | FP32 | 640 | 0.440 | 0.632 | 0.474 | 13.19 |
YOLOv4 | FP32 | 608 | 0.498 | 0.740 | 0.549 | 12.18 |
YOLOv4-Tiny | FP32 | 416 | 0.215 | 0.403 | 0.206 | 201.20 |
YOLOv3-SPP | FP32 | 608 | 0.411 | 0.686 | 0.433 | 12.22 |
YOLOv3-Tiny-PRN | FP32 | 416 | 0.167 | 0.382 | 0.125 | 277.14 |
YOLOv3 | FP32 | 608 | 0.377 | 0.672 | 0.385 | 12.51 |
YOLOv3-Tiny | FP32 | 416 | 0.095 | 0.203 | 0.079 | 218.42 |
YOLOv2 | FP32 | 608 | 0.286 | 0.541 | 0.273 | 25.28 |
YOLOv2-Tiny | FP32 | 416 | 0.102 | 0.258 | 0.061 | 231.36 |
To install DeepStream on a dGPU (x86 platform) without Docker, prepare the computer with the following steps.
DeepStream 6.1
sudo apt-get update
sudo apt-get install gcc make git libtool autoconf autogen pkg-config cmake
sudo apt-get install python3 python3-dev python3-pip
sudo apt-get install dkms
sudo apt-get install libssl1.1 libgstreamer1.0-0 gstreamer1.0-tools gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav libgstrtspserver-1.0-0 libjansson4 libyaml-cpp-dev
sudo apt-get install linux-headers-$(uname -r)
NOTE: Purge any existing NVIDIA driver, CUDA, cuDNN and TensorRT installations first (replace $CUDA_PATH with your CUDA path).
sudo nvidia-uninstall
sudo $CUDA_PATH/bin/cuda-uninstaller
sudo apt-get remove --purge '*nvidia*'
sudo apt-get remove --purge '*cuda*'
sudo apt-get remove --purge '*cudnn*'
sudo apt-get remove --purge '*tensorrt*'
sudo apt autoremove --purge && sudo apt autoclean && sudo apt clean
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
- TITAN, GeForce RTX / GTX series and RTX / Quadro series
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/510.47.03/NVIDIA-Linux-x86_64-510.47.03.run
- Data center / Tesla series
wget https://us.download.nvidia.com/tesla/510.47.03/NVIDIA-Linux-x86_64-510.47.03.run
- Run
sudo sh NVIDIA-Linux-x86_64-510.47.03.run --silent --disable-nouveau
- Reboot
sudo reboot
- Install
sudo sh NVIDIA-Linux-x86_64-510.47.03.run --silent --dkms --install-libglvnd
NOTE: If you are using a laptop with NVIDIA Optimus, run
sudo apt-get install nvidia-prime
sudo prime-select nvidia
wget https://developer.download.nvidia.com/compute/cuda/11.6.1/local_installers/cuda_11.6.1_510.47.03_linux.run
sudo sh cuda_11.6.1_510.47.03_linux.run --silent --toolkit
- Export environment variables
nano ~/.bashrc
- Add
export PATH=/usr/local/cuda-11.6/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
- Run
source ~/.bashrc
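The `${PATH:+:${PATH}}` form used in the exports above appends `:` plus the old value only when the variable is already set and non-empty, so an empty variable does not end up with a dangling `:`. A minimal sketch of that behavior follows. `prepend_dir` is a hypothetical helper written for this illustration (it is not part of CUDA or DeepStream), and `/opt/lib` is a placeholder path.

```shell
# prepend_dir NEW_DIR OLD_VALUE
# Prints NEW_DIR, followed by ":OLD_VALUE" only when OLD_VALUE is non-empty.
# Hypothetical helper: it only demonstrates the ${VAR:+...} expansion.
prepend_dir() (
    new_dir=$1
    old_value=$2
    printf '%s%s\n' "$new_dir" "${old_value:+:${old_value}}"
)

prepend_dir /usr/local/cuda-11.6/lib64 ""         # prints /usr/local/cuda-11.6/lib64
prepend_dir /usr/local/cuda-11.6/lib64 /opt/lib   # prints /usr/local/cuda-11.6/lib64:/opt/lib
```

After `source ~/.bashrc`, running `echo "$LD_LIBRARY_PATH"` should show the CUDA `lib64` directory as the first entry.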
6. Download TensorRT from the NVIDIA website and install it
TensorRT 8.2 GA Update 4 for Ubuntu 20.04 and CUDA 11.0, 11.1, 11.2, 11.3, 11.4 and 11.5 DEB local repo Package
sudo dpkg -i nv-tensorrt-repo-ubuntu2004-cuda11.4-trt8.2.5.1-ga-20220505_1-1_amd64.deb
sudo apt-key add /var/nv-tensorrt-repo-ubuntu2004-cuda11.4-trt8.2.5.1-ga-20220505/82307095.pub
sudo apt-get update
sudo apt install tensorrt
7. Download the DeepStream SDK from the NVIDIA website and install it
DeepStream 6.1 for Servers and Workstations (.deb)
sudo apt-get install ./deepstream-6.1_6.1.0-1_amd64.deb
rm ${HOME}/.cache/gstreamer-1.0/registry.x86_64.bin
sudo ln -snf /usr/local/cuda-11.6 /usr/local/cuda
sudo reboot
DeepStream 6.0.1 / 6.0
If you are using a laptop with a newer Intel/AMD processor and the Graphics field in Settings->Details->About shows llvmpipe, update the kernel first.
wget https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.11/amd64/linux-headers-5.11.0-051100_5.11.0-051100.202102142330_all.deb
wget https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.11/amd64/linux-headers-5.11.0-051100-generic_5.11.0-051100.202102142330_amd64.deb
wget https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.11/amd64/linux-image-unsigned-5.11.0-051100-generic_5.11.0-051100.202102142330_amd64.deb
wget https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.11/amd64/linux-modules-5.11.0-051100-generic_5.11.0-051100.202102142330_amd64.deb
sudo dpkg -i *.deb
sudo reboot
sudo apt-get update
sudo apt-get install gcc make git libtool autoconf autogen pkg-config cmake
sudo apt-get install python3 python3-dev python3-pip
sudo apt install libssl1.0.0 libgstreamer1.0-0 gstreamer1.0-tools gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav libgstrtspserver-1.0-0 libjansson4
sudo apt-get install linux-headers-$(uname -r)
NOTE: Install DKMS only if you are using the default Ubuntu kernel
sudo apt-get install dkms
NOTE: Purge any existing NVIDIA driver, CUDA, cuDNN and TensorRT installations first (replace $CUDA_PATH with your CUDA path).
sudo nvidia-uninstall
sudo $CUDA_PATH/bin/cuda-uninstaller
sudo apt-get remove --purge '*nvidia*'
sudo apt-get remove --purge '*cuda*'
sudo apt-get remove --purge '*cudnn*'
sudo apt-get remove --purge '*tensorrt*'
sudo apt autoremove --purge && sudo apt autoclean && sudo apt clean
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
- TITAN, GeForce RTX / GTX series and RTX / Quadro series
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/470.129.06/NVIDIA-Linux-x86_64-470.129.06.run
- Data center / Tesla series
wget https://us.download.nvidia.com/tesla/470.129.06/NVIDIA-Linux-x86_64-470.129.06.run
- Run
sudo sh NVIDIA-Linux-x86_64-470.129.06.run --silent --disable-nouveau
- Reboot
sudo reboot
- Install
sudo sh NVIDIA-Linux-x86_64-470.129.06.run --silent --dkms --install-libglvnd
NOTE: If you are using a laptop with NVIDIA Optimus, run
sudo apt-get install nvidia-prime
sudo prime-select nvidia
wget https://developer.download.nvidia.com/compute/cuda/11.4.1/local_installers/cuda_11.4.1_470.57.02_linux.run
sudo sh cuda_11.4.1_470.57.02_linux.run --silent --toolkit
- Export environment variables
nano ~/.bashrc
- Add
export PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
- Run
source ~/.bashrc
6. Download TensorRT from the NVIDIA website and install it
TensorRT 8.0.1 GA for Ubuntu 18.04 and CUDA 11.3 DEB local repo package
sudo dpkg -i nv-tensorrt-repo-ubuntu1804-cuda11.3-trt8.0.1.6-ga-20210626_1-1_amd64.deb
sudo apt-key add /var/nv-tensorrt-repo-ubuntu1804-cuda11.3-trt8.0.1.6-ga-20210626/7fa2af80.pub
sudo apt-get update
sudo apt-get install libnvinfer8=8.0.1-1+cuda11.3 libnvinfer-plugin8=8.0.1-1+cuda11.3 libnvparsers8=8.0.1-1+cuda11.3 libnvonnxparsers8=8.0.1-1+cuda11.3 libnvinfer-bin=8.0.1-1+cuda11.3 libnvinfer-dev=8.0.1-1+cuda11.3 libnvinfer-plugin-dev=8.0.1-1+cuda11.3 libnvparsers-dev=8.0.1-1+cuda11.3 libnvonnxparsers-dev=8.0.1-1+cuda11.3 libnvinfer-samples=8.0.1-1+cuda11.3 libnvinfer-doc=8.0.1-1+cuda11.3
7. Download the DeepStream SDK from the NVIDIA website and install it
- DeepStream 6.0.1 for Servers and Workstations (.deb)
sudo apt-get install ./deepstream-6.0_6.0.1-1_amd64.deb
- DeepStream 6.0 for Servers and Workstations (.deb)
sudo apt-get install ./deepstream-6.0_6.0.0-1_amd64.deb
- Run
rm ${HOME}/.cache/gstreamer-1.0/registry.x86_64.bin
sudo ln -snf /usr/local/cuda-11.4 /usr/local/cuda
sudo reboot
git clone https://github.com/marcoslucianops/DeepStream-Yolo.git
cd DeepStream-Yolo
- DeepStream 6.1 on x86 platform
CUDA_VER=11.6 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.0.1 / 6.0 on x86 platform
CUDA_VER=11.4 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.1 on Jetson platform
CUDA_VER=11.4 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.0.1 / 6.0 on Jetson platform
CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo
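The four cases above differ only in the value of `CUDA_VER`. A minimal sketch that picks the value from the DeepStream version and platform is shown below. The function name `cuda_ver_for` is hypothetical, and the mapping only restates the list above.

```shell
# cuda_ver_for DEEPSTREAM_VERSION PLATFORM
# Prints the CUDA_VER value listed above for a DeepStream version
# (6.1, 6.0.1 or 6.0) and a platform (x86 or jetson).
# Returns 1 and prints a message to stderr for any other combination.
cuda_ver_for() {
    case "$1:$2" in
        6.1:x86)                 echo 11.6 ;;
        6.0.1:x86|6.0:x86)       echo 11.4 ;;
        6.1:jetson)              echo 11.4 ;;
        6.0.1:jetson|6.0:jetson) echo 10.2 ;;
        *) echo "unsupported combination: DeepStream $1 on $2" >&2
           return 1 ;;
    esac
}
```

With this in place, the build could be started with `CUDA_VER=$(cuda_ver_for 6.1 x86) make -C nvdsinfer_custom_impl_Yolo`.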
[property]
...
# 0=RGB, 1=BGR, 2=GRAYSCALE
model-color-format=0
# YOLO cfg
custom-network-config=yolov4.cfg
# YOLO weights
model-file=yolov4.weights
# Generated TensorRT model (will be created if it doesn't exist)
model-engine-file=model_b1_gpu0_fp32.engine
# Model labels file
labelfile-path=labels.txt
# Batch size
batch-size=1
# 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
# Number of classes in label file
num-detected-classes=80
...
[class-attrs-all]
# IOU threshold
nms-iou-threshold=0.45
# Score threshold
pre-cluster-threshold=0.25
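The engine file names in this document (`model_b1_gpu0_fp32.engine`, `model_b1_gpu0_int8.engine`) appear to encode the batch size, GPU id and precision. A minimal sketch, assuming that pattern holds, for keeping `model-engine-file` in sync with `batch-size` and `network-mode`. `engine_file_name` is a hypothetical helper, and the `fp16` suffix is an assumption because this document shows no FP16 example.

```shell
# engine_file_name BATCH_SIZE GPU_ID NETWORK_MODE
# Prints an engine file name following the pattern seen in this document.
# NETWORK_MODE uses the config values: 0=FP32, 1=INT8, 2=FP16.
# Assumption: the "fp16" suffix mirrors the fp32/int8 names shown here.
engine_file_name() {
    case "$3" in
        0) printf 'model_b%s_gpu%s_fp32.engine\n' "$1" "$2" ;;
        1) printf 'model_b%s_gpu%s_int8.engine\n' "$1" "$2" ;;
        2) printf 'model_b%s_gpu%s_fp16.engine\n' "$1" "$2" ;;
        *) echo "unknown network-mode: $3" >&2
           return 1 ;;
    esac
}

engine_file_name 1 0 0   # prints model_b1_gpu0_fp32.engine
```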
deepstream-app -c deepstream_app_config.txt
NOTE: To use YOLOv2 or YOLOv2-Tiny models, edit the deepstream_app_config.txt file before running it.
...
[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV2.txt
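The same change can be scripted. A minimal sketch, assuming the app config is a plain `key=value` file with `[section]` headers as shown above. `set_primary_gie_config` is a hypothetical helper, not part of DeepStream. It only rewrites `config-file` inside `[primary-gie]`, so a `config-file` key in another section is left alone.

```shell
# set_primary_gie_config APP_CONFIG INFER_CONFIG
# Prints APP_CONFIG to stdout with the config-file key of the [primary-gie]
# section set to INFER_CONFIG. All other lines are printed unchanged.
set_primary_gie_config() {
    awk -v new="$2" '
        /^\[/ { in_gie = ($0 == "[primary-gie]") }   # track the current section
        in_gie && /^config-file=/ { print "config-file=" new; next }
        { print }
    ' "$1"
}
```

For example, `set_primary_gie_config deepstream_app_config.txt config_infer_primary_yoloV2.txt > app_config_yoloV2.txt` writes the edited copy to a new file (the output file name is only an example), which can then be passed to `deepstream-app -c`.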
NOTE: Make sure to change the YOLOv5 repo version to your model version before conversion.
1. Copy gen_wts_yoloV5.py from DeepStream-Yolo/utils to ultralytics/yolov5 folder
3. Download pt file from ultralytics/yolov5 website (example for YOLOv5n 6.1)
wget https://github.com/ultralytics/yolov5/releases/download/v6.1/yolov5n.pt
python3 gen_wts_yoloV5.py -w yolov5n.pt
- DeepStream 6.1 on x86 platform
CUDA_VER=11.6 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.0.1 / 6.0 on x86 platform
CUDA_VER=11.4 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.1 on Jetson platform
CUDA_VER=11.4 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.0.1 / 6.0 on Jetson platform
CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo
[property]
...
# 0=RGB, 1=BGR, 2=GRAYSCALE
model-color-format=0
# CFG
custom-network-config=yolov5n.cfg
# WTS
model-file=yolov5n.wts
# Generated TensorRT model (will be created if it doesn't exist)
model-engine-file=model_b1_gpu0_fp32.engine
# Model labels file
labelfile-path=labels.txt
# Batch size
batch-size=1
# 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
# Number of classes in label file
num-detected-classes=80
...
[class-attrs-all]
# IOU threshold
nms-iou-threshold=0.45
# Score threshold
pre-cluster-threshold=0.25
...
[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV5.txt
deepstream-app -c deepstream_app_config.txt
NOTE: For YOLOv5 P6 or custom models, check the gen_wts_yoloV5.py args and use them according to your model
- Input weights (.pt) file path (required)
-w or --weights
- Input cfg (.yaml) file path
-c or --yaml
- Model width (default = 640 / 1280 [P6])
-mw or --width
- Model height (default = 640 / 1280 [P6])
-mh or --height
- Model channels (default = 3)
-mc or --channels
- P6 model
--p6
1. Copy gen_wts_yolor.py from DeepStream-Yolo/utils to yolor folder
3. Download pt file from yolor website
python3 gen_wts_yolor.py -w yolor_csp.pt -c cfg/yolor_csp.cfg
- DeepStream 6.1 on x86 platform
CUDA_VER=11.6 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.0.1 / 6.0 on x86 platform
CUDA_VER=11.4 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.1 on Jetson platform
CUDA_VER=11.4 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.0.1 / 6.0 on Jetson platform
CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo
[property]
...
# 0=RGB, 1=BGR, 2=GRAYSCALE
model-color-format=0
# CFG
custom-network-config=yolor_csp.cfg
# WTS
model-file=yolor_csp.wts
# Generated TensorRT model (will be created if it doesn't exist)
model-engine-file=model_b1_gpu0_fp32.engine
# Model labels file
labelfile-path=labels.txt
# Batch size
batch-size=1
# 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
# Number of classes in label file
num-detected-classes=80
...
[class-attrs-all]
# IOU threshold
nms-iou-threshold=0.5
# Score threshold
pre-cluster-threshold=0.25
...
[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yolor.txt
deepstream-app -c deepstream_app_config.txt
sudo apt-get install libopencv-dev
- DeepStream 6.1 on x86 platform
cd DeepStream-Yolo
CUDA_VER=11.6 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.0.1 / 6.0 on x86 platform
cd DeepStream-Yolo
CUDA_VER=11.4 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.1 on Jetson platform
cd DeepStream-Yolo
CUDA_VER=11.4 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo
- DeepStream 6.0.1 / 6.0 on Jetson platform
cd DeepStream-Yolo
CUDA_VER=10.2 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo
3. For the COCO dataset, download val2017, extract it, and move it to the DeepStream-Yolo folder
mkdir calibration
for jpg in $(ls -1 val2017/*.jpg | sort -R | head -1000); do \
cp ${jpg} calibration/; \
done
realpath calibration/*jpg > calibration.txt
export INT8_CALIB_IMG_PATH=calibration.txt
export INT8_CALIB_BATCH_SIZE=1
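The selection and listing steps above can be combined into one function. A minimal sketch, assuming GNU `find`, `shuf` and `realpath` are available (they are on a standard Ubuntu install). `make_calibration_set` is a hypothetical name. Unlike the `ls`-based loop, it copes with spaces in file names.

```shell
# make_calibration_set SRC_DIR DEST_DIR COUNT LIST_FILE
# Copies up to COUNT randomly chosen .jpg files from SRC_DIR into DEST_DIR,
# then writes the absolute path of every .jpg in DEST_DIR to LIST_FILE.
make_calibration_set() (
    src=$1 dest=$2 count=$3 list=$4
    mkdir -p "$dest" || exit 1
    # find + shuf avoids parsing ls output; names must not contain newlines.
    find "$src" -maxdepth 1 -type f -name '*.jpg' | shuf -n "$count" |
    while IFS= read -r jpg; do
        cp "$jpg" "$dest/"
    done
    # One absolute path per line, as expected in INT8_CALIB_IMG_PATH.
    find "$dest" -maxdepth 1 -type f -name '*.jpg' -exec realpath {} + > "$list"
)
```

Calling `make_calibration_set val2017 calibration 1000 calibration.txt` reproduces the setup used in this example. The two `export` lines above are still required afterwards.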
- From
...
model-engine-file=model_b1_gpu0_fp32.engine
#int8-calib-file=calib.table
...
network-mode=0
...
- To
...
model-engine-file=model_b1_gpu0_int8.engine
int8-calib-file=calib.table
...
network-mode=1
...
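The three edits can also be applied with `sed`. A minimal sketch, assuming the config_infer file contains the lines exactly as shown (a commented `#int8-calib-file=` line and an engine name ending in `_fp32.engine`). `enable_int8` is a hypothetical helper, not part of DeepStream.

```shell
# enable_int8 INFER_CONFIG
# Prints INFER_CONFIG to stdout with the INT8-related edits applied:
#   - model-engine-file: the "_fp32.engine" suffix becomes "_int8.engine"
#   - int8-calib-file:   the leading "#" is removed
#   - network-mode:      set to 1 (INT8)
# The input file itself is not modified.
enable_int8() {
    sed -e '/^model-engine-file=/s/_fp32\.engine$/_int8.engine/' \
        -e 's/^#int8-calib-file=/int8-calib-file=/' \
        -e 's/^network-mode=.*/network-mode=1/' \
        "$1"
}
```

Redirect the output to a new file and compare it with the original before pointing `config-file` at it.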
deepstream-app -c deepstream_app_config.txt
NOTE: NVIDIA recommends at least 500 images for good accuracy. In this example I used 1000 images to get better accuracy (more images = more accuracy). Higher INT8_CALIB_BATCH_SIZE values will increase the accuracy and calibration speed. Set it according to your GPU memory. This process can take a long time.
You can get metadata from DeepStream in Python and C/C++. For C/C++, you need to edit the deepstream-app or deepstream-test code. For Python, you need to install and edit deepstream_python_apps.
Basically, you need to work with NvDsObjectMeta and NvDsFrameMeta (both available in Python and C/C++) to get the label, position, etc. of the bboxes.
My projects: https://www.youtube.com/MarcosLucianoTV