For a newer UNet implementation targeting TensorRT 8.x, see https://github.com/wang-xinyu/tensorrtx/tree/master/unet. That repo is updated more often and is friendlier with dependencies.
Original image (left) and segmentation result (right).

This is a TensorRT implementation of UNet, inspired by tensorrtx and pytorch-unet.
You can generate a TensorRT engine file with this script, and customize the parameters and network structure to match the network you trained (FP32/FP16 precision, input size, convolution variants, activation function, ...).
TensorRT 7.0 (TensorRT must be installed first)
CUDA 10.2
Python 3.7
OpenCV 4.4
CMake 3.18
pip install -r requirements.txt
Train on your dataset by following pytorch-unet to produce a .pth file.
Run gen_wts from the utils folder, then move the generated weights file to the project folder (gen_wts must be run in your training environment).
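The weight-export step can be sketched as follows. This is a minimal illustration, not the repo's gen_wts script: it assumes the plain-text .wts layout used by tensorrtx (a first line with the number of tensors, then one line per tensor holding its name, element count, and each value as big-endian float32 hex). The function names `format_wts` and `parse_wts` are hypothetical, and the weights are passed as plain Python lists so the sketch runs without PyTorch.

```python
import struct


def format_wts(weights):
    """Serialize {name: flat list of floats} into tensorrtx-style .wts text.

    Assumed layout: first line is the tensor count; each following line is
    "<name> <count> <hex> <hex> ...", with every value packed as a
    big-endian float32 and written as 8 hex characters.
    """
    lines = [str(len(weights))]
    for name, values in weights.items():
        hex_values = [struct.pack(">f", float(v)).hex() for v in values]
        lines.append(" ".join([name, str(len(values))] + hex_values))
    return "\n".join(lines) + "\n"


def parse_wts(text):
    """Inverse of format_wts, useful for checking an exported file."""
    lines = text.splitlines()
    count = int(lines[0])
    weights = {}
    for line in lines[1:count + 1]:
        parts = line.split()
        name, n = parts[0], int(parts[1])
        values = [struct.unpack(">f", bytes.fromhex(h))[0] for h in parts[2:]]
        if len(values) != n:
            raise ValueError(f"{name}: expected {n} values, got {len(values)}")
        weights[name] = values
    return weights


if __name__ == "__main__":
    # Dummy weights standing in for a flattened model state dict.
    demo = {"inc.weight": [1.0, -0.5], "inc.bias": [0.0]}
    print(format_wts(demo), end="")
```

With a trained model, the dictionary would typically be built from the checkpoint's state dict by flattening each tensor to a list of floats. The key names depend on how the checkpoint was saved.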
mkdir build
cd build
cmake ..
make
unet -s
Once the unet executable has been generated, use unet -d to run inference on the files in a folder:
unet -d ../samples
The TensorRT engine is considerably faster than PyTorch (tested on a 2080 Ti):
PyTorch | TensorRT FP32 | TensorRT FP16 |
---|---|---|
816x672 | 816x672 | 816x672 |
58 ms | 43 ms (batch size 8) | 14 ms (batch size 8) |
- add INT8 calibrator
- add custom plugin
- etc.