CUDA and TensorRT Starter Workspace

This repository guides beginners with no background in parallel programming in C++ through learning CUDA and TensorRT from scratch.

This repository is still a work in progress (~24/02/21). I will add more samples and more detailed descriptions in the future. Please feel free to contribute to this repository.

How to install

First, clone the repository:

git clone git@github.com:kalfazed/tensorrt_starter.git

After cloning the repository, modify the OpenCV, CUDA, cuDNN, and TensorRT versions and install directories in config/Makefile.config, located in the root directory of the repository. The recommended versions for this repository are opencv==4.x, cuda==11.6, cudnn==8.9, TensorRT==8.6.1.6.

# Please change the cuda version if needed
# In default, cuDNN library is located in /usr/local/cuda/lib64
CXX                         :=  g++
CUDA_VER                    :=  11

# Please modify the opencv and tensorrt install directory
OPENCV_INSTALL_DIR          :=  /usr/local/include/opencv4
TENSORRT_INSTALL_DIR        :=  /home/kalfazed/packages/TensorRT-8.6.1.6
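Before running make, it can help to confirm that the install directories in the config actually exist. The following is a minimal sketch, not part of this repository: it assumes the config only uses simple `NAME := value` lines like the fragment above, and the helper names `parse_makefile_config` and `missing_install_dirs` are hypothetical.

```python
import re
from pathlib import Path

# Sample text copied from the config fragment above; in practice you would
# read config/Makefile.config instead (for example with Path.read_text()).
SAMPLE_CONFIG = """\
CXX                         :=  g++
CUDA_VER                    :=  11

OPENCV_INSTALL_DIR          :=  /usr/local/include/opencv4
TENSORRT_INSTALL_DIR        :=  /home/kalfazed/packages/TensorRT-8.6.1.6
"""

# Matches simple `NAME := value` assignments (assumption: no inline comments).
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*:=\s*(.*?)\s*$")


def parse_makefile_config(text):
    """Return {NAME: value} for simple `NAME := value` lines, skipping comments."""
    variables = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = ASSIGNMENT.match(line)
        if match:
            variables[match.group(1)] = match.group(2)
    return variables


def missing_install_dirs(variables):
    """List the *_INSTALL_DIR variables whose directory does not exist."""
    return [
        name
        for name, value in variables.items()
        if name.endswith("_INSTALL_DIR") and not Path(value).is_dir()
    ]


if __name__ == "__main__":
    config = parse_makefile_config(SAMPLE_CONFIG)
    for name in missing_install_dirs(config):
        print(f"{name} points to a missing directory: {config[name]}")
```

If the script prints nothing, every `*_INSTALL_DIR` entry points to an existing directory on your machine.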

In addition, change ARCH in config/Makefile.config. This parameter is passed to nvcc, the compiler for CUDA programs.
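This README does not show the exact form ARCH takes in config/Makefile.config. Assuming it follows nvcc's usual `sm_XY` naming (as in `-arch=sm_86`), the value can be derived from your GPU's compute capability, which NVIDIA publishes per GPU model. The helper below is a hypothetical illustration of that mapping, not code from this repository.

```python
def compute_capability_to_arch(capability):
    """Convert a compute capability such as "8.6" into nvcc's "sm_86" form.

    Assumption: ARCH in Makefile.config expects the sm_XY style; check the
    file itself before relying on this.
    """
    major, _, minor = capability.strip().partition(".")
    if not (major.isdigit() and minor.isdigit()):
        raise ValueError(f"expected MAJOR.MINOR, got {capability!r}")
    return f"sm_{int(major)}{int(minor)}"


if __name__ == "__main__":
    # Example: a GPU with compute capability 8.6.
    print(compute_capability_to_arch("8.6"))
```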

How to run

Inside each subfolder of each chapter, the basic directory structure is as follows (some chapters differ):

|-config
    |- Makefile.config
|-src
    |- cpp
        |- xxx.c
    |- python
        |- yyy.py
|-Makefile

Run make first; it generates a binary named trt-cuda or trt-infer, depending on the chapter. Then run the binary directly or use the make run command.
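Because the binary name differs by chapter, a small script can locate whichever one make produced. This is a sketch under assumptions: the README does not say where the binary is written, so the hypothetical helper `find_built_binary` searches the whole chapter subfolder for either name.

```python
from pathlib import Path

# Binary names given in this README; the output location is an assumption,
# so the search covers the whole chapter subfolder.
BINARY_NAMES = ("trt-cuda", "trt-infer")


def find_built_binary(chapter_dir):
    """Return the first trt-cuda / trt-infer file found under chapter_dir, or None."""
    root = Path(chapter_dir)
    for name in BINARY_NAMES:
        for candidate in sorted(root.rglob(name)):
            if candidate.is_file():
                return candidate
    return None


if __name__ == "__main__":
    binary = find_built_binary(".")
    if binary is None:
        print("No binary found; run make first.")
    else:
        print(f"Built binary: {binary}")
```

Run it from a chapter subfolder after make; the printed path is the file you can execute directly instead of using make run.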

Chapter description

chapter1-build-environment

chapter2-cuda-programming

chapter3-tensorrt-basics-and-onnx

chapter4-tensorrt-optimiztion

chapter5-tensorrt-api-basics

chapter6-deploy-classification-and-inference-design

chapter7-deploy-yolo-detection