TensorRT Backend For ONNX

Parses ONNX models for execution with TensorRT.

Supported TensorRT Versions

Development on this branch is for the latest version of TensorRT 10.14 with full-dimensions and dynamic shape support.

For previous versions of TensorRT, refer to their respective branches.

Supported Operators

Currently supported ONNX operators are listed in the operator support matrix.

Installation

Dependencies

Building

For building within Docker or on Windows, we recommend using the build instructions in the main TensorRT repository to build the onnx-tensorrt library.

Once you have cloned the repository, you can build the parser libraries and executables by running:

cd onnx-tensorrt
mkdir build && cd build
cmake .. -DTENSORRT_ROOT=<path_to_trt> && make -j
# Ensure that you update your LD_LIBRARY_PATH to pick up the location of the newly built library:
export LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH

Note that this project has a dependency on CUDA. By default the build will look in /usr/local/cuda for the CUDA toolkit installation. If your CUDA path is different, overwrite the default path by providing -DCUDA_TOOLKIT_ROOT_DIR=<path_to_cuda_install> in the CMake command.

To build with protobuf-lite support, add -DUSE_ONNX_LITE_PROTO=1 to the end of the cmake command.
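The CMake options described above can be combined into a single configure step. The following is a minimal sketch that assembles that command line in Python, assuming it is run from the build directory. The helper name `cmake_configure_args` and the example paths are hypothetical; only the three `-D` options come from this README.

```python
import shlex


def cmake_configure_args(tensorrt_root, cuda_root=None, lite_proto=False):
    """Return the cmake argument list for configuring onnx-tensorrt.

    Hypothetical helper: only the -D options named in this README are used.
    """
    args = ["cmake", "..", f"-DTENSORRT_ROOT={tensorrt_root}"]
    if cuda_root is not None:
        # Override the default CUDA lookup location (/usr/local/cuda).
        args.append(f"-DCUDA_TOOLKIT_ROOT_DIR={cuda_root}")
    if lite_proto:
        # Build with protobuf-lite support.
        args.append("-DUSE_ONNX_LITE_PROTO=1")
    return args


if __name__ == "__main__":
    # Placeholder paths; substitute your own install locations.
    args = cmake_configure_args("/opt/tensorrt", cuda_root="/opt/cuda", lite_proto=True)
    print(shlex.join(args))
```

The printed line can be pasted in place of the plain cmake invocation shown earlier, followed by make -j as before.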

InstanceNormalization Performance

There are two implementations of InstanceNormalization that may perform differently depending on various parameters. By default, the parser uses the native TensorRT implementation of InstanceNorm. Users who want to benchmark the plugin implementation of InstanceNorm can unset the parser flag kNATIVE_INSTANCENORM before parsing the model. Note that the plugin implementation cannot be used to build version-compatible or hardware-compatible engines; attempting to do so results in an error.

C++ Example:

// Unset the kNATIVE_INSTANCENORM flag to use the plugin implementation.
parser->unsetFlag(nvonnxparser::OnnxParserFlag::kNATIVE_INSTANCENORM);

Python Example:

# Unset the NATIVE_INSTANCENORM flag to use the plugin implementation.
parser.clear_flag(trt.OnnxParserFlag.NATIVE_INSTANCENORM)

Executable Usage

There are currently two officially supported tools for quickly checking whether an ONNX model can be parsed and built into a TensorRT engine.

For C++ users, there is the trtexec binary, typically found in the <tensorrt_root_dir>/bin directory. The basic command for running an ONNX model is:

trtexec --onnx=model.onnx

Refer to the trtexec documentation or run trtexec -h for more information on CLI options.

For Python users, there is the polygraphy tool. The basic command for running an ONNX model is:

polygraphy run model.onnx --trt

Refer to the Polygraphy documentation or run polygraphy run -h for more information on CLI options.
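Either check can also be scripted. The sketch below builds the two basic command lines from this section and runs whichever tool is found on PATH. The function names are hypothetical, and it assumes that a zero exit status means the model parsed and built successfully.

```python
import shutil
import subprocess


def check_command(tool, model_path):
    """Return the basic command line for `tool`, as shown in this README."""
    if tool == "trtexec":
        return ["trtexec", f"--onnx={model_path}"]
    if tool == "polygraphy":
        return ["polygraphy", "run", model_path, "--trt"]
    raise ValueError(f"unsupported tool: {tool}")


def check_model(model_path, tools=("trtexec", "polygraphy")):
    """Run the first tool found on PATH.

    Returns the tool's exit code, or None if none of the tools is installed.
    """
    for tool in tools:
        if shutil.which(tool) is not None:
            return subprocess.run(check_command(tool, model_path)).returncode
    return None


if __name__ == "__main__":
    code = check_model("model.onnx")
    print("no tool found on PATH" if code is None else f"exit code: {code}")
```

Keeping command construction separate from execution makes it easy to append further CLI options from trtexec -h or polygraphy run -h before running.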

Python Modules

Python bindings for the ONNX-TensorRT parser are packaged in the shipped .whl files.
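As an illustration of those bindings, the following sketch parses an ONNX file into a TensorRT network and collects any parser errors. It assumes the tensorrt Python package from the shipped wheels is installed. The function name `parse_onnx` and the shape of the returned dictionary are hypothetical, not part of the library.

```python
import os


def parse_onnx(model_path, use_plugin_instancenorm=False):
    """Parse an ONNX file with the TensorRT ONNX parser and summarize the result."""
    if not os.path.isfile(model_path):
        raise FileNotFoundError(model_path)

    # Imported lazily so the path check above works without TensorRT installed.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)
    parser = trt.OnnxParser(network, logger)

    if use_plugin_instancenorm:
        # Same flag as in the InstanceNormalization section of this README.
        parser.clear_flag(trt.OnnxParserFlag.NATIVE_INSTANCENORM)

    with open(model_path, "rb") as f:
        ok = parser.parse(f.read())

    # Collect error descriptions while the parser object is still alive.
    errors = [] if ok else [parser.get_error(i).desc() for i in range(parser.num_errors)]
    inputs = [network.get_input(i).name for i in range(network.num_inputs)]
    return {"ok": ok, "errors": errors, "inputs": inputs, "num_layers": network.num_layers}


if __name__ == "__main__":
    try:
        print(parse_onnx("/path/to/model.onnx"))
    except FileNotFoundError as exc:
        print(f"model not found: {exc}")
```

The summary is computed inside the function because the builder, network, and parser should stay alive together; code that goes on to build an engine would keep all three in scope rather than returning the network alone.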

TensorRT 10.14 supports ONNX release 1.18.0. Install it with:

python3 -m pip install onnx==1.18.0

The ONNX-TensorRT backend can be installed by running:

python3 setup.py install

ONNX-TensorRT Python Backend Usage

The TensorRT backend for ONNX can be used in Python as follows:

import onnx
import onnx_tensorrt.backend as backend
import numpy as np

model = onnx.load("/path/to/model.onnx")
engine = backend.prepare(model, device='CUDA:1')
input_data = np.random.random(size=(32, 3, 224, 224)).astype(np.float32)
output_data = engine.run(input_data)[0]
print(output_data)
print(output_data.shape)

C++ Library Usage

The model parser library, libnvonnxparser.so, has its C++ API declared in this header:

NvOnnxParser.h

Tests

After installation (or inside the Docker container), ONNX backend tests can be run as follows:

Real model tests only:

python onnx_backend_test.py OnnxBackendRealModelTest

All tests:

python onnx_backend_test.py

You can use the -v flag to make the output more verbose.

Pre-trained Models

Pre-trained models in ONNX format can be found at the ONNX Model Zoo.
