TensorRT-OSS 8.2 GA release #1638

rajeevsrao · 2021-11-24T16:21:22Z

TensorRT OSS release corresponding to TensorRT 8.2.1.8 GA release.

Updates since TensorRT 8.2.0 EA release.
Please refer to the TensorRT 8.2.1 GA release notes for more information.
ONNX parser v8.2.1
- Removed duplicate constant layer checks that caused some performance regressions
- Fixed expand dynamic shape calculations
- Added parser-side checks for Scatter layer support
Sample updates
- Added TensorFlow Object Detection API converter samples, including Single Shot Detector, Faster R-CNN and Mask R-CNN models
- Multiple enhancements in HuggingFace transformer demos
  - Added multi-batch support
  - Fixed resultant performance regression in batchsize=1
  - Fixed T5 large/T5-3B accuracy issues
  - Added notebooks for T5 and GPT-2
  - Added CPU benchmarking option
- Deprecated kSTRICT_TYPES (strict type constraints). Equivalent behaviour now achieved by setting PREFER_PRECISION_CONSTRAINTS, DIRECT_IO, and REJECT_EMPTY_ALGORITHMS
- Removed sampleMovieLens
- Renamed sampleReformatFreeIO to sampleIOFormats
- Add idleTime option for samples to control qps
- Specify default value for precisionConstraints
- Fixed reporting of TensorRT build version in trtexec
- Fixed combineDescriptions typo in trtexec/tracer.py
- Fixed usages of kDIRECT_IO
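
For readers migrating away from kSTRICT_TYPES, the replacement described in the list above can be sketched roughly as follows in Python. This is a minimal illustration, not code from this PR: the helper name `apply_strict_type_equivalent` is hypothetical, and it assumes the TensorRT 8.2 Python bindings expose the three flags as `trt.BuilderFlag.PREFER_PRECISION_CONSTRAINTS`, `trt.BuilderFlag.DIRECT_IO` and `trt.BuilderFlag.REJECT_EMPTY_ALGORITHMS`. The `trt` module is passed in as an argument so the helper itself does not need TensorRT at import time.

```python
# Names of the three builder flags that the release notes list as the
# replacement for the deprecated strict-types flag.
REPLACEMENT_FLAGS = (
    "PREFER_PRECISION_CONSTRAINTS",
    "DIRECT_IO",
    "REJECT_EMPTY_ALGORITHMS",
)


def apply_strict_type_equivalent(config, trt):
    """Set the three replacement flags on a builder config.

    `config` is expected to behave like trt.IBuilderConfig (it needs a
    set_flag method) and `trt` like the tensorrt module (it needs a
    BuilderFlag attribute). Both are assumptions of this sketch.
    """
    for name in REPLACEMENT_FLAGS:
        config.set_flag(getattr(trt.BuilderFlag, name))
    return config


# Hypothetical usage with a real TensorRT installation:
#   import tensorrt as trt
#   builder = trt.Builder(trt.Logger(trt.Logger.WARNING))
#   config = builder.create_builder_config()
#   apply_strict_type_equivalent(config, trt)
```

Setting the flags one by one keeps the intent visible in the build script, and any one of them can be dropped if only part of the old strict-types behaviour is wanted.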
Plugin updates
- EfficientNMS plugin support extended to TF-TRT and to clang builds
- Sanitize header definitions for BERT fused MHA plugin
- Separate C++ and cu files in splitPlugin to avoid PTX generation (required for CUDA enhanced compatibility support)
- Enable C++14 build for plugins
ONNX tooling updates
- onnx-graphsurgeon upgraded to v0.3.14
- Polygraphy upgraded to v0.33.2
- pytorch-quantization toolkit upgraded to v2.1.2
Build and container fixes
- Add SM86 target to default GPU_ARCHS for platforms with cuda-11.1+
- Remove deprecated SM_35 and add SM_60 to default GPU_ARCHS
- Skip CUB builds for cuda 11.0+ #1455
- Fixed cuda-10.2 container build failures in Ubuntu 20.04
- Add native ARM server build container
- Install devtoolset-8 for updated g++ version in CentOS7
- Added a note on supporting c++14 builds for CentOS7
- Fixed docker build for large UIDs #1373
- Updated README instructions for Jetpack builds
Demo enhancements
- Updated Tacotron2 instructions and added CPU benchmarking
- Fixed issues in demoBERT python notebook
Documentation updates
- Updated Python documentation for add_reduce, add_top_k, and ISoftMaxLayer
- Renamed default GitHub branch to main and updated hyperlinks

Signed-off-by: Rajeev Rao <rajeevrao@nvidia.com>

lgtm-com · 2021-11-24T16:55:13Z

This pull request introduces 21 alerts and fixes 3 when merging 39f764f into 9ec6eb6.

new alerts:

9 for Unused import
6 for Wrong number of arguments in a class instantiation
2 for Wrong name for an argument in a call
2 for Unused local variable
1 for Module is imported with 'import' and 'import from'
1 for Wrong number of arguments in a call

fixed alerts:

3 for Unused import

TensorRT-OSS 8.2 GA release

39f764f

Signed-off-by: Rajeev Rao <rajeevrao@nvidia.com>

rajeevsrao requested review from kevinch-nv and pranavm-nvidia November 24, 2021 16:21

pranavm-nvidia approved these changes Nov 24, 2021

kevinch-nv approved these changes Nov 24, 2021

rajeevsrao merged commit 6f38570 into NVIDIA:main Nov 24, 2021
