TensorRT-OSS 8.2 GA release #1638

Merged 1 commit on Nov 24, 2021

rajeevsrao (Collaborator)

TensorRT OSS release corresponding to TensorRT 8.2.1.8 GA release.

  • Updates since TensorRT 8.2.0 EA release.

  • Please refer to the TensorRT 8.2.1 GA release notes for more information.

  • ONNX parser v8.2.1

    • Removed duplicate constant layer checks that caused some performance regressions
    • Fixed expand dynamic shape calculations
    • Added parser-side checks for Scatter layer support
  • Sample updates

    • Added TensorFlow Object Detection API converter samples, including Single Shot Detector, Faster R-CNN and Mask R-CNN models
    • Multiple enhancements in HuggingFace transformer demos
      • Added multi-batch support
      • Fixed the resulting performance regression at batch size 1
      • Fixed T5 large/T5-3B accuracy issues
      • Added notebooks for T5 and GPT-2
      • Added CPU benchmarking option
    • Deprecated kSTRICT_TYPES (strict type constraints). Equivalent behaviour is now achieved by setting PREFER_PRECISION_CONSTRAINTS, DIRECT_IO, and REJECT_EMPTY_ALGORITHMS
    • Removed sampleMovieLens
    • Renamed sampleReformatFreeIO to sampleIOFormats
    • Added idleTime option for samples to control QPS
    • Specified default value for precisionConstraints
    • Fixed reporting of TensorRT build version in trtexec
    • Fixed combineDescriptions typo in trtexec/tracer.py
    • Fixed usages of kDIRECT_IO
  • Plugin updates

    • Extended EfficientNMS plugin support to TF-TRT and to clang builds
    • Sanitize header definitions for BERT fused MHA plugin
    • Separate C++ and cu files in splitPlugin to avoid PTX generation (required for CUDA enhanced compatibility support)
    • Enable C++14 build for plugins
  • ONNX tooling updates

  • Build and container fixes

    • Add SM86 target to default GPU_ARCHS for platforms with cuda-11.1+
    • Remove deprecated SM_35 and add SM_60 to default GPU_ARCHS
    • Skip CUB builds for cuda 11.0+ #1455
    • Fixed cuda-10.2 container build failures in Ubuntu 20.04
    • Add native ARM server build container
    • Install devtoolset-8 for updated g++ version in CentOS7
    • Added a note on supporting C++14 builds for CentOS7
    • Fixed docker build for large UIDs #1373
    • Updated README instructions for Jetpack builds
  • Demo enhancements

    • Updated Tacotron2 instructions and added CPU benchmarking
    • Fixed issues in demoBERT Python notebook
  • Documentation updates

    • Updated Python documentation for add_reduce, add_top_k, and ISoftMaxLayer
    • Renamed default GitHub branch to main and updated hyperlinks
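
The kSTRICT_TYPES deprecation listed under sample updates can be illustrated with a minimal Python sketch. It assumes the TensorRT 8.2 Python bindings, where `trt.BuilderFlag` exposes the three replacement flags and the builder config provides `set_flag`. The helper `set_strict_types_equivalent` and the constant `STRICT_TYPES_REPLACEMENT_FLAGS` are hypothetical names used for illustration only; they are not part of this PR.

```python
# Flag names taken from the release note above; the tuple itself is illustrative.
STRICT_TYPES_REPLACEMENT_FLAGS = (
    "PREFER_PRECISION_CONSTRAINTS",
    "DIRECT_IO",
    "REJECT_EMPTY_ALGORITHMS",
)


def set_strict_types_equivalent(config, builder_flag_enum):
    """Set the three flags named as the replacement for kSTRICT_TYPES.

    `config` is assumed to behave like a TensorRT builder config (it must
    offer set_flag), and `builder_flag_enum` like trt.BuilderFlag. Both are
    passed in so the helper does not import tensorrt itself.
    """
    applied = []
    for name in STRICT_TYPES_REPLACEMENT_FLAGS:
        # Look the flag up by name so a missing flag fails with AttributeError.
        config.set_flag(getattr(builder_flag_enum, name))
        applied.append(name)
    return applied
```

With TensorRT installed, a call would plausibly look like `set_strict_types_equivalent(builder.create_builder_config(), trt.BuilderFlag)`, where `builder` is an existing `trt.Builder`. On a TensorRT version that lacks one of the flags, the lookup raises `AttributeError` rather than silently skipping it.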

Signed-off-by: Rajeev Rao <rajeevrao@nvidia.com>


lgtm-com bot commented Nov 24, 2021

This pull request introduces 21 alerts and fixes 3 when merging 39f764f into 9ec6eb6 - view on LGTM.com

new alerts:

  • 9 for Unused import
  • 6 for Wrong number of arguments in a class instantiation
  • 2 for Wrong name for an argument in a call
  • 2 for Unused local variable
  • 1 for Module is imported with 'import' and 'import from'
  • 1 for Wrong number of arguments in a call

fixed alerts:

  • 3 for Unused import

rajeevsrao merged commit 6f38570 into NVIDIA:main on Nov 24, 2021