ONNX Runtime v1.7.0
Announcements
Starting with this release, all ONNX Runtime CPU packages are built without OpenMP. A version with OpenMP is available on NuGet (Microsoft.ML.OnnxRuntime.OpenMP) and PyPI (onnxruntime-openmp). Please report any issues in GitHub Issues.
Note: The 1.7.0 GPU package is uploaded on this Azure DevOps Feed due to the size limit on NuGet.org. Please use 1.7.1 for the GPU package through NuGet.
Key Feature Updates
General
- Mobile
- Custom operators now supported in the ONNX Runtime Mobile build
- Added ability to reduce types supported by operator kernels to only the types required by the models
- Expect a 25-33% reduction in binary size contribution from the kernel implementations. Reduction is model dependent, but testing with common models like Mobilenet v2, SSD Mobilenet and Mobilebert achieved reductions in this range.
- Custom op support for dynamic input
- MKLML/openblas/jemalloc build configs removed
- Removed dependency on gemmlowp
- [Experimental] Audio Operators
- Fourier Transforms (DFT, IDFT, STFT), Windowing Functions (Hann, Hamming, Blackman), and a MelWeightMatrix operator in the "com.microsoft.experimental" domain
- Buildable using ms_experimental build flag (included in Microsoft.AI.MachineLearning NuGet package)
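To make the audio operator names concrete, here is a minimal pure-Python sketch of what a Hann window and a DFT compute. It is not the implementation of the experimental operators: the helper names `hann_window` and `dft`, the periodic window convention, and the 16-sample test tone are all assumptions chosen for illustration.

```python
import cmath
import math

def hann_window(size, periodic=True):
    """Return Hann window coefficients (illustrative, not the ORT kernel)."""
    # Periodic windows divide by N, symmetric ones by N - 1 (assumed convention).
    denom = size if periodic else max(size - 1, 1)
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / denom) for n in range(size)]

def dft(signal):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

# Window one frame of a pure tone and find the frequency bin with the most energy.
frame_len = 16
tone = [math.sin(2.0 * math.pi * 2 * t / frame_len) for t in range(frame_len)]  # 2 cycles per frame
window = hann_window(frame_len)
spectrum = dft([w * x for w, x in zip(window, tone)])
peak_bin = max(range(frame_len // 2), key=lambda k: abs(spectrum[k]))
```

An STFT repeats this window-then-transform step over overlapping frames of a longer signal.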
Performance
- Quantization
- Quantization tool now supports quantization of models in QDQ (QuantizeLinear-DequantizeLinear) format
- Depthwise Conv quantization performance improvement
- Quantization support added for Pad, Split and MaxPool for channels-last layout
- QuantizeLinear performance improvement on AVX512
- Optimization: Fusion for Conv + Mul/Add
- Transformers
- Longformer Attention CUDA kernel memory footprint reduction
- Einsum Float16 CUDA kernel for ALBERT and XLNet
- Python optimizer tool now supports fusion for BART
- CPU profiling tool for transformers models
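For readers new to the QDQ (QuantizeLinear-DequantizeLinear) format mentioned under Quantization, the arithmetic behind the two operators can be sketched in plain Python. This is an illustration of the math only, not the quantization tool's code: the helper names, the uint8 range, and the scale and zero point values are assumptions.

```python
def quantize_linear(values, scale, zero_point, qmin=0, qmax=255):
    """Map floats to integers: saturate(round(x / scale) + zero_point)."""
    # Python's round() rounds halves to even, as the ONNX operator spec does.
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]

def dequantize_linear(qvalues, scale, zero_point):
    """Map integers back to floats: (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in qvalues]

# Hypothetical parameters for a tensor whose values lie roughly in [-1, 1].
scale = 2.0 / 255.0
zero_point = 128

activations = [-1.0, -0.25, 0.0, 0.5, 1.0]
q = quantize_linear(activations, scale, zero_point)
roundtrip = dequantize_linear(q, scale, zero_point)

# The round trip loses at most about half a quantization step per value.
max_error = max(abs(a - r) for a, r in zip(activations, roundtrip))
```

In a QDQ model, such quantize/dequantize pairs appear as explicit nodes around the operators to be quantized, so the graph carries the scale and zero point alongside the original float operators.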
APIs and Packages
- Python 3.8 and 3.9 support added for all platforms; support for 3.5 removed
- ARM32/64 Windows builds are now included in the CPU NuGet and zip packages
- WinML
- .NET5 support - will work with .NET5 Standard 2.0 Projections
- Image descriptors expose NominalPixelRange properties
- Native support added for additional pixel ranges [0..1] and [-1..1] in image models.
- A new property on the ImageFeatureDescriptor runtimeclass exposes the ImageNominalPixelRange. Other similar properties exposed are the image's BitmapPixelFormat and BitmapAlphaMode.
- Bug fixes and performance improvements, including #6249
- [Experimental] Model Building API available under the Microsoft.AI.MachineLearning.Experimental namespace. (included in Microsoft.AI.MachineLearning NuGet package)
- Can be used to create dynamic models on the fly to enable engine-optimized and hardware-accelerated dynamic tensor featurization (code sample)
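As a rough illustration of the additional pixel ranges listed under WinML, the sketch below rescales 8-bit pixel values into [0..1] and [-1..1]. It assumes a simple linear mapping; the function name and the string labels for the ranges are hypothetical and are not WinML API names.

```python
def to_nominal_range(pixels, nominal_range):
    """Rescale 8-bit pixel values (0..255) into the requested nominal range."""
    if nominal_range == "0..255":
        return [float(p) for p in pixels]
    if nominal_range == "0..1":
        return [p / 255.0 for p in pixels]
    if nominal_range == "-1..1":
        return [p / 127.5 - 1.0 for p in pixels]
    raise ValueError(f"unsupported nominal range: {nominal_range}")

row = [0, 51, 255]                       # sample 8-bit channel values
unit = to_nominal_range(row, "0..1")     # values between 0 and 1
signed = to_nominal_range(row, "-1..1")  # values between -1 and 1
```

With native support for these ranges, a model that expects normalized input can declare its range instead of requiring the caller to do this rescaling by hand.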
Execution Providers
- CUDA EP
- Official GPU build now built with CUDA 11
- OpenVINO EP
- Support for OpenVINO 2021.2
- Deprecated support for OpenVINO 2020.2
- Support for OpenVINO EP options in onnxruntime_perf_test tool
- General fixes
- TensorRT EP
- Support for TensorRT 7.2
- General fixes and perf improvements
- DirectML EP
- Support for DirectML 1.4.2
- DirectML PIX markers added to enable profiling the graph at the operator level
- NNAPI EP
- Performance improvement for quantized models
- Support for per-channel quantization for QLinearConv
- Additional operator support – Min/Max/Pow
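The per-channel quantization mentioned for the NNAPI EP can be motivated with a small pure-Python sketch. It compares one scale for a whole weight tensor against one scale per output channel, using symmetric int8 quantization. The helper names and the weight values are made up for illustration, and this is not the execution provider's code.

```python
def symmetric_scale(values, qmax=127):
    """Scale that maps the largest magnitude in `values` onto qmax."""
    peak = max(abs(v) for v in values)
    return peak / qmax if peak > 0 else 1.0

def quantize(values, scale, qmax=127):
    """Symmetric int8 quantization with the zero point fixed at 0."""
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def max_abs_error(values, scale):
    """Largest round-trip error after quantizing and dequantizing `values`."""
    return max(abs(v - q * scale) for v, q in zip(values, quantize(values, scale)))

# Two hypothetical output channels with very different weight magnitudes.
weights = [
    [0.9, -1.2, 0.4],      # channel 0: large weights
    [0.01, -0.02, 0.015],  # channel 1: small weights
]

per_tensor_scale = symmetric_scale([w for channel in weights for w in channel])
per_channel_scales = [symmetric_scale(channel) for channel in weights]

# Error on the small-magnitude channel under each scheme.
err_tensor = max_abs_error(weights[1], per_tensor_scale)
err_channel = max_abs_error(weights[1], per_channel_scales[1])
```

With a single per-tensor scale, the large weights in channel 0 dictate the step size, so the small weights in channel 1 collapse onto a handful of integer levels. A per-channel scale gives each channel its own step size and a much smaller error.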
Contributions
Contributors to ONNX Runtime include members across teams at Microsoft, along with our community members:
edgchen1, snnn, skottmckay, gwang-msft, hariharans29, tianleiwu, xadupre, yufenglee, ryanlai2, wangyems, suffiank, liqunfu, orilevari, baijumeswani, weixingzhang, pranavsharma, RandySheriffH, ashbhandare, oliviajain, smk2007, tracysh, stevenlix, fs-eire, Craigacp, faxu, mrry, codemzs, chilo-ms, jcwchen, zhanghuanrong, SherlockNoMad, iK1D, askhade, zhangxiang1993, yuslepukhin, tlh20, MaajidKhan, wschin, smkarlap, wenbingl, pengwa, duli2012, natke, alberto-magni, Tixxx, HectorSVC, jingyanwangms, jstoecker, kit1980, suryasidd, RandyShuai, sfatimar, jywu-msft, liuziyue, mosdav, thiagocrepaldi, souptc, fdwr