Releases: aws-neuron/aws-neuron-sdk
Neuron SDK Release - July 02, 2021
This release (Neuron 1.14.1) includes bug fixes and minor enhancements:
Neuron PyTorch - This release adds support for the “Dynamic Batching” feature; see the PyTorch Neuron trace Python API for more information. The release also adds support for new operators and includes additional bug fixes and minor enhancements; for more information, see the PyTorch Neuron release notes.
Neuron TensorFlow - see Tensorflow-Neuron Release Notes
Neuron MXNet - see Neuron Apache MXNet (Incubating) Release Notes
Neuron Compiler - see Neuron Compiler Release Notes
Neuron Runtime - see Neuron Runtime Release Notes
Neuron Tools - see Neuron Tools Release Notes
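As a conceptual illustration of the “Dynamic Batching” idea mentioned for Neuron PyTorch above, the sketch below serves a variable-size batch with a model that only accepts one fixed batch size by splitting the input into chunks and concatenating the outputs. This is not the torch-neuron implementation: the function names, the toy model, and the requirement that the input size be a multiple of the compiled batch size are all assumptions made for this example.

```python
from typing import Callable, List, Sequence


def run_with_dynamic_batch(
    fixed_batch_model: Callable[[Sequence[float]], List[float]],
    inputs: Sequence[float],
    compiled_batch_size: int,
) -> List[float]:
    """Feed `inputs` to a fixed-batch model chunk by chunk and concatenate the outputs."""
    if compiled_batch_size <= 0:
        raise ValueError("compiled_batch_size must be positive")
    # Simplifying assumption of this sketch: no padding of a partial last chunk.
    if len(inputs) % compiled_batch_size != 0:
        raise ValueError("input batch must be a multiple of the compiled batch size")

    outputs: List[float] = []
    for start in range(0, len(inputs), compiled_batch_size):
        chunk = inputs[start:start + compiled_batch_size]
        outputs.extend(fixed_batch_model(chunk))
    return outputs


def toy_model(batch: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a model compiled for batch size 2."""
    assert len(batch) == 2, "this toy model only accepts batches of exactly 2"
    return [2 * x for x in batch]


# A batch of 6 is served by three calls to the batch-size-2 model.
result = run_with_dynamic_batch(toy_model, [1, 2, 3, 4, 5, 6], compiled_batch_size=2)
print(result)
```

For the actual option and its constraints, refer to the PyTorch Neuron trace Python API documentation.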
Neuron SDK Release - May 28, 2021
This release (Neuron 1.14.0) introduces the first release of Neuron PyTorch 1.8.1, updated tutorials, and performance enhancements and memory optimizations for Neuron PyTorch, Neuron TensorFlow and Neuron MXNet.
Neuron PyTorch - First release of Neuron PyTorch 1.8.1.
Neuron PyTorch - Convolution operator support has been extended to include ConvTranspose2d variants.
Neuron PyTorch - Updated tutorials to use Hugging Face Transformers 4.6.0.
Neuron PyTorch - Additional performance enhancements, memory optimizations, and bug fixes, see PyTorch Neuron release notes.
Neuron Compiler - New feature: uncompressed NEFF format for faster model loading prior to inference. Enable it with --enable-fast-loading-neuron-binaries. Because the NEFF is not compressed, some large models may be negatively affected, but many cases will benefit.
Neuron Compiler - Additional performance enhancements, memory optimizations, and bug fixes, see Neuron Compiler Release Notes.
Neuron TensorFlow - Performance enhancements, memory optimizations, and bug fixes, see Tensorflow-Neuron Release Notes.
Neuron MXNet - Enhancements and minor bug fixes (MXNet 1.8), see Neuron Apache MXNet (Incubating) Release Notes.
Neuron Runtime - Performance enhancements, memory optimizations, and bug fixes, see Neuron Runtime Release Notes.
Neuron Tools - Minor bug fixes and enhancements.
Software Deprecation
End of support for Neuron Conda packages in Deep Learning AMI. Users should use pip upgrade commands to upgrade to the latest Neuron version in DLAMI, see blog.
End of support for Ubuntu 16, see documentation.
Neuron SDK Release - May 01, 2021
The AWS Neuron team is happy to announce that Neuron release 1.13.0 is out!
What’s New
Up to 20% throughput performance improvements across the board.
Out-of-the-box 12x higher throughput at 70% lower cost for HuggingFace Transformers pre-trained BERT Base models on AWS Inferentia, see Tutorial
Added PyTorch ResNext and Yolov5 support.
PyTorch convolution operator support has been extended to include most Conv1d and Conv3d variants
First release of Neuron MXNet 1.8, support for Gluon API and NLP BERT models
First release of Neuron plugin for TensorBoard
Software Deprecation
Users should use the pip upgrade command to upgrade to the latest Neuron version in Deep Learning AMI instead of calling conda update. Support for Neuron Conda packages in Deep Learning AMI ends starting with Neuron 1.14.0; see the blog post (https://aws.amazon.com/blogs/developer/neuron-conda-packages-eol/).
End of support for Ubuntu 16 starting with Neuron 1.14.0; see Software Deprecation.
End of support for classic TensorBoard-Neuron starting with Neuron 1.13.0, which introduces the Neuron Plugin for TensorBoard; see Software Deprecation.
For more information see What’s New (https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#neuron-whatsnew) and Release Content (https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/releasecontent.html#neuron-release-content).
Neuron SDK Release - March 04, 2021
This release includes bug fixes and minor enhancements to the Neuron Runtime and Tools.
Neuron SDK Release - February 24, 2021
This release updates all Neuron packages and libraries in response to the Python security issue CVE-2021-3177 as described here: https://nvd.nist.gov/vuln/detail/CVE-2021-3177. This vulnerability potentially exists in multiple versions of Python, including 3.5, 3.6, and 3.7. Python is used by various components of Neuron, including the Neuron compiler as well as Machine Learning frameworks including TensorFlow, PyTorch and MXNet. It is recommended that the Python interpreters used in any AMIs and containers used with Neuron are also updated.
Python 3.5 reached end-of-life as described here: https://devguide.python.org/devcycle/?highlight=python%203.5%20end%20of%20life#end-of-life-branches
From this release, Neuron packages will not support Python 3.5. Users should upgrade to the latest DLAMI, or upgrade to a newer Python version if they are using another AMI.
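To check whether an environment is affected by the end of Python 3.5 support, a small guard like the following could run at the start of a deployment script. This is only a sketch: the minimum version of 3.6 is an assumption made for illustration (the note above only states that 3.5 is no longer supported), and `python_is_supported` is a hypothetical helper name.

```python
import sys

# Assumption for this sketch: 3.6 is the oldest interpreter still accepted
# once Python 3.5 support is dropped.
MIN_SUPPORTED = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_SUPPORTED) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if not python_is_supported():
    print(
        "Python %d.%d is too old for this sketch's minimum of %d.%d; "
        "upgrade the interpreter or the AMI." % (sys.version_info[:2] + MIN_SUPPORTED)
    )
```

Note that passing this check says nothing about whether the interpreter has the fix for CVE-2021-3177; that depends on the patch level installed in the AMI or container.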
Neuron SDK Release - January 30, 2021
This release continues to improve the NeuronCore Pipeline performance for BERT models. For example, running BERT Base with the neuroncore-pipeline-cores compile option, at batch=3, seqlen=32 using 16 Neuron Cores, results in throughput of up to 5340 sequences per second and P99 latency of 9ms using Tensorflow Serving.
This release also adds operator support and performance improvements for the PyTorch based DistilBert model for sequence classification.
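To put the BERT Base figures above in perspective, the arithmetic below converts sequences per second into batches per second and uses Little's law to estimate how many batches are in flight at once. Treat it as a rough estimate only: Little's law uses mean latency, so substituting the reported P99 latency gives an upper-leaning approximation, and the function names are hypothetical.

```python
def batches_per_second(sequences_per_second: float, batch_size: int) -> float:
    """Throughput expressed in batches rather than individual sequences."""
    return sequences_per_second / batch_size


def in_flight_batches(
    sequences_per_second: float, batch_size: int, latency_seconds: float
) -> float:
    """Little's law: items in the system = arrival rate * time spent in the system."""
    return batches_per_second(sequences_per_second, batch_size) * latency_seconds


# Figures quoted in the release note: 5340 sequences/s, batch=3, P99 latency 9 ms.
bps = batches_per_second(5340, 3)
concurrent = in_flight_batches(5340, 3, 0.009)

print(f"{bps:.0f} batches per second")
print(f"about {concurrent:.0f} batches in flight (estimated from P99 latency)")
```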
Neuron SDK Release - December 23, 2020
This release introduces a PyTorch 1.7 based torch-neuron package as a part of the Neuron SDK. Support for PyTorch model serving with TorchServe 0.2 is added and will be demonstrated with a tutorial. This release also provides an example tutorial for PyTorch based Yolo v4 model for Inferentia.
To aid visibility into compiler activity, the Neuron-extended frameworks TensorFlow and PyTorch will display a new compilation status indicator that prints a dot (.) to the console every 20 seconds while compilation is running.
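The behavior of such a status indicator can be sketched with a background thread that writes a dot at a fixed interval until the long-running task finishes. This is an illustration of the pattern only, not the code used by the Neuron frameworks; `run_with_progress_dots` and `fake_compile` are hypothetical names, and the demo uses a much shorter interval than 20 seconds.

```python
import sys
import threading
import time


def run_with_progress_dots(task, interval_seconds=20.0, stream=sys.stdout):
    """Run `task()` while a helper thread writes '.' every `interval_seconds`."""
    done = threading.Event()

    def ticker():
        # Event.wait returns False on timeout and True once `done` is set,
        # so the loop ends promptly when the task completes.
        while not done.wait(interval_seconds):
            stream.write(".")
            stream.flush()

    thread = threading.Thread(target=ticker, daemon=True)
    thread.start()
    try:
        return task()
    finally:
        done.set()
        thread.join()
        stream.write("\n")
        stream.flush()


def fake_compile():
    """Hypothetical stand-in for a slow compilation step."""
    time.sleep(0.1)
    return "compiled"


status = run_with_progress_dots(fake_compile, interval_seconds=0.03)
print(status)
```

Using an `Event` rather than `time.sleep` in the ticker means the indicator stops as soon as the task ends instead of waiting out a full interval.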
Important to know:
- This update continues to support the torch-neuron version of PyTorch 1.5.1 for backwards compatibility.
Neuron SDK Release - November 17, 2020
This release improves NeuronCore Pipeline performance. For example, running BERT Small, batch=4, seqlen=32 using 4 Neuron Cores, results in throughput of up to 7000 sequences per second and P99 latency of 3ms using Tensorflow Serving.
Neuron tools updated the NeuronCore utilization metric to include all inf1 compute engines and DMAs. This release also adds a new neuron-monitor example that connects to Grafana via Prometheus: a sample script that exports most of neuron-monitor's metrics to a Prometheus monitoring server, along with a sample Grafana dashboard. More details here
ONNX support is limited and from this version onwards we are not planning to add any additional capabilities to ONNX. We recommend running models in TensorFlow, PyTorch or MXNet for best performance and support.
Neuron SDK Release - October 22, 2020
This release adds a Neuron kernel mode driver (KMD). The Neuron KMD simplifies Neuron Runtime deployments by removing the need for elevated privileges, improves memory management by removing the need for huge pages configuration, and eliminates the need for running neuron-rtd as a sidecar container. Documentation throughout the repo has been updated to reflect the new support. The new Neuron KMD is backwards compatible with prior versions of Neuron ML Frameworks and Compilers - no changes are required to existing application code.
More details in the Neuron Runtime release notes here.
Neuron SDK Release - September 22, 2020
This release improves performance of YOLO v3 and v4, VGG16, SSD300, and BERT. As part of these improvements, Neuron Compiler doesn’t require any special compilation flags for most models. Details on how to use the prior optimizations are outlined in the neuron-cc release notes.
The release also improves operational deployments of large-scale inference applications, with a session management agent incorporated into all supported ML frameworks and a new Neuron tool called neuron-monitor that makes it easy to scale monitoring of large fleets of inference applications. A sample script for connecting neuron-monitor to Amazon CloudWatch metrics is provided as well. Read more about using neuron-monitor.