
v0.14.0

@panyx0718 panyx0718 released this 03 Jul 08:29
· 105 commits to release/0.14.0 since this release
163b5e5

Release Log

Major Features

  • Enhanced the inference library with a better memory buffer and several new demos.
  • The inference library now supports the Anakin and TensorRT engines.
  • ParallelExecutor supports multi-threaded CPU training, in addition to multi-GPU training.
  • Added the mean IOU operator, the argsort operator, etc. Improved the L2norm operator. Added the crop API.
  • Released pre-trained ResNet50, SE-ResNeXt50, AlexNet, etc. Enhanced Transformer, etc.
  • New data augmentation operators.
  • Major documentation and API comment improvements.
  • Enhanced the continuous evaluation system.
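
For readers unfamiliar with the two new operators named above, the following is a minimal pure-Python sketch of what mean IOU and argsort compute in general. It is an illustration only and does not use or reproduce the framework's own operators: the function names `mean_iou` and `argsort`, their signatures, and the choice to skip classes that appear in neither predictions nor labels are all assumptions made for this example.

```python
def mean_iou(pred, label, num_classes):
    """Mean intersection-over-union across classes (illustrative sketch).

    pred and label are equal-length sequences of integer class ids.
    Classes absent from both sequences are skipped (an assumption of
    this sketch, not a statement about the released operator).
    """
    ious = []
    for c in range(num_classes):
        # Intersection: positions where both prediction and label are class c.
        inter = sum(1 for p, l in zip(pred, label) if p == c and l == c)
        # Union: positions where either prediction or label is class c.
        union = sum(1 for p, l in zip(pred, label) if p == c or l == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def argsort(values):
    """Return the indices that would sort values in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


if __name__ == "__main__":
    print(mean_iou([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], num_classes=3))
    print(argsort([3.0, 1.0, 2.0]))
```

In the sample call, the per-class IOU values are 1/2, 2/3, and 1, so the mean is 13/18.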

Performance Improvements

  • More overlap of network operations with computation in distributed training, giving a ~10% improvement.
  • CPU performance improvements with more MKLDNN support.

Major Bug Fixes

  • Fix memory leak issues.
  • Fix concat operator.
  • Fix ParallelExecutor input data memcpy issue.
  • Fix ParallelExecutor deadlock issue.
  • Fix distributed training client timeout.
  • Fix distributed training pserver side learning rate decay.
  • Thread-safe Scope implementation.
  • Fix some issues when using the memory optimizer and ParallelExecutor together.

Known Issues

  • IfElse has some bugs.
  • BatchNorm is not stable if batch_size=1.
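
As general background on the batch_size=1 item, the sketch below shows one well-known reason batch normalization degenerates with a single sample: for a batch of feature vectors, the per-feature batch variance is zero, so every normalized value collapses to zero. This is a pure-Python illustration under simplifying assumptions (2-D input, scale 1, shift 0, training-mode statistics). The `batch_norm` function is hypothetical, and whether this effect is the cause of the issue listed above is not stated in these notes.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize each feature over the batch dimension (illustrative sketch).

    batch is a list of equal-length feature vectors. Uses the biased
    batch variance, with scale fixed at 1 and shift fixed at 0.
    """
    n = len(batch)
    num_features = len(batch[0])
    # Per-feature mean over the batch.
    means = [sum(row[j] for row in batch) / n for j in range(num_features)]
    # Per-feature biased variance over the batch.
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(num_features)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps)
         for j in range(num_features)]
        for row in batch
    ]


if __name__ == "__main__":
    # One sample: every feature equals its own mean, so the output is all zeros.
    print(batch_norm([[1.0, 5.0, -3.0]]))
    # Two samples: the features are spread to roughly -1 and +1.
    print(batch_norm([[1.0], [3.0]]))
```

With one sample the input values no longer influence the output at all, which is why single-sample batches usually call for running statistics or a different normalization scheme.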