- Supported dilated convolution.
- Introduced memory optimization to reduce memory usage during training and testing (see the Wiki).
- `BatchReductionLayer` now supports reduction along an arbitrary axis, with a CUDA implementation (see the sketch after this list).
- Other small fixes.
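A minimal sketch of how such a reduction might be declared through pycaffe's NetSpec. The layer type name `BatchReduction` and the `batch_reduction_param`/`axis` fields are assumptions for illustration; check this fork's `caffe.proto` for the actual definitions.

```python
# Hypothetical sketch: declaring a reduction over axis 1 with pycaffe's NetSpec.
# The layer type name ("BatchReduction") and its parameter block are assumptions;
# consult this fork's caffe.proto for the real definitions.
import caffe
from caffe import layers as L

n = caffe.NetSpec()
n.data = L.DummyData(shape=dict(dim=[10, 5, 256]))
# Reduce over axis 1, collapsing the 5 entries per sample.
n.reduced = L.BatchReduction(n.data, batch_reduction_param=dict(axis=1))
print(n.to_proto())
```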
Features:
- Supported cuDNN v5
- Use cuDNN's batch normalization implementation as the default engine for the BN layer.
- The BN layer now stores the running variance in its fourth blob (see the sketch after this list).
- The script `python/bn_convert_style.py` is added to help convert BN parameters between the two styles, in either direction.
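A minimal sketch of inspecting a BN layer's parameter blobs with pycaffe to check which style a model uses. The layer name `bn1` and the file names are placeholders, and the blob ordering is an assumption for illustration.

```python
# Hypothetical sketch: inspect the parameter blobs of a BN layer with pycaffe.
# The layer name "bn1", file names, and blob ordering are assumptions.
import caffe

net = caffe.Net('deploy.prototxt', 'model.caffemodel', caffe.TEST)
for i, blob in enumerate(net.params['bn1']):
    print('BN blob %d: shape %s' % (i, blob.data.shape))
# With the cuDNN-style BN layer described above, a fourth blob holding the
# running variance should appear here.
```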
Features:
- Implemented a planning algorithm to globally optimize the trade-off between cuDNN workspace consumption and speed.
- The `richness` parameter now specifies the total memory, in MB, available to cuDNN for convolution workspaces.
- The framework now tries to find the best combination of convolution algorithms under the memory limit.
Features:
- cuDNN v4 support
- 20% overall speed gain with faster convolution and batch normalization
- The native batch normalization is changed to comply with cuDNN. Use the script `python/bn_var_to_inv_std.py` to upgrade your models (see the sketch after this list).
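The relationship the upgrade is based on is converting a running variance into an inverse standard deviation. A minimal numpy sketch of that conversion follows; the epsilon value is an assumption, and the actual script's handling of model files may differ.

```python
# Minimal sketch of the variance -> inverse-std conversion underlying the
# upgrade script. The epsilon value here is an assumption.
import numpy as np

def var_to_inv_std(var, eps=1e-5):
    """Convert per-channel running variance to inverse standard deviation."""
    return 1.0 / np.sqrt(var + eps)

running_var = np.array([0.5, 1.0, 2.0], dtype=np.float32)
print(var_to_inv_std(running_var))
```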
Features:
- A Python layer can expose a `prefetch()` method, which will be run in parallel with network processing (see the sketch below).
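A minimal sketch of a data-producing Python layer with a `prefetch()` hook, assuming the standard Python layer interface (`setup`/`reshape`/`forward`/`backward`) plus the new method. The exact signature and call convention of `prefetch()` are assumptions; check the fork's Python layer examples.

```python
# Hypothetical sketch of a Python data layer using the prefetch() hook.
# The setup/reshape/forward/backward interface is caffe's standard Python
# layer API; the prefetch() signature and call convention are assumptions.
import caffe
import numpy as np

class RandomDataLayer(caffe.Layer):
    def setup(self, bottom, top):
        self.batch = None

    def reshape(self, bottom, top):
        top[0].reshape(4, 3, 224, 224)

    def prefetch(self):
        # Runs in parallel with network processing: prepare the next batch here.
        self.batch = np.random.rand(4, 3, 224, 224).astype(np.float32)

    def forward(self, bottom, top):
        if self.batch is None:
            self.prefetch()
        top[0].data[...] = self.batch

    def backward(self, top, propagate_down, bottom):
        pass
```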
Features:
- Improved cuDNN wrapper to use less GPU memory.
- A new parameter, `richness`, controls the cuDNN workspace limit.
Features:
- Support for cuDNN v3.
Features:
- A new mechanism for parallel communication reduces parallelization overhead.
- Batch normalization, courtesy of @Cysu.
Features:
- Action recognition tools, scripts, and examples.
- Basic parallel training support
- Various extra data augmentations