SSD

This directory contains code for exploring SSD training with adaptive loss scaling.

Changes made to the original SSD model

The original SSD example provided in chainercv cannot be automatically transformed into one that supports adaptive loss scaling.

We've made the following changes so that the transformation can be carried out smoothly:

The backbone VGG network is now built on PickableSequentialChain, so that the branching points among its layers can be automatically discovered and transformed into AdaLossBranch.
Convolution layers in the Multibox class are now initialized at the top level of the class instead of being wrapped in a ChainList, because layers inside a ChainList cannot be discovered by our AdaLossScaled API.
The post-processing and concatenation steps are wrapped into functions and initialized within Multibox's scope.
AdaLossBranch must be called explicitly in the forward method of Multibox so that its branches propagate gradients correctly.
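
The second change can be pictured with a toy, plain-Python sketch. It assumes that the transformation only inspects a model's direct attributes, which is one way to read the point above; it is not the AdaLossScaled implementation. Conv, MultiboxNested, MultiboxFlat and discover_shallow are hypothetical names introduced only for illustration and are not Chainer or ada_loss classes.

```python
class Conv:
    """Stand-in for a convolution layer (hypothetical, not a Chainer link)."""

    def __init__(self, name):
        self.name = name


class MultiboxNested:
    """Layers hidden inside a container attribute (analogous to a ChainList)."""

    def __init__(self, n):
        self.loc = [Conv(f"loc_{i}") for i in range(n)]


class MultiboxFlat:
    """Layers registered directly as top-level attributes."""

    def __init__(self, n):
        for i in range(n):
            setattr(self, f"loc_{i}", Conv(f"loc_{i}"))


def discover_shallow(model):
    """Return the names of Conv layers found among the direct attributes."""
    return sorted(k for k, v in vars(model).items() if isinstance(v, Conv))


print(discover_shallow(MultiboxNested(3)))  # []
print(discover_shallow(MultiboxFlat(3)))    # ['loc_0', 'loc_1', 'loc_2']
```

Under that assumption, a shallow scan misses the nested layers entirely, while the flat layout exposes each one.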

Commands

Please remember to add mpiexec -n <N_GPU> as a prefix if you intend to run this code in an MPI environment.

# adaptive loss scaling
python3 train_multi.py --model ssd512 --dtype mixed16 --out /home/user/results --loss-scale-method approx_range

# static loss scaling
python3 train_multi.py --model ssd512 --dtype mixed16 --out /home/user/results --loss-scale-method fixed --init-scale 128

# dynamic loss scaling
# --loss-scale-method fixed ensures that ada_loss has no effect
python3 train_multi.py --model ssd512 --dtype mixed16 --out /home/user/results --loss-scale-method fixed --dynamic-interval 10

# no loss scaling
python3 train_multi.py --model ssd512 --dtype mixed16 --out /home/user/results --loss-scale-method fixed --init-scale 1
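
For readers new to loss scaling, the following is a minimal NumPy sketch of what the static and dynamic variants above are meant to achieve. It is not the code in train_multi.py or ada_loss. The helper to_fp16_with_scale and the class DynamicScale are hypothetical names. The update rule (halve the scale on overflow, double it after `interval` overflow-free steps) is the common dynamic loss scaling scheme and is assumed here, not taken from this repository.

```python
import numpy as np


def to_fp16_with_scale(grad, scale):
    """Scale in FP32, cast to FP16, then unscale in FP32."""
    scaled = np.float16(np.float32(grad) * np.float32(scale))
    return np.float32(scaled) / np.float32(scale)


class DynamicScale:
    """Halve the scale on overflow; double it after `interval` clean steps."""

    def __init__(self, init_scale=1.0, interval=10):
        self.scale = float(init_scale)
        self.interval = interval
        self.clean_steps = 0

    def update(self, grads_fp16):
        """Return True if the step should be applied, False if skipped."""
        if not np.all(np.isfinite(grads_fp16)):
            self.scale /= 2.0
            self.clean_steps = 0
            return False
        self.clean_steps += 1
        if self.clean_steps >= self.interval:
            self.scale *= 2.0
            self.clean_steps = 0
        return True


# A gradient smaller than the smallest FP16 subnormal (about 6e-8).
tiny_grad = 1e-8
print(to_fp16_with_scale(tiny_grad, 1.0))          # underflows to 0.0
print(to_fp16_with_scale(tiny_grad, 128.0) > 0)    # True: the value survives
```

With a scale of 1 the small gradient is lost in the FP16 cast, which is the failure mode the "no loss scaling" run exposes; a scale of 128 keeps it representable.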

Results

Model	Loss scale	Batch size	Validation mAP
SSD512	FP32 baseline	8	0.7894
SSD512	adaptive	8	0.7927
SSD512	adaptive (freq=10)	8	0.7924
SSD512	no loss scale	8	diverged
SSD512	loss scale (8)	8	0.7883
SSD512	loss scale (128)	8	0.7871
SSD512	loss scale (256)	8	0.7836
SSD512	loss scale (dynamic, interval=10)	8	0.7504

Model	Loss scale	Batch size	Validation mAP
SSD512	adaptive	32	0.8030
SSD512	no loss scale	32	diverged
SSD512	loss scale (128)	32	0.8001
SSD512	loss scale (dynamic, interval=10)	32	0.8017

TODO

Reduce memory usage by fully transforming ReLU.
Remove the explicit reference to AdaLossBranch in the SSD model (Multibox).
Support cuDNN.
