[FEATURE] add quant algo Learned Step Size Quantization (#346)
* update
* Fix a bug in make_divisible. (#333) fix bug in make_divisible Co-authored-by: liukai <liukai@pjlab.org.cn>
* [Fix] Fix counter mapping bug (#331)
* fix counter mapping bug
* move judgment into get_counter_type & update UT
* [Docs] Add MMYOLO projects link (#334)
* [Doc] fix typos in en/usr_guides (#299)
* Update README.md
* Update README_zh-CN.md Co-authored-by: Sheffield <49406546+SheffieldCao@users.noreply.github.com>
* [Features] Support `MethodInputsRecorder` and `FunctionInputsRecorder` (#320)
* support MethodInputsRecorder and FunctionInputsRecorder
* fix bugs that the model can not be pickled
* WIP: add pytest for ema model
* fix bugs in recorder and delivery when ema_hook is used
* don't register the DummyDataset
* fix pytest
* updated
* retina loss & predict & tesnor DONE
* [Feature] Add deit-base (#332)
* WIP: support deit
* WIP: add deithead
* WIP: fix checkpoint hook
* fix data preprocessor
* fix cfg
* WIP: add readme
* reset single_teacher_distill
* add metafile
* add model to model-index
* fix configs and readme
* [Feature] Feature map visualization (#293)
* WIP: vis
* WIP: add visualization
* WIP: add visualization hook
* WIP: support razor visualizer
* WIP
* WIP: wrap draw_featmap
* support feature map visualization
* add a demo image for visualization
* fix typos
* change eps to 1e-6
* add pytest for visualization
* fix vis hook
* fix arguments' name
* fix img path
* support draw inference results
* add visualization doc
* fix figure url
* move files Co-authored-by: weihan cao <HIT-cwh>
* [Feature] Add kd examples (#305)
* support kd for mbv2 and shufflenetv2
* WIP: fix ckpt path
* WIP: fix kd r34-r18
* add metafile
* fix metafile
* delete
* [Doc] add documents about pruning. (#313)
* init
* update user guide
* update images
* update
* update How to prune your model
* update how_to_use_config_tool_of_pruning.md
* update doc
* move location
* update
* update
* update
* add mutablechannels.md
* add references Co-authored-by: liukai <liukai@pjlab.org.cn> Co-authored-by: jacky <jacky@xx.com>
* [Feature] PyTorch version of `PKD: General Distillation Framework for Object Detectors via Pearson Correlation Coefficient`. (#304)
* add pkd
* add pytest for pkd
* fix cfg
* WIP: support fcos3d
* WIP: support fcos3d pkd
* support mmdet3d
* fix cfgs
* change eps to 1e-6 and add some comments
* fix docstring
* fix cfg
* add assert
* add type hint
* WIP: add readme and metafile
* fix readme
* update metafiles and readme
* fix metafile
* fix pipeline figure
* for RFC
* Customed FX initialize
* add UT init
* [Refactor] Refactor Mutables and Mutators (#324)
* refactor mutables
* update load fix subnet
* add DumpChosen Typehint
* adapt UTs
* fix lint
* Add GroupMixin to ChannelMutator (temporarily)
* fix type hints
* add GroupMixin doc-string
* modified by comments
* fix type hits
* update subnet format
* fix channel group bugs and add UTs
* fix doc string
* fix comments
* refactor diff module forward
* fix error in channel mutator doc
* fix comments Co-authored-by: liukai <liukai@pjlab.org.cn>
* [Fix] Update readme (#341)
* update kl readme
* update dsnas readme
* fix url
* Bump version to 1.0.0rc1 (#338) update version
* init demo
* add customer_tracer
* add quantizer
* add fake_quant, loop, config
* remove CPatcher in custome_tracer
* demo_try
* init version
* modified base.py
* pre-rebase
* wip of adaround series
* adaround experiment
* trasfer to s2
* update api
* point at sub_reconstruction
* pre-checkout
* export onnx
* add customtracer
* fix lint
* move custom tracer
* fix import
* TDO: UTs
* Successfully RUN
* update loop
* update loop docstrings
* update quantizer docstrings
* update qscheme docstrings
* update qobserver docstrings
* update tracer docstrings
* update UTs init
* update UTs init
* fix review comments
* fix CI
* fix UTs
* update torch requirements

Co-authored-by: huangpengsheng <huangpengsheng@sensetime.com>
Co-authored-by: LKJacky <108643365+LKJacky@users.noreply.github.com>
Co-authored-by: liukai <liukai@pjlab.org.cn>
Co-authored-by: Yang Gao <Gary1546308416AL@gmail.com>
Co-authored-by: kitecats <90194592+kitecats@users.noreply.github.com>
Co-authored-by: Sheffield <49406546+SheffieldCao@users.noreply.github.com>
Co-authored-by: whcao <41630003+HIT-cwh@users.noreply.github.com>
Co-authored-by: jacky <jacky@xx.com>
Co-authored-by: pppppM <67539920+pppppM@users.noreply.github.com>
Co-authored-by: humu789 <humu@pjlab.org.cn>
1 parent: b3c8bb9
Commit: c6637be
Showing 168 changed files with 7,725 additions and 805 deletions.
configs/distill/mmcls/deit/README.md
# DeiT

> [Training data-efficient image transformers & distillation through attention](https://arxiv.org/abs/2012.12877)

<!-- [ALGORITHM] -->

## Abstract

Recently, neural networks purely based on attention were shown to address image understanding tasks such as image classification. However, these visual transformers are pre-trained with hundreds of millions of images using an expensive infrastructure, thereby limiting their adoption. In this work, we produce a competitive convolution-free transformer by training on Imagenet only. We train them on a single computer in less than 3 days. Our reference vision transformer (86M parameters) achieves top-1 accuracy of 83.1% (single-crop evaluation) on ImageNet with no external data. More importantly, we introduce a teacher-student strategy specific to transformers. It relies on a distillation token ensuring that the student learns from the teacher through attention. We show the interest of this token-based distillation, especially when using a convnet as a teacher. This leads us to report results competitive with convnets for both Imagenet (where we obtain up to 85.2% accuracy) and when transferring to other tasks. We share our code and models.

<div align=center>
<img src="https://user-images.githubusercontent.com/26739999/143225703-c287c29e-82c9-4c85-a366-dfae30d198cd.png" width="40%"/>
</div>

## Results and models

### Classification

| Dataset  | Model     | Teacher     | Top-1 (%) | Top-5 (%) | Config                                           | Download |
| -------- | --------- | ----------- | --------- | --------- | ------------------------------------------------ | -------- |
| ImageNet | DeiT-base | RegNetY-160 | 83.24     | 96.33     | [config](deit-base_regnety160_pt-16xb64_in1k.py) | [model](https://download.openmmlab.com/mmrazor/v1/deit/deit-base/deit-base_regnety160_pt-16xb64_in1k_20221011_113403-a67bf475.pth?versionId=CAEQThiBgMCFteW0oBgiIDdmMWY2NGRiOGY1YzRmZWZiOTExMzQ2NjNlMjk2Nzcz) \| [log](https://openmmlab-share.oss-cn-hangzhou.aliyuncs.com/mmrazor/v1/deit/deit-base/deit-base_regnety160_pt-16xb64_in1k_20221011_113403-a67bf475.json?versionId=CAEQThiBgIDGos20oBgiIGVlNDgyM2M2ZTk5MzQyYjFhNTgwNGIzMjllZjg3YmZm) |

```{warning}
Before training, please first install `timm`:

pip install timm

or

git clone https://github.com/rwightman/pytorch-image-models
cd pytorch-image-models && pip install -e .
```

## Citation

```
@InProceedings{pmlr-v139-touvron21a,
  title     = {Training data-efficient image transformers & distillation through attention},
  author    = {Touvron, Hugo and Cord, Matthieu and Douze, Matthijs and Massa, Francisco and Sablayrolles, Alexandre and Jegou, Herve},
  booktitle = {International Conference on Machine Learning},
  pages     = {10347--10357},
  year      = {2021},
  volume    = {139},
  month     = {July}
}
```
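The teacher-student setup reported above trains the student's distillation head against the teacher's predictions with a cross-entropy term. A minimal, stdlib-only sketch of that loss (all names here are illustrative; this is not the MMRazor implementation):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_cross_entropy(student_logits, teacher_logits, loss_weight=0.5):
    # Cross-entropy between the teacher's soft targets and the student's
    # predicted distribution, scaled by loss_weight (0.5 in the config below).
    p_teacher = softmax(teacher_logits)
    log_p_student = [math.log(p) for p in softmax(student_logits)]
    return -loss_weight * sum(t * s for t, s in zip(p_teacher, log_p_student))

# When student and teacher logits coincide, the loss reduces to
# loss_weight * entropy(teacher's distribution).
print(round(distill_cross_entropy([2.0, 0.0, -1.0], [2.0, 0.0, -1.0]), 4))  # → 0.2621
```

In practice this runs on batched tensors (e.g. `torch.nn.functional.cross_entropy` on GPU); the scalar version above only shows the arithmetic.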
configs/distill/mmcls/deit/deit-base_regnety160_pt-16xb64_in1k.py (64 additions, 0 deletions)
_base_ = ['mmcls::deit/deit-base_pt-16xb64_in1k.py']

# student settings
student = _base_.model
student.backbone.type = 'DistilledVisionTransformer'
student.head = dict(
    type='mmrazor.DeiTClsHead',
    num_classes=1000,
    in_channels=768,
    loss=dict(
        type='mmcls.LabelSmoothLoss',
        label_smooth_val=0.1,
        mode='original',
        loss_weight=0.5))

data_preprocessor = dict(
    type='mmcls.ClsDataPreprocessor', batch_augments=student.train_cfg)

# teacher settings
checkpoint_path = 'https://dl.fbaipublicfiles.com/deit/regnety_160-a5fe301d.pth'  # noqa: E501
teacher = dict(
    _scope_='mmcls',
    type='ImageClassifier',
    backbone=dict(
        type='TIMMBackbone', model_name='regnety_160', pretrained=True),
    neck=dict(type='GlobalAveragePooling'),
    head=dict(
        type='LinearClsHead',
        num_classes=1000,
        in_channels=3024,
        loss=dict(
            type='LabelSmoothLoss',
            label_smooth_val=0.1,
            mode='original',
            loss_weight=0.5),
        topk=(1, 5),
        init_cfg=dict(
            type='Pretrained', checkpoint=checkpoint_path, prefix='head.')))

model = dict(
    _scope_='mmrazor',
    _delete_=True,
    type='SingleTeacherDistill',
    architecture=student,
    teacher=teacher,
    distiller=dict(
        type='ConfigurableDistiller',
        student_recorders=dict(
            fc=dict(type='ModuleOutputs', source='head.layers.head_dist')),
        teacher_recorders=dict(
            fc=dict(type='ModuleOutputs', source='head.fc')),
        distill_losses=dict(
            loss_distill=dict(
                type='CrossEntropyLoss',
                loss_weight=0.5,
            )),
        loss_forward_mappings=dict(
            loss_distill=dict(
                preds_S=dict(from_student=True, recorder='fc'),
                preds_T=dict(from_student=False, recorder='fc')))))

find_unused_parameters = True

val_cfg = dict(_delete_=True, type='mmrazor.SingleTeacherDistillValLoop')
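In the config above, `loss_forward_mappings` routes recorded module outputs into the distillation loss's keyword arguments: each kwarg names a recorder and whether it comes from the student or the teacher pool. A toy re-implementation of that routing, with hypothetical names (not the actual `ConfigurableDistiller` code):

```python
def forward_distill_loss(loss_fn, mappings, student_recorders, teacher_recorders):
    # Resolve each loss kwarg from the student's or teacher's recorder pool,
    # mirroring the `from_student` / `recorder` keys of the config.
    kwargs = {}
    for arg_name, spec in mappings.items():
        pool = student_recorders if spec['from_student'] else teacher_recorders
        kwargs[arg_name] = pool[spec['recorder']]
    return loss_fn(**kwargs)

# Toy "loss": squared difference between two recorded scalars.
def toy_loss(preds_S, preds_T):
    return 0.5 * (preds_S - preds_T) ** 2

mapping = dict(
    preds_S=dict(from_student=True, recorder='fc'),
    preds_T=dict(from_student=False, recorder='fc'))

out = forward_distill_loss(
    toy_loss, mapping,
    student_recorders={'fc': 3.0},   # stands in for recorded head outputs
    teacher_recorders={'fc': 1.0})
print(out)  # → 2.0
```

The real recorders capture whole tensors from hooked modules (`head.layers.head_dist` for the student, `head.fc` for the teacher), but the dictionary-routing idea is the same.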
Collections:
  - Name: DEIT
    Metadata:
      Training Data:
        - ImageNet-1k
    Paper:
      URL: https://arxiv.org/abs/2012.12877
      Title: Training data-efficient image transformers & distillation through attention
    README: configs/distill/mmcls/deit/README.md

Models:
  - Name: deit-base_regnety160_pt-16xb64_in1k
    In Collection: DEIT
    Metadata:
      Student:
        Config: mmcls::deit/deit-base_pt-16xb64_in1k.py
        Weights: https://download.openmmlab.com/mmclassification/v0/deit/deit-base_pt-16xb64_in1k_20220216-db63c16c.pth
        Metrics:
          Top 1 Accuracy: 81.76
          Top 5 Accuracy: 95.81
      Teacher:
        Config: mmrazor::distill/mmcls/deit/deit-base_regnety160_pt-16xb64_in1k.py
        Weights: https://dl.fbaipublicfiles.com/deit/regnety_160-a5fe301d.pth
        Metrics:
          Top 1 Accuracy: 82.83
          Top 5 Accuracy: 96.42
    Results:
      - Task: Classification
        Dataset: ImageNet-1k
        Metrics:
          Top 1 Accuracy: 83.24
          Top 5 Accuracy: 96.33
    Weights: https://download.openmmlab.com/mmrazor/v1/deit/deit-base/deit-base_regnety160_pt-16xb64_in1k_20221011_113403-a67bf475.pth?versionId=CAEQThiBgMCFteW0oBgiIDdmMWY2NGRiOGY1YzRmZWZiOTExMzQ2NjNlMjk2Nzcz
    Config: configs/distill/mmcls/deit/deit-base_regnety160_pt-16xb64_in1k.py
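The Top-1/Top-5 numbers in the metafile are standard top-k accuracies. A stdlib-only sketch of how such a metric is computed from per-sample logits (illustrative only; the actual evaluation uses the MMClassification evaluator):

```python
def topk_accuracy(logits_batch, labels, k=1):
    # Percentage of samples whose true label is among the k highest logits.
    hits = 0
    for logits, label in zip(logits_batch, labels):
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        hits += label in ranked[:k]
    return 100.0 * hits / len(labels)

logits = [[0.1, 2.0, 0.3],   # predicted class 1 (correct)
          [1.5, 0.2, 1.0],   # predicted class 0, runner-up 2
          [0.0, 0.4, 0.3]]   # predicted class 1, runner-up 2
labels = [1, 2, 2]
print(round(topk_accuracy(logits, labels, k=1), 2))  # → 33.33
print(round(topk_accuracy(logits, labels, k=2), 2))  # → 100.0
```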