* Implement YOLOv3
* Remove unused function
* Update yolov3_ms_aug_273e.py Clean the comments in config file
* Add README.md
* port to mmdet-2.0 api
* unify registry
* port to ConvModule and remove ConvLayer
* Refactor Backbone
* Update README
* Lint and format
* Unify the class name
* fix the `label - 1` problem
* Move a lot hard-coded params to the __init__ function
* Refactor YOLOV3Neck
* Add norm_cfg and act_cfg to backbone
* Update Config
* Fix doc string
* Fix nms (thanks to @LMerCy)
* Add doc string
* Update config
* Remove pretrained in head and neck
* Add support for conv_cfg in neck
* Update mmdet/models/dense_heads/yolo_head.py Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>
* Update mmdet/models/dense_heads/yolo_head.py Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>
* Fix README.md
* Fix typos
* update config
* flake8, yapf, docformatter, etc
* Update README
* Add conv_cfg to backbone and head
* Move some config to arch_settings in backbone
* Add doc strings and replace Warning with warnings.warn()
* Fix bug.
* Update doc
* Add _frozen_stages for backbone
* Update mmdet/models/backbones/darknet.py Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>
* Fix inplace bug
* fix indent
* refactor config
* set 8GPU lr
* fixed typo
* update performance table
* Resolve conversation
* Add anchor generator and coder
* fixed test
* Finish refactor
* refactor anchor order
* fixed batch size
* Fixed train_cfg
* fix yolo assigner
* clean up
* Fixed format
* Update model zoo
* change to mmcv pretrain link
* add test forward
* fixed comma and docstring
* Refactor loss
* reformat
* fixed avg_factor
* revert to original
* fixed format
* update table
* fixed BCE

Co-authored-by: Haoyu Wu <haoyu.wu@wdc.com>
Co-authored-by: Haoyu Wu <wuhy08@users.noreply.github.com>
Co-authored-by: Haoyu Wu <wuhaoyu1989@gmail.com>
Co-authored-by: Jerry Jiarui XU <xvjiarui0826@gmail.com>
Co-authored-by: xmpeng <1051323399@qq.com>
1 parent 83f0ca4, commit dfbb6d6
Showing 25 changed files with 1,540 additions and 13 deletions.
@@ -0,0 +1,25 @@
# YOLOv3

## Introduction
```
@misc{redmon2018yolov3,
    title={YOLOv3: An Incremental Improvement},
    author={Joseph Redmon and Ali Farhadi},
    year={2018},
    eprint={1804.02767},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

## Results and Models

| Backbone | Scale | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-------: |
| DarkNet-53 | 320 | 273e | 2.7 | 63.9 | 27.9 | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmdetection/v2.0/yolo/yolov3_d53_320_273e_coco/yolov3_d53_320_273e_coco-421362b6.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmdetection/v2.0/yolo/yolov3_d53_320_273e_coco/yolov3_d53_320_273e_coco-20200819_172101.log.json) |
| DarkNet-53 | 416 | 273e | 3.8 | 61.2 | 30.9 | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-416_273e_coco/yolov3_d53_mstrain-416_273e_coco-2b60fcd9.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-416_273e_coco/yolov3_d53_mstrain-416_273e_coco-20200819_173424.log.json) |
| DarkNet-53 | 608 | 273e | 7.1 | 48.1 | 33.4 | [model](https://openmmlab.oss-accelerate.aliyuncs.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-608_273e_coco/yolov3_d53_mstrain-608_273e_coco-139f5633.pth) &#124; [log](https://openmmlab.oss-accelerate.aliyuncs.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-608_273e_coco/yolov3_d53_mstrain-608_273e_coco-20200819_170820.log.json) |


## Credit
This implementation originates from the project of Haoyu Wu (@wuhy08) at Western Digital.
@@ -0,0 +1,42 @@
_base_ = './yolov3_d53_mstrain-608_273e_coco.py'
# dataset settings
img_norm_cfg = dict(mean=[0, 0, 0], std=[255., 255., 255.], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Expand',
        mean=img_norm_cfg['mean'],
        to_rgb=img_norm_cfg['to_rgb'],
        ratio_range=(1, 2)),
    dict(
        type='MinIoURandomCrop',
        min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
        min_crop_size=0.3),
    dict(type='Resize', img_scale=(320, 320), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(320, 320),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    train=dict(pipeline=train_pipeline),
    val=dict(pipeline=test_pipeline),
    test=dict(pipeline=test_pipeline))
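The config above sets only `pipeline` under `data.train`, `data.val` and `data.test`, and relies on `_base_` to supply the dataset type, annotation paths and everything else. The sketch below illustrates that kind of inheritance with a recursive dict merge. It is a simplified stand-in written for this note, not the config loader's actual code; `merge_cfg` and the placeholder pipeline names are hypothetical.

```python
import copy


def merge_cfg(base, override):
    """Recursively merge `override` into a copy of `base`.

    Hypothetical helper: nested dicts are merged key by key, while any
    other value (including a list such as a pipeline) is replaced whole.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_cfg(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Stand-in for the 608 base config (pipeline entries are placeholders).
base = dict(
    data=dict(
        samples_per_gpu=8,
        train=dict(type='CocoDataset', pipeline=['resize_608'])))

# Stand-in for the 320 child config: only the pipeline is overridden.
child = dict(data=dict(train=dict(pipeline=['resize_320'])))

merged = merge_cfg(base, child)
print(merged['data']['train'])
print(merged['data']['samples_per_gpu'])
```

Because a list is replaced rather than merged in this sketch, a child config has to restate the entire pipeline even when only the `Resize` step changes, which is consistent with how the 320 config is written.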
@@ -0,0 +1,42 @@
_base_ = './yolov3_d53_mstrain-608_273e_coco.py'
# dataset settings
img_norm_cfg = dict(mean=[0, 0, 0], std=[255., 255., 255.], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Expand',
        mean=img_norm_cfg['mean'],
        to_rgb=img_norm_cfg['to_rgb'],
        ratio_range=(1, 2)),
    dict(
        type='MinIoURandomCrop',
        min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
        min_crop_size=0.3),
    dict(type='Resize', img_scale=[(320, 320), (416, 416)], keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(416, 416),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    train=dict(pipeline=train_pipeline),
    val=dict(pipeline=test_pipeline),
    test=dict(pipeline=test_pipeline))
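The test pipeline above resizes with `keep_ratio=True` and then pads with `size_divisor=32`, so the tensor fed to the network is generally not square. The sketch below works through that arithmetic for a single image. It assumes the resize fits the image inside the target scale and rounds to the nearest pixel, and that padding rounds each side up to the next multiple of the divisor; the helper names are hypothetical and this is not the library's own code.

```python
import math


def rescale_keep_ratio(width, height, scale):
    """Fit (width, height) inside `scale` without changing aspect ratio."""
    max_long, max_short = max(scale), min(scale)
    factor = min(max_long / max(width, height),
                 max_short / min(width, height))
    # Round to the nearest integer pixel.
    return int(width * factor + 0.5), int(height * factor + 0.5)


def pad_to_divisor(width, height, divisor=32):
    """Round each side up to the next multiple of `divisor`."""
    return (math.ceil(width / divisor) * divisor,
            math.ceil(height / divisor) * divisor)


# A 640x480 image pushed through the 416 test pipeline.
resized = rescale_keep_ratio(640, 480, (416, 416))
padded = pad_to_divisor(*resized)
print(resized, padded)
```

Under these assumptions a 640x480 input ends up as a 416x320 tensor: the long side reaches the 416 limit and the short side is padded from 312 up to the next multiple of 32.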
@@ -0,0 +1,121 @@
_base_ = '../_base_/default_runtime.py'
# model settings
model = dict(
    type='YOLOV3',
    pretrained='open-mmlab://darknet53',
    backbone=dict(type='Darknet', depth=53, out_indices=(3, 4, 5)),
    neck=dict(
        type='YOLOV3Neck',
        num_scales=3,
        in_channels=[1024, 512, 256],
        out_channels=[512, 256, 128]),
    bbox_head=dict(
        type='YOLOV3Head',
        num_classes=80,
        in_channels=[512, 256, 128],
        out_channels=[1024, 512, 256],
        anchor_generator=dict(
            type='YOLOAnchorGenerator',
            base_sizes=[[(116, 90), (156, 198), (373, 326)],
                        [(30, 61), (62, 45), (59, 119)],
                        [(10, 13), (16, 30), (33, 23)]],
            strides=[32, 16, 8]),
        bbox_coder=dict(type='YOLOBBoxCoder'),
        featmap_strides=[32, 16, 8],
        loss_cls=dict(
            type='CrossEntropyLoss',
            use_sigmoid=True,
            loss_weight=1.0,
            reduction='sum'),
        loss_conf=dict(
            type='CrossEntropyLoss',
            use_sigmoid=True,
            loss_weight=1.0,
            reduction='sum'),
        loss_xy=dict(
            type='CrossEntropyLoss',
            use_sigmoid=True,
            loss_weight=2.0,
            reduction='sum'),
        loss_wh=dict(type='MSELoss', loss_weight=2.0, reduction='sum')))
# training and testing settings
train_cfg = dict(
    assigner=dict(
        type='GridAssigner', pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0))
test_cfg = dict(
    nms_pre=1000,
    min_bbox_size=0,
    score_thr=0.05,
    conf_thr=0.005,
    nms=dict(type='nms', iou_thr=0.45),
    max_per_img=100)
# dataset settings
dataset_type = 'CocoDataset'
data_root = 'data/coco/'
img_norm_cfg = dict(mean=[0, 0, 0], std=[255., 255., 255.], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Expand',
        mean=img_norm_cfg['mean'],
        to_rgb=img_norm_cfg['to_rgb'],
        ratio_range=(1, 2)),
    dict(
        type='MinIoURandomCrop',
        min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
        min_crop_size=0.3),
    dict(type='Resize', img_scale=[(320, 320), (608, 608)], keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(608, 608),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=8,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline))
# optimizer
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=2000,  # same as burn-in in darknet
    warmup_ratio=0.1,
    step=[218, 246])
# runtime settings
total_epochs = 273
evaluation = dict(interval=1, metric=['bbox'])
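The `optimizer` and `lr_config` entries above describe a step schedule with a linear warmup. The sketch below shows one plausible reading of those numbers: the rate starts at `warmup_ratio` times the base rate, ramps linearly over the first 2000 iterations, and is multiplied by a decay factor at epochs 218 and 246. The decay factor `gamma=0.1` is an assumption, since the config does not set it, and `learning_rate` is a hypothetical helper rather than the training framework's scheduler.

```python
def learning_rate(epoch, iteration, base_lr=0.001, steps=(218, 246),
                  gamma=0.1, warmup_iters=2000, warmup_ratio=0.1):
    """Learning rate at a 0-based epoch and a global iteration count.

    `gamma` is assumed; the other defaults are copied from the config.
    """
    # Step decay: one factor of gamma for every milestone already reached.
    passed = sum(1 for step in steps if epoch >= step)
    regular_lr = base_lr * gamma ** passed
    # Linear warmup from warmup_ratio * lr up to the full rate.
    if iteration < warmup_iters:
        k = (1 - iteration / warmup_iters) * (1 - warmup_ratio)
        return regular_lr * (1 - k)
    return regular_lr


for epoch, iteration in [(0, 0), (0, 1000), (0, 2000), (218, 10**6), (246, 10**6)]:
    print(epoch, iteration, learning_rate(epoch, iteration))
```

With these assumptions the rate begins at a tenth of the base value, reaches 0.001 once warmup ends, and drops by a factor of ten at each of the two milestones before training stops at epoch 273.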
@@ -1,10 +1,11 @@
-from .anchor_generator import AnchorGenerator, LegacyAnchorGenerator
+from .anchor_generator import (AnchorGenerator, LegacyAnchorGenerator,
+                               YOLOAnchorGenerator)
 from .builder import ANCHOR_GENERATORS, build_anchor_generator
 from .point_generator import PointGenerator
 from .utils import anchor_inside_flags, calc_region, images_to_levels
 
 __all__ = [
     'AnchorGenerator', 'LegacyAnchorGenerator', 'anchor_inside_flags',
     'PointGenerator', 'images_to_levels', 'calc_region',
-    'build_anchor_generator', 'ANCHOR_GENERATORS'
+    'build_anchor_generator', 'ANCHOR_GENERATORS', 'YOLOAnchorGenerator'
 ]
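The newly exported `YOLOAnchorGenerator` is configured elsewhere in this commit with three `base_sizes` per level and `strides=[32, 16, 8]`. The sketch below illustrates what such a layout implies for a 608x608 input: one anchor per base size at every grid cell of every level. Placing each anchor at the centre of its cell is an assumption made for this illustration, and `level_anchors` is a hypothetical helper, not the generator's implementation.

```python
# Values copied from the YOLOv3 config in this commit.
base_sizes = [[(116, 90), (156, 198), (373, 326)],
              [(30, 61), (62, 45), (59, 119)],
              [(10, 13), (16, 30), (33, 23)]]
strides = [32, 16, 8]


def level_anchors(input_size, stride, sizes):
    """Return (x1, y1, x2, y2) anchors for one square feature level."""
    cells = input_size // stride
    anchors = []
    for row in range(cells):
        for col in range(cells):
            # Assumed placement: centre of the grid cell.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for w, h in sizes:
                anchors.append(
                    (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


per_level = [level_anchors(608, stride, sizes)
             for stride, sizes in zip(strides, base_sizes)]
counts = [len(anchors) for anchors in per_level]
print(counts, sum(counts))
```

The coarsest level (stride 32) carries the largest anchors on a 19x19 grid, while the finest level (stride 8) carries the smallest anchors on a 76x76 grid, so most candidate boxes come from the high-resolution level.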