YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-Time Object Detection

FishAndWasabi/YOLO-MS

Python 3.8 | PyTorch 1.12.1 | docs

This repository contains the official implementation of the following paper:

YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-Time Object Detection
Yuming Chen, Xinbin Yuan, Ruiqi Wu, Jiabao Wang, Qibin Hou, Ming-Ming Cheng
Under review

[Homepage (TBD)] [Paper] [Zhihu (TBD)] [集智书童] [Poster (TBD)] [Video (TBD)]

✨ News

Future work can be found in todo.md.

  • Aug, 2023: Our code is publicly available!

🛠️ Dependencies and Installation

We provide a simple script, install.sh, for installation. Refer to install.md for more details.

  1. Clone and enter the repo.

    git clone https://github.com/FishAndWasabi/YOLO-MS.git
    cd YOLO-MS
  2. Run install.sh.

    bash install.sh
  3. Activate your environment!

    conda activate YOLO-MS
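After activating the environment, a quick sanity check can confirm that the expected packages are importable. The sketch below is only an illustration: the package list is an assumption (this README does not list what install.sh installs), so adjust it to match install.md.

```python
import importlib.util
import sys

# ASSUMPTION: these package names are illustrative; check install.md for the
# actual dependencies installed by install.sh.
ASSUMED_PACKAGES = ("torch", "mmengine", "mmcv", "mmdet", "mmyolo")


def check_environment(packages=ASSUMED_PACKAGES):
    """Return {name: bool} telling whether each top-level package can be found."""
    report = {}
    for name in packages:
        # find_spec returns None when a top-level package is not importable.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    for name, found in check_environment().items():
        print(f"{name:10s} {'found' if found else 'MISSING'}")
```

Run it inside the activated YOLO-MS environment; any package reported as MISSING suggests the installation did not complete.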

👼 Quick Demo

python demo/image_demo.py ${IMAGE_PATH} ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]

# for sam output
python demo/sam_demo.py ${IMAGE_PATH} ${CONFIG_FILE} ${CHECKPOINT_FILE} --sam_size ${SAM_MODEL_SIZE} --sam_model ${SAM_MODEL_PATH}

Run python demo/image_demo.py --help for detailed information about this script.

Detailed arguments
positional arguments:
  img                   Image path, include image file, dir and URL.
  config                Config file
  checkpoint            Checkpoint file

optional arguments:
  -h, --help            show this help message and exit
  --out-dir OUT_DIR     Path to output file
  --device DEVICE       Device used for inference
  --show                Show the detection results
  --deploy              Switch model to deployment mode
  --tta                 Whether to use test time augmentation
  --score-thr SCORE_THR
                        Bbox score threshold
  --class-name CLASS_NAME [CLASS_NAME ...]
                        Only Save those classes if set
  --to-labelme          Output labelme style label file
  
  --sam_size            Default: vit_h, Optional: vit_l, vit_b
  --sam_model           Path of the sam model checkpoint
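When running the demo over many images, it can help to assemble the command from Python. The following is a minimal sketch that only uses flags listed in the help text above; the image, config, and checkpoint paths are hypothetical placeholders, not files shipped with the repo.

```python
import shlex


def build_demo_command(image, config, checkpoint, out_dir=None, device=None,
                       score_thr=None, show=False, tta=False):
    """Build the argv list for demo/image_demo.py from the documented flags."""
    cmd = ["python", "demo/image_demo.py", image, config, checkpoint]
    if out_dir is not None:
        cmd += ["--out-dir", out_dir]
    if device is not None:
        cmd += ["--device", device]
    if score_thr is not None:
        cmd += ["--score-thr", str(score_thr)]
    if show:
        cmd.append("--show")
    if tta:
        cmd.append("--tta")
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_demo_command("demo/example.jpg", "configs/example_config.py",
                         "work_dirs/example.pth", out_dir="output", score_thr=0.3)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the repository root.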

🤖 Training and Evaluation

  1. Training

    1.1 Single GPU

    python tools/train.py ${CONFIG_FILE} [optional arguments]

    1.2 Multi GPU

    CUDA_VISIBLE_DEVICES=x bash tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

    Run python tools/train.py --help for detailed information about this script.

    Detailed arguments
    positional arguments:
    config                train config file path
    
    optional arguments:
    -h, --help            show this help message and exit
    --work-dir WORK_DIR   the dir to save logs and models
    --amp                 enable automatic-mixed-precision training
    --resume [RESUME]     If specify checkpoint path, resume from it, while if not specify, try to auto resume from the latest checkpoint in the work directory.
    --cfg-options CFG_OPTIONS [CFG_OPTIONS ...]
                            override some settings in the used config, the key-value pair in xxx=yyy format will be merged into config file. If the value to be overwritten is a list, it should be like key="[a,b]" or key=a,b It also allows nested
                            list/tuple values, e.g. key="[(a,b),(c,d)]" Note that the quotation marks are necessary and that no white space is allowed.
    --launcher {none,pytorch,slurm,mpi}
                            job launcher
    --local_rank LOCAL_RANK
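The --cfg-options format described above (key=value pairs, lists as [a,b], nested tuples as [(a,b),(c,d)], no white space) can be produced programmatically. This is a rough sketch of one way to do it; the config keys in the example are hypothetical and may not exist in any YOLO-MS config.

```python
def format_cfg_options(overrides):
    """Turn a flat {dotted_key: value} dict into --cfg-options tokens."""

    def render(value):
        # Lists and tuples are rendered without any white space, as required.
        if isinstance(value, list):
            return "[" + ",".join(render(v) for v in value) + "]"
        if isinstance(value, tuple):
            return "(" + ",".join(render(v) for v in value) + ")"
        return str(value)

    return [f"{key}={render(value)}" for key, value in overrides.items()]


# Hypothetical keys, for illustration only.
tokens = format_cfg_options({
    "train_dataloader.batch_size": 8,
    "example.ratio_range": [0.5, 1.5],
    "example.pairs": [(1, 2), (3, 4)],
})
print(tokens)
```

When the tokens are passed as separate items of an argument list (for example to subprocess.run), no shell quoting is needed; when typed in a shell, the list values must be quoted as the help text notes.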
    
  2. Evaluation

    2.1 Single GPU

    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]

    2.2 Multi GPU

    CUDA_VISIBLE_DEVICES=x bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [optional arguments]

    Run python tools/test.py --help for detailed information about this script.

    Detailed arguments
    positional arguments:
    config                test config file path
    checkpoint            checkpoint file
    
    optional arguments:
    -h, --help            show this help message and exit
    --work-dir WORK_DIR   the directory to save the file containing evaluation metrics
    --out OUT             output result file (must be a .pkl file) in pickle format
    --json-prefix JSON_PREFIX
                            the prefix of the output json file without perform evaluation, which is useful when you want to format the result to a specific format and submit it to the test server
    --tta                 Whether to use test time augmentation
    --show                show prediction results
    --deploy              Switch model to deployment mode
    --show-dir SHOW_DIR   directory where painted images will be saved. If specified, it will be automatically saved to the work_dir/timestamp/show_dir
    --wait-time WAIT_TIME
                            the interval of show (s)
    --cfg-options CFG_OPTIONS [CFG_OPTIONS ...]
                            override some settings in the used config, the key-value pair in xxx=yyy format will be merged into config file. If the value to be overwritten is a list, it should be like key="[a,b]" or key=a,b It also allows nested
                            list/tuple values, e.g. key="[(a,b),(c,d)]" Note that the quotation marks are necessary and that no white space is allowed.
    --launcher {none,pytorch,slurm,mpi}
                            job launcher
    --local_rank LOCAL_RANK
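For the multi-GPU commands, CUDA_VISIBLE_DEVICES and ${GPU_NUM} are set separately. The sketch below keeps them in step by deriving both from one list of GPU ids. It assumes that GPU_NUM should equal the number of visible devices, which this README does not state explicitly, and the config and checkpoint names are placeholders.

```python
import os


def dist_launch(script, gpu_ids, positional, extra=()):
    """Return (env, cmd) for tools/dist_train.sh or tools/dist_test.sh."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    # GPU_NUM follows the positional arguments, then any optional arguments.
    cmd = ["bash", script, *positional, str(len(gpu_ids)), *extra]
    return env, cmd


# Placeholder file names, for illustration only.
env, cmd = dist_launch("tools/dist_test.sh", [0, 1],
                       ["example_config.py", "example.pth"], ["--tta"])
print(env["CUDA_VISIBLE_DEVICES"], cmd)
```

The pair can then be used as subprocess.run(cmd, env=env, check=True). For training, pass only the config file in positional.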
    
  3. Deployment

# Build docker images
docker build docker/mmdeploy/ -t mmdeploy:inside --build-arg USE_SRC_INSIDE=true
# Run docker container
docker run --gpus all --name mmdeploy_yoloms -dit mmdeploy:inside
# Convert ${CONFIG_FILE}
python tools/misc/print_config.py ${O_CONFIG_FILE} --save-path ${CONFIG_FILE}

# Copy local file into docker container
docker cp deploy.sh mmdeploy_yoloms:/root/workspace
docker cp ${DEPLOY_CONFIG_FILE}  mmdeploy_yoloms:/root/workspace/${DEPLOY_CONFIG_FILE}
docker cp ${CONFIG_FILE} mmdeploy_yoloms:/root/workspace/${CONFIG_FILE}
docker cp ${CHECKPOINT_FILE} mmdeploy_yoloms:/root/workspace/${CHECKPOINT_FILE}

# Start docker container
docker start mmdeploy_yoloms
# Attach docker container
docker attach mmdeploy_yoloms

# Run the deployment shell
sh deploy.sh ${DEPLOY_CONFIG_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} ${SAVE_DIR}
# Copy the results to local
docker cp mmdeploy_yoloms:/root/workspace/${SAVE_DIR} ${SAVE_DIR}
  • DEPLOY_CONFIG_FILE: Config file for deployment.
  • O_CONFIG_FILE: Original config file of model.
  • CONFIG_FILE: Converted config file of model.
  • CHECKPOINT_FILE: Checkpoint of model.
  • SAVE_DIR: Save dir.
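The copy steps above all map a local path to the same relative path under /root/workspace in the mmdeploy_yoloms container. A small sketch of that mapping follows; the helper names and example file names are hypothetical.

```python
import posixpath

CONTAINER = "mmdeploy_yoloms"   # container name used in the commands above
WORKSPACE = "/root/workspace"   # workspace path used in the commands above


def docker_cp_in(local_path):
    """Build the docker cp argv that copies a relative local path into the workspace."""
    if posixpath.isabs(local_path):
        raise ValueError("expected a path relative to the repository root")
    target = posixpath.join(WORKSPACE, local_path)
    return ["docker", "cp", local_path, f"{CONTAINER}:{target}"]


def docker_cp_out(save_dir):
    """Build the docker cp argv that copies the results back to the host."""
    source = posixpath.join(WORKSPACE, save_dir)
    return ["docker", "cp", f"{CONTAINER}:{source}", save_dir]


# Hypothetical file names, for illustration only.
print(docker_cp_in("configs/example_config.py"))
print(docker_cp_out("deploy_results"))
```

If a path contains subdirectories, the matching parent directory may need to exist inside the container before copying.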

  4. Test FPS

    4.1 Deployed Model

    # Copy local file into docker container
    docker cp ${DATA_DIR} mmdeploy_yoloms:/root/workspace/${DATA_DIR}
    docker cp fps.sh mmdeploy_yoloms:/root/workspace
    # Start docker container
    docker start mmdeploy_yoloms
    # Attach docker container
    docker attach mmdeploy_yoloms
    # In docker container
    # Run the FPS shell
    python mmdeploy/tools/profiler.py ${DEPLOY_CONFIG_FILE} \
                                      ${CONFIG_FILE} \
                                      ${DATASET} \
                                      --model ${PROFILER_MODEL} \
                                      --device ${DEVICE}

    4.2 Undeployed Model

    python tools/analysis_tools/benchmark.py ${CONFIG_FILE} --checkpoint ${CHECKPOINT_FILE} [optional arguments]

  5. Test FLOPs and Params

python tools/analysis_tools/get_flops.py ${CONFIG_FILE} --shape 640 640 [optional arguments]

๐Ÿก Model Zoo ๐Ÿ”

1. YOLO-MS
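
| Model | Resolution | Epoch | Params(M) | FLOPs(G) | $AP$ | $AP_s$ | $AP_m$ | $AP_l$ | Config | 🔗 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| XS | 640 | 300 | 4.5 | 8.7 | 43.1 | 24.0 | 47.8 | 59.1 | [config] | [model] |
| XS* | 640 | 300 | 4.5 | 8.7 | 43.4 | 23.7 | 48.3 | 60.3 | [config] | [model] |
| S | 640 | 300 | 8.1 | 15.6 | 46.2 | 27.5 | 50.6 | 62.9 | [config] | [model] |
| S* | 640 | 300 | 8.1 | 15.6 | 46.2 | 26.9 | 50.5 | 63.0 | [config] | [model] |
| - | 640 | 300 | 22.0 | 40.1 | 50.8 | 33.2 | 54.8 | 66.4 | [config] | [model] |
| -* | 640 | 300 | 22.2 | 40.1 | 50.8 | 33.2 | 54.8 | 66.4 | [config] | [model] |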

* denotes models with SE attention.

2. YOLOv6
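
| Model | Resolution | Epoch | Params(M) | FLOPs(G) | $AP$ | $AP_s$ | $AP_m$ | $AP_l$ | Config | 🔗 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| t | 640 | 400 | 9.7 | 12.4 | 41.0 | 21.2 | 45.7 | 57.7 | [config] | [model] |
| t-MS | 640 | 400 | 8.1 | 9.6 | 43.5 (+2.5) | 26.0 | 48.3 | 57.8 | [config] | [model] |

3. YOLOv8

| Model | Resolution | Epoch | Params(M) | FLOPs(G) | $AP$ | $AP_s$ | $AP_m$ | $AP_l$ | Config | 🔗 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| n | 640 | 500 | 2.9 | 4.4 | 37.2 | 18.9 | 40.5 | 52.5 | [config] | [model] |
| n-MS | 640 | 500 | 2.9 | 4.4 | 40.3 (+3.1) | 22.0 | 44.6 | 53.7 | [config] | [model] |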
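
The (+2.5) and (+3.1) gains in the YOLOv6 and YOLOv8 tables are differences in AP against the baseline row. As a small worked example, the snippet below recomputes them from the table values and also reports the relative change in parameters and FLOPs. The row labels and helper name are made up for this illustration.

```python
# (AP, Params(M), FLOPs(G)) copied from the YOLOv6 and YOLOv8 tables above.
ROWS = {
    "YOLOv6-t": (41.0, 9.7, 12.4),
    "YOLOv6-t-MS": (43.5, 8.1, 9.6),
    "YOLOv8-n": (37.2, 2.9, 4.4),
    "YOLOv8-n-MS": (40.3, 2.9, 4.4),
}


def compare(baseline, variant):
    """Return the AP gain and the relative change in params and FLOPs."""
    b_ap, b_params, b_flops = ROWS[baseline]
    v_ap, v_params, v_flops = ROWS[variant]
    return {
        "ap_gain": round(v_ap - b_ap, 1),
        "params_change_pct": round(100 * (v_params - b_params) / b_params, 1),
        "flops_change_pct": round(100 * (v_flops - b_flops) / b_flops, 1),
    }


print(compare("YOLOv6-t", "YOLOv6-t-MS"))
print(compare("YOLOv8-n", "YOLOv8-n-MS"))
```

For YOLOv6-t the MS variant gains 2.5 AP with about 16.5% fewer parameters and 22.6% fewer FLOPs; for YOLOv8-n it gains 3.1 AP at the same reported parameter and FLOP counts.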

๐Ÿ—๏ธ Supported Tasks ๐Ÿ”

  • Object Detection
  • Instance Segmentation (TBD)
  • Rotated Object Detection (TBD)
  • Object Tracking (TBD)
  • Detection in Crowded Scene (TBD)
  • Small Object Detection (TBD)

📖 Citation

If you find our repo useful for your research, please cite us:

@misc{chen2023yoloms,
      title={YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection},
      author={Yuming Chen and Xinbin Yuan and Ruiqi Wu and Jiabao Wang and Qibin Hou and Ming-Ming Cheng},
      year={2023},
      eprint={2308.05480},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

This project is based on the open source codebase MMYOLO.

@misc{mmyolo2022,
    title={{MMYOLO: OpenMMLab YOLO} series toolbox and benchmark},
    author={MMYOLO Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmyolo}},
    year={2022}
}

📜 License

Licensed under the Creative Commons Attribution-NonCommercial 4.0 International license, for non-commercial use only. Any commercial use requires formal permission first.

📮 Contact

For technical questions, please contact chenyuming[AT]mail.nankai.edu.cn. For commercial licensing, please contact cmm[AT]nankai.edu.cn and andrewhoux[AT]gmail.com.

🤝 Acknowledgement

This repo is modified from the open-source real-time object detection codebase MMYOLO. The README is adapted from those of LED and CrossKD.
