[NeurIPS 2024 spotlight] Official implementation of MSFA and release of the SARDet_100K dataset for Large-Scale Synthetic Aperture Radar (SAR) Object Detection

zcablii/SARDet_100K

🔥🔥 SARDet-100K has been accepted at NeurIPS 2024 as a spotlight!! 🔥🔥


This repository now supports DenoDet!!

"DenoDet: Attention as Deformable Multi-Subspace Feature Denoising for Target Detection in SAR Images" at: https://arxiv.org/pdf/2406.02833


"SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection" at: https://arxiv.org/pdf/2403.06534.pdf

Yuxuan Li, Xiang Li*, Weijie Li, Qibin Hou, Li Liu, Ming-ming Cheng, Jian Yang*

李宇轩,李翔*,李玮杰,侯淇彬,刘丽,程明明,杨健*

Don't miss out as you pass by!

Facing the scientific research battlefield of 2024, do you feel unprecedented pressure in research? The major tasks in Computer Vision seem to have reached a saturation point, with the charts being dominated time and again by large models that have access to massive data and computational resources. For those of us still striving in academia, our resources always seem so insignificant, as if we could never compete with those giants monopolizing research. Every direction seems to be thoroughly explored, making it increasingly difficult to publish a paper. Are you, about to graduate, also anxiously looking for a research direction that truly belongs to you? How many times have we fantasized about going back to the night when ImageNet was released 15 years ago, when there was no chaotic dance of attention mechanisms, no models with billions of parameters, and every task was an uncharted blue ocean...

But today, my friend, let me tell you about a direction that once lingered in obscurity. Lacking large-scale datasets and with little open-source code, its development always seemed to lag behind. With the emergence of open-source large-scale datasets and comprehensive codebases, those problems have vanished like smoke. Now, as if struck by lightning and transported back several years, you have a largely unexplored blue ocean in front of you, waiting to be explored and conquered.

So, do you want to know what field I am talking about? Today, you don't need to pay 998, not even 98. Just give us a Star on GitHub, and you can take the SAR object detection gift pack home for free! Everything you need is here: from large-scale datasets to detailed implementation code, we have prepared it all for you. Yes, I am talking about the jewel in the crown of remote sensing target detection: SAR (Synthetic Aperture Radar) object detection! It might sound mysterious, but the potential and value of this field are well understood by those in the know. It not only matches current national strategic needs but also has a wide range of applications and limitless possibilities in both the scientific and industrial communities. In recent years, publishing articles in the SAR detection field has become easier, indicating the rapid development of this field and its thirst for new ideas and technologies. Join the ranks of SAR object detection now! It's like Bitcoin 15 years ago or real estate 20 years ago: getting involved now is something you won't regret! Let's gallop together across this vast blue ocean, explore its unknowns, and jointly tap into its limitless possibilities!

MSFA

Abstract

Synthetic Aperture Radar (SAR) object detection has gained significant attention recently due to its irreplaceable all-weather imaging capabilities. However, this research field suffers from both limited public datasets (mostly comprising <2K images with only mono-category objects) and inaccessible source code. To tackle these challenges, we establish a new benchmark dataset and an open-source method for large-scale SAR object detection. Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets, providing a large-scale and diverse dataset for research purposes. To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created. With this high-quality dataset, we conducted comprehensive experiments and uncovered a crucial challenge in SAR object detection: the substantial disparities between the pretraining on RGB datasets and finetuning on SAR datasets in terms of both data domain and model structure. To bridge these gaps, we propose a novel Multi-Stage with Filter Augmentation (MSFA) pretraining framework that tackles the problems from the perspective of data input, domain transition, and model migration. The proposed MSFA method significantly enhances the performance of SAR object detection models while demonstrating exceptional generalizability and flexibility across diverse models. This work aims to pave the way for further advancements in SAR object detection.

Introduction

This repository is the official site for "SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection" at: https://arxiv.org/pdf/2403.06534.pdf

DATASET DOWNLOAD at:

(Train, Val, Test)

Model Weights DOWNLOAD at:

(Only Train and Val sets are released so far.)

Image- and instance-level statistics of the SARDet-100K dataset. *: original datasets are cropped into 512 × 512 patches.

| Dataset | Images: Train | Images: Val | Images: Test | Images: ALL | Instances: Train | Instances: Val | Instances: Test | Instances: ALL | Ins/Img |
|---|---|---|---|---|---|---|---|---|---|
| AIR_SARShip 1* | 438 | 23 | 40 | 501 | 816 | 33 | 209 | 1,058 | 2.11 |
| AIR_SARShip 2 | 270 | 15 | 15 | 300 | 1,819 | 127 | 94 | 2,040 | 6.80 |
| HRSID | 3,642 | 981 | 981 | 5,604 | 11,047 | 2,975 | 2,947 | 16,969 | 3.03 |
| MSAR* | 27,159 | 1,479 | 1,520 | 30,158 | 58,988 | 3,091 | 3,123 | 65,202 | 2.16 |
| SADD | 795 | 44 | 44 | 883 | 6,891 | 448 | 496 | 7,835 | 8.87 |
| SAR-AIRcraft* | 13,976 | 1,923 | 2,989 | 18,888 | 27,848 | 4,631 | 5,996 | 38,475 | 2.04 |
| ShipDataset | 31,784 | 3,973 | 3,972 | 39,729 | 40,761 | 5,080 | 5,044 | 50,885 | 1.28 |
| SSDD | 928 | 116 | 116 | 1,160 | 2,041 | 252 | 294 | 2,587 | 2.23 |
| OGSOD | 14,664 | 1,834 | 1,833 | 18,331 | 38,975 | 4,844 | 4,770 | 48,589 | 2.65 |
| SIVED | 837 | 104 | 103 | 1,044 | 9,561 | 1,222 | 1,230 | 12,013 | 11.51 |
| SARDet-100K | 94,493 | 10,492 | 11,613 | 116,598 | 198,747 | 22,703 | 24,203 | 245,653 | 2.11 |

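
The Ins/Img column is the number of annotated instances divided by the number of images. As a quick sanity check, the sketch below recomputes the overall totals and the overall ratio from the per-dataset ALL columns of the table above. The names `STATS` and `summarize` are illustrative and are not part of this repository.

```python
# Per-dataset (images, instances) totals, copied from the ALL columns above.
STATS = {
    "AIR_SARShip 1": (501, 1_058),
    "AIR_SARShip 2": (300, 2_040),
    "HRSID": (5_604, 16_969),
    "MSAR": (30_158, 65_202),
    "SADD": (883, 7_835),
    "SAR-AIRcraft": (18_888, 38_475),
    "ShipDataset": (39_729, 50_885),
    "SSDD": (1_160, 2_587),
    "OGSOD": (18_331, 48_589),
    "SIVED": (1_044, 12_013),
}


def summarize(stats):
    """Return (total images, total instances, instances per image)."""
    images = sum(n_img for n_img, _ in stats.values())
    instances = sum(n_ins for _, n_ins in stats.values())
    return images, instances, round(instances / images, 2)


if __name__ == "__main__":
    print(summarize(STATS))
```

The result should match the ALL and Ins/Img entries of the SARDet-100K row.
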
SARDet-100K source datasets information. GF-3: Gaofen-3, S-1: Sentinel-1. Target categories S: ship, A: aircraft, C: car, B: bridge, H: harbour, T: tank.

| Datasets | Target | Res. (m) | Band | Polarization | Satellites |
|---|---|---|---|---|---|
| AIR_SARShip | S | 1, 3 | C | VV | GF-3 |
| HRSID | S | 0.5~3 | C/X | HH, HV, VH, VV | S-1B, TerraSAR-X, TanDEM-X |
| MSAR | A, T, B, S | < 1 | C | HH, HV, VH, VV | HISEA-1 |
| SADD | A | 0.5~3 | X | HH | TerraSAR-X |
| SAR-AIRcraft | A | 1 | C | Uni-polar | GF-3 |
| ShipDataset | S | 3~25 | C | HH, VV, VH, HV | S-1, GF-3 |
| SSDD | S | 1~15 | C/X | HH, VV, VH, HV | S-1, RadarSat-2, TerraSAR-X |
| OGSOD | B, H, T | 3 | C | VV/VH | GF-3 |
| SIVED | C | 0.1, 0.3 | Ka, Ku, X | VV/HH | Airborne SAR synthetic slice |

MSFA: Multi-Stage with Filter Augmentation pretraining framework

Introduction

This repository is the official implementation of the Multi-Stage with Filter Augmentation (MSFA) pretraining framework in "SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection".

The Filter Augmentation code is in MSFA/msfa/models/backbones/MSFA.py, and the SARDet-100K dataset code is in MSFA/msfa/datasets/SAR_Det.py. The training and testing configuration files used in the main paper are in local_configs.
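
To give a rough feel for what a filter-augmented input might look like, here is a minimal NumPy sketch that stacks the raw SAR pixels with a handcrafted filter response along a channel axis. It is an assumption-laden illustration, not the code in MSFA.py: the Sobel gradient magnitude is only a stand-in for whatever filter the repository actually applies, and the names `sobel_magnitude` and `filter_augment` are hypothetical.

```python
import numpy as np


def sobel_magnitude(img: np.ndarray) -> np.ndarray:
    """Gradient magnitude from 3x3 Sobel kernels (edge-padded, same size as input)."""
    p = np.pad(img.astype(np.float32), 1, mode="edge")
    # Horizontal response: right column minus left column, centre row weighted by 2.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]
    )
    # Vertical response: bottom row minus top row, centre column weighted by 2.
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (
        p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]
    )
    return np.hypot(gx, gy)


def filter_augment(img: np.ndarray) -> np.ndarray:
    """Stack raw pixels with a handcrafted filter response (stand-in: Sobel)."""
    raw = img.astype(np.float32)
    edges = sobel_magnitude(raw)
    # Rescale the filter response to the raw pixel range so channels are comparable.
    if edges.max() > 0:
        edges = edges * (raw.max() / edges.max())
    return np.stack([raw, edges], axis=0)  # shape: (2, H, W)


if __name__ == "__main__":
    demo = np.zeros((8, 8), dtype=np.float32)
    demo[:, 4:] = 255.0  # a vertical step edge
    print(filter_augment(demo).shape)
```

The point of the sketch is only the data flow: the detector receives the original single-channel image plus extra channels derived from it, rather than raw pixels alone.
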

Results and models

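Comparison of different pretraining strategies using Faster R-CNN with a ResNet-50 backbone as the detection model.

| Model Input | Pretrain: Multi-stage Dataset | Pretrain: Component | mAP | Config | Weight |
|---|---|---|---|---|---|
| SAR (Raw pixels) | ImageNet | Backbone | 49.0 | config | weight |
| SAR (Raw pixels) | ImageNet + DIOR | Framework | 49.5 | config | weight |
| SAR (Raw pixels) | ImageNet + DOTA | Backbone | 49.3 | config | weight |
| SAR (Raw pixels) | ImageNet + DOTA | Framework | 50.2 | config | weight |
| SAR+WST (Filter Augmented) | ImageNet | Backbone | 49.2 | config | weight |
| SAR+WST (Filter Augmented) | ImageNet + DIOR | Framework | 50.1 | config | weight |
| SAR+WST (Filter Augmented) | ImageNet + DOTA | Backbone | 49.6 | config | weight |
| SAR+WST (Filter Augmented) | ImageNet + DOTA | Framework | 51.1 | config | weight |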

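Generalization of MSFA on different detection frameworks. IMP: Traditional ImageNet Pretrain on backbone network only. All metrics are reported on the test set.

| Category | Framework | Pretrain/Model | mAP | @50 | @75 | @s | @m | @l | Config | Weight |
|---|---|---|---|---|---|---|---|---|---|---|
| Two Stage | Faster RCNN | IMP | 49.0 | 82.2 | 52.9 | 43.5 | 60.6 | 55.0 | config | weight |
| Two Stage | Faster RCNN | MSFA | 51.1 (+2.1) | 83.9 | 54.7 | 45.2 | 62.3 | 57.5 | config | weight |
| Two Stage | Cascade RCNN | IMP | 51.1 | 81.9 | 55.8 | 44.9 | 62.9 | 60.3 | config | weight |
| Two Stage | Cascade RCNN | MSFA | 53.9 (+2.8) | 83.4 | 59.8 | 47.2 | 66.1 | 63.2 | config | weight |
| Two Stage | Grid RCNN | IMP | 48.8 | 79.1 | 52.9 | 42.4 | 61.9 | 55.5 | config | weight |
| Two Stage | Grid RCNN | MSFA | 51.5 (+2.7) | 81.7 | 56.3 | 45.1 | 64.1 | 60.0 | config | weight |
| Single Stage | RetinaNet | IMP | 47.4 | 79.3 | 49.7 | 40.0 | 59.2 | 57.5 | config | weight |
| Single Stage | RetinaNet | MSFA | 49.0 (+1.6) | 80.1 | 52.6 | 41.3 | 61.1 | 59.4 | config | weight |
| Single Stage | GFL | IMP | 49.8 | 80.9 | 53.3 | 42.3 | 62.4 | 58.1 | config | weight |
| Single Stage | GFL | MSFA | 53.7 (+3.9) | 84.2 | 57.8 | 47.8 | 66.2 | 59.5 | config | weight |
| Single Stage | GFL | DenoDet | 55.4 (+5.6) | 84.7 | 58.3 | 49.5 | 67.6 | 63.2 | config | weight |
| Single Stage | FCOS | IMP | 46.5 | 80.9 | 49.0 | 41.1 | 59.2 | 50.4 | config | weight |
| Single Stage | FCOS | MSFA | 48.5 (+2.0) | 82.1 | 51.4 | 42.9 | 60.4 | 56.0 | config | weight |
| End to End | DETR | IMP | 31.8 | 62.3 | 30.0 | 22.2 | 44.9 | 41.1 | config | weight |
| End to End | DETR | MSFA | 47.2 (+15.4) | 77.5 | 49.8 | 37.9 | 62.9 | 58.2 | config | weight |
| End to End | Deformable DETR | IMP | 50.0 | 85.1 | 51.7 | 44.0 | 65.1 | 61.2 | config | weight |
| End to End | Deformable DETR | MSFA | 51.3 (+1.3) | 85.3 | 54.0 | 44.9 | 65.6 | 61.7 | config | weight |
| End to End | Sparse RCNN | IMP | 38.1 | 68.8 | 38.8 | 29.0 | 51.3 | 48.7 | config | weight |
| End to End | Sparse RCNN | MSFA | 41.4 (+3.3) | 74.1 | 41.8 | 33.6 | 53.9 | 53.4 | config | weight |
| End to End | Dab-DETR | IMP | 45.9 | 79.0 | 47.9 | 38.0 | 61.1 | 55.0 | config | weight |
| End to End | Dab-DETR | MSFA | 48.2 (+2.3) | 81.1 | 51.0 | 41.2 | 63.1 | 55.4 | config | weight |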
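
The values in parentheses are the mAP gain of each MSFA row over the IMP row of the same framework. A small sketch of that arithmetic, using a few (IMP, MSFA) pairs copied from the table above, follows; the names `RESULTS` and `gains` are illustrative only.

```python
# (IMP mAP, MSFA mAP) pairs copied from the table above.
RESULTS = {
    "Faster RCNN": (49.0, 51.1),
    "Cascade RCNN": (51.1, 53.9),
    "RetinaNet": (47.4, 49.0),
    "DETR": (31.8, 47.2),
}


def gains(results):
    """mAP improvement of MSFA over IMP, rounded to one decimal place."""
    return {name: round(msfa - imp, 1) for name, (imp, msfa) in results.items()}


if __name__ == "__main__":
    for name, delta in gains(RESULTS).items():
        print(f"{name}: +{delta}")
```
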

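Generalization of MSFA on different detection backbones. IMP: Traditional ImageNet Pretrain on backbone network only. #P(M): number of parameters in millions. All metrics are reported on the test set.

| Backbone | #P(M) | Pretrain | mAP | @50 | @75 | @s | @m | @l | Config | Weight |
|---|---|---|---|---|---|---|---|---|---|---|
| R50 | 25.6 | IMP | 49.0 | 82.2 | 52.9 | 43.5 | 60.6 | 55.0 | config | weight |
| R50 | 25.6 | MSFA | 51.1 (+2.1) | 83.9 | 54.7 | 45.2 | 62.3 | 57.5 | config | weight |
| R101 | 44.7 | IMP | 51.2 | 84.1 | 55.6 | 45.9 | 61.9 | 56.3 | config | weight |
| R101 | 44.7 | MSFA | 52.0 (+0.8) | 84.6 | 56.6 | 46.6 | 63.4 | 57.7 | config | weight |
| R152 | 60.2 | IMP | 51.9 | 85.2 | 55.9 | 46.4 | 62.5 | 57.9 | config | weight |
| R152 | 60.2 | MSFA | 52.4 (+0.5) | 85.4 | 57.2 | 47.4 | 63.3 | 58.7 | config | weight |
| ConvNext-T | 28.6 | IMP | 53.2 | 86.3 | 58.1 | 47.2 | 65.2 | 59.6 | config | weight |
| ConvNext-T | 28.6 | MSFA | 54.8 (+1.6) | 87.1 | 59.8 | 48.8 | 66.7 | 62.1 | config | weight |
| ConvNext-S | 50.1 | IMP | 54.2 | 87.8 | 59.2 | 49.2 | 65.8 | 59.8 | config | weight |
| ConvNext-S | 50.1 | MSFA | 55.4 (+1.2) | 87.6 | 60.7 | 50.1 | 67.1 | 61.3 | config | weight |
| ConvNext-B | 88.6 | IMP | 55.1 | 87.8 | 59.5 | 48.9 | 66.9 | 61.1 | config | weight |
| ConvNext-B | 88.6 | MSFA | 56.4 (+1.3) | 88.2 | 61.5 | 51.1 | 68.3 | 62.4 | config | weight |
| VAN-T | 4.1 | IMP | 45.8 | 79.8 | 48.0 | 38.6 | 57.9 | 53.3 | config | weight |
| VAN-T | 4.1 | MSFA | 47.6 (+1.8) | 81.4 | 50.6 | 40.5 | 59.4 | 56.7 | config | weight |
| VAN-S | 13.9 | IMP | 49.5 | 83.8 | 52.8 | 43.2 | 61.6 | 56.4 | config | weight |
| VAN-S | 13.9 | MSFA | 51.5 (+2.0) | 85.0 | 55.6 | 44.8 | 63.4 | 60.4 | config | weight |
| VAN-B | 26.6 | IMP | 53.5 | 86.8 | 58.0 | 47.3 | 65.5 | 60.6 | config | weight |
| VAN-B | 26.6 | MSFA | 55.1 (+1.6) | 87.7 | 60.2 | 48.8 | 67.3 | 62.2 | config | weight |
| Swin-T | 28.3 | IMP | 48.4 | 83.5 | 50.8 | 42.8 | 59.7 | 55.7 | config | weight |
| Swin-T | 28.3 | MSFA | 50.2 (+1.8) | 84.1 | 53.9 | 44.1 | 61.3 | 58.8 | config | weight |
| Swin-S | 49.6 | IMP | 53.1 | 87.3 | 57.8 | 47.4 | 63.9 | 60.6 | config | weight |
| Swin-S | 49.6 | MSFA | 54.0 (+0.9) | 87.0 | 59.2 | 48.2 | 64.5 | 61.9 | config | weight |
| Swin-B | 87.8 | IMP | 53.8 | 87.8 | 59.0 | 49.1 | 64.6 | 60.0 | config | weight |
| Swin-B | 87.8 | MSFA | 55.7 (+1.9) | 87.8 | 61.4 | 50.5 | 66.5 | 62.5 | config | weight |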

Installation

Our code depends on PyTorch, MMCV and MMDetection. Below are quick steps for installation. Please refer to the Install Guide for more detailed instructions.

# change directory into the project main code
cd MSFA

# create env
conda create -y -n MSFA python=3.8
conda activate MSFA

# install pytorch
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
# or 
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

# install dependencies of openmmlab
pip install -U openmim
mim install "mmengine==0.8.4"
mim install "mmcv==2.0.1"
mim install "mmdet==3.1.0"

# install other dependencies
pip install -r requirements.txt

# install MSFA
pip install -v -e .
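
After installation, it can be useful to confirm that the environment matches the versions pinned above. The following is a minimal sketch using only the Python standard library. The helper `check_versions` is hypothetical and not part of this repository, and it assumes the packages are registered under the distribution names `torch`, `mmengine`, `mmcv` and `mmdet`.

```python
from importlib import metadata

# Versions pinned in the installation steps above.
PINNED = {"torch": "2.0.1", "mmengine": "0.8.4", "mmcv": "2.0.1", "mmdet": "3.1.0"}


def check_versions(pinned=PINNED):
    """Map each package to (installed version or None, whether it matches the pin)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Local build tags such as "+cu118" are ignored when comparing.
        ok = found is not None and found.split("+")[0] == wanted
        report[name] = (found, ok)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_versions().items():
        print(f"{name}: installed={found} matches_pin={ok}")
```

A package reported as `None` is not installed in the active environment; a mismatch usually means a different version was pulled in by another dependency.
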

Get Started

Please see get_started.md for the basic usage of MMDetection.

Acknowledgement

We extend our deepest gratitude to Bo Zhang, Chenglong Li, Tian Tian, Tianwen Zhang, Xiaoling Zhang (ordered alphabetically by first name) and numerous other researchers for permitting us to integrate their datasets. Their contributions have significantly advanced and promoted research in this field.

Citation

If you use this toolbox or benchmark in your research, please cite this project.

@inproceedings{li2024sardet100k,
	title={SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection}, 
	author={Yuxuan Li and Xiang Li and Weijie Li and Qibin Hou and Li Liu and Ming-Ming Cheng and Jian Yang},
	year={2024},
	booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS)},
}

@article{dai2024denodet,
	title={DenoDet: Attention as Deformable Multi-Subspace Feature Denoising for Target Detection in SAR Images},
	author={Dai, Yimian and Zou, Minrui and Li, Yuxuan and Li, Xiang and Ni, Kang and Yang, Jian},
	journal={arXiv preprint arXiv:2406.02833},
	year={2024}
}

License

This project is released under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
