This repo provides code for semi-supervised training of large-scale semantic segmentation models on the ImageNet-S dataset.
Built on ImageNet, the ImageNet-S dataset contains 1.2 million training images and 50k high-quality semantic segmentation annotations to support unsupervised and semi-supervised semantic segmentation on ImageNet. The dataset is available from the ImageNet-S project; for more details, please refer to the project page or the paper.
- Semi-supervised finetuning with pre-trained checkpoints
```shell
python -m torch.distributed.launch --nproc_per_node=8 main_segfinetune.py \
    --accum_iter 1 \
    --batch_size 32 \
    --model vit_small_patch16 \
    --finetune ${PRETRAIN_CHKPT} \
    --epochs 100 \
    --nb_classes 920 | 301 | 51 \
    --blr 5e-4 --layer_decay 0.50 \
    --weight_decay 0.05 --drop_path 0.1 \
    --data_path ${IMAGENETS_DIR} \
    --output_dir ${OUTPATH} \
    --dist_eval
```
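The placeholders in the command refer to your pre-trained checkpoint, the ImageNet-S data root, and an output directory. They can be set beforehand; the paths below are illustrative assumptions, not repository defaults:

```shell
# Illustrative paths -- adjust to your setup.
PRETRAIN_CHKPT=./pretrained/mae_vit_small.pth    # pre-trained checkpoint to finetune from
IMAGENETS_DIR=./data/ImageNetS                   # ImageNet-S dataset root
OUTPATH=./outputs/segfinetune_vit_small          # where checkpoints/logs are written
```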
Note: To train with a single GPU, change `--nproc_per_node=8` to `--nproc_per_node=1` and change `--accum_iter 1` to `--accum_iter 8`.
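This trade works because the effective batch size is the product of the number of GPUs, the per-GPU batch size, and the gradient-accumulation steps (the usual convention in MAE-style training scripts, stated here as an assumption). A quick sanity check:

```python
# Effective batch size = num_gpus * per-GPU batch size * gradient-accumulation steps.
def effective_batch_size(num_gpus: int, batch_size: int, accum_iter: int) -> int:
    return num_gpus * batch_size * accum_iter

# 8 GPUs with --accum_iter 1 matches 1 GPU with --accum_iter 8:
assert effective_batch_size(8, 32, 1) == effective_batch_size(1, 32, 8) == 256
```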
- Get the zip file for the test set. You can submit it to our online server.
```shell
python inference.py --model vit_small_patch16 \
    --nb_classes 920 | 301 | 51 \
    --output_dir ${OUTPATH}/predictions \
    --data_path ${IMAGENETS_DIR} \
    --finetune ${OUTPATH}/checkpoint-99.pth \
    --mode validation | test
```
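As a concrete instance of the command above (a sketch: `--nb_classes` takes one of the three listed values, which presumably correspond to the 919/300/50-category ImageNet-S splits plus a background class; the paths are illustrative assumptions):

```shell
python inference.py --model vit_small_patch16 \
    --nb_classes 51 \
    --output_dir ./outputs/predictions \
    --data_path ./data/ImageNetS \
    --finetune ./outputs/checkpoint-99.pth \
    --mode validation
```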
Model Zoo: We provide a model zoo to record the trend of semi-supervised semantic segmentation on the ImageNet-S dataset. For now, this repo supports ViT; more backbones and pretrained models will be added. Please open a pull request if you want to contribute new results.
Supported networks: ViT, ResNet, ConvNext, RF-ConvNext
Supported pretraining methods: MAE, SERE, PASS
```bibtex
@article{gao2021luss,
  title={Large-scale Unsupervised Semantic Segmentation},
  author={Gao, Shanghua and Li, Zhong-Yu and Yang, Ming-Hsuan and Cheng, Ming-Ming and Han, Junwei and Torr, Philip},
  journal={arXiv preprint arXiv:2106.03149},
  year={2021}
}
```
This codebase is built upon the MAE codebase.