Zheng Ding, Jieke Wang, Zhuowen Tu
For COCO and ADE20K data preparation, please refer to Preparing Datasets in Mask2Former.
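If you follow the standard Detectron2 dataset layout (an assumption here, matching Mask2Former's dataset preparation guide), the prepared datasets are expected under the directory pointed to by the DETECTRON2_DATASETS environment variable; adjust the path to your own setup:
export DETECTRON2_DATASETS=/path/to/datasets   # hypothetical path; point it at your dataset root
# expected subdirectories after preparation:
#   $DETECTRON2_DATASETS/coco/                  # COCO images, annotations, and panoptic files
#   $DETECTRON2_DATASETS/ADEChallengeData2016/  # ADE20K images and annotations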
Please run the following commands to set up the environment.
conda create -n maskclip python=3.9
conda activate maskclip
conda install pytorch=1.10 cudatoolkit=11.3 torchvision=0.11 -c pytorch -c conda-forge
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
pip install setuptools==59.5.0
pip install timm opencv-python scipy einops
pip install git+https://github.com/openai/CLIP.git
pip install git+https://github.com/cocodataset/panopticapi.git
cd mask2former/modeling/pixel_decoder/ops/
sh make.sh
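As an optional sanity check (not part of the original setup), you can verify that the main dependencies import cleanly before training:
# optional: confirm PyTorch, Detectron2, CLIP, and timm are importable and CUDA is visible
python -c "import torch, detectron2, clip, timm; print(torch.__version__, detectron2.__version__, torch.cuda.is_available())"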
You can train a class-agnostic mask proposal network by removing the classification head of existing segmentation models (e.g., Mask2Former, Mask R-CNN). We provide our trained class-agnostic mask proposal network here.
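As a rough, hedged sketch of the idea only (this is not necessarily the recipe used for the released checkpoint), one crude way to approximate a class-agnostic proposal network is to keep the Mask2Former architecture but zero out its classification loss weight so that only the mask losses drive training, assuming the config exposes Mask2Former's MODEL.MASK_FORMER.CLASS_WEIGHT option:
# sketch only: learn mask proposals while ignoring the classification loss
python train_net.py --num-gpus 8 --config-file configs/coco/maskformer2_R50_bs16_50ep.yaml MODEL.MASK_FORMER.CLASS_WEIGHT 0.0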
With the trained class-agnostic mask proposal network, you can train the MaskCLIP model with the following command. We train our model for 10,000 iterations with a batch size of 8.
python train_net.py --num-gpus 8 --config-file configs/coco/maskformer2_R50_bs16_50ep.yaml
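The config name suggests a different default schedule, so if you prefer not to edit the yaml you can override the standard Detectron2 solver options on the command line to match the 10,000-iteration, batch-size-8 setting described above (the released config may already set these values):
# override batch size and iteration count via Detectron2-style command-line options
python train_net.py --num-gpus 8 --config-file configs/coco/maskformer2_R50_bs16_50ep.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.MAX_ITER 10000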
You can evaluate our model on the ADE20K dataset using the trained weights. We also provide our trained model here. Set the path of MODEL.WEIGHTS in the yaml file, or append it to the command line:
python train_net.py --num-gpus 1 --config-file configs/ade20k/maskformer2_R50_bs16_160k.yaml --eval-only MODEL.WEIGHTS model_final.pth
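Optionally, you can also pass the standard Detectron2 OUTPUT_DIR option to keep the evaluation logs and metrics in a dedicated folder (the path below is just an example):
# write evaluation logs and metrics to a separate output directory
python train_net.py --num-gpus 1 --config-file configs/ade20k/maskformer2_R50_bs16_160k.yaml --eval-only MODEL.WEIGHTS model_final.pth OUTPUT_DIR output/ade20k_eval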
If you find this work helpful, please consider citing MaskCLIP using the following BibTeX entry.
@inproceedings{ding2023maskclip,
author = {Zheng Ding and Jieke Wang and Zhuowen Tu},
title = {Open-Vocabulary Universal Image Segmentation with MaskCLIP},
booktitle = {International Conference on Machine Learning},
year = {2023},
}
Please also check out MasQCLIP for our latest work on open-vocabulary segmentation.
This codebase was built upon and drew inspiration from CLIP and Mask2Former. We thank the authors for making those repositories public.