Our code is implemented in MXNet. The SEC model used in our framework (Seed, Expand, Constrain: Three Principles for Weakly-Supervised Image Segmentation), which was originally implemented in Caffe (Original Github), is also reimplemented in MXNet.
Please refer to the official website of MXNet (HERE) for installation, and make sure MXNet is compiled with OpenCV support. The other Python dependencies are listed in "dependencies.txt". Please run:
pip install -r dependencies.txt
Three datasets are involved: PASCAL VOC12 (HERE), SBD (HERE) and web images (HERE). Extract them into the folder "dataset", and then run:
python create_dataset.py
- Download Models
Please download the pretrained models (HERE), which include vgg16 and resnet50 pretrained on ImageNet. Extract the archive and put the files into the folder "models".
- Start Training
All parameters are defined in cores/config.py. The most important one is "BASE_NET", which defines the backbone of the model; choose between "vgg16" and "resnet50". Please follow "pipeline.sh" to run the programs. Note that most of the scripts can be executed multiple times to speed things up; refer to "pipeline.sh" for more details.
Download a trained model, ResNet50 (HERE) or VGG16 (HERE), and put it in the folder "snapshots". In cores/config.py, set "BASE_NET" to "vgg16" or "resnet50" to match the downloaded model, and run:
python eval_seg_model.py --model final --gpu 0 --epoch 19
There are other flags:
--savemask
saves the predicted masks; outputs are written to the "outputs" folder.
--crf
applies CRF as postprocessing.
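If evaluation is launched from another Python script, the command line can be assembled programmatically. The following is only a sketch: the helper name `build_eval_command` is hypothetical and not part of this repository, it uses only the flags listed above, and it assumes that `--savemask` and `--crf` can be passed together.

```python
import sys


def build_eval_command(model="final", gpu=0, epoch=19, savemask=False, crf=False):
    """Build the argument list for eval_seg_model.py (hypothetical helper)."""
    cmd = [
        sys.executable, "eval_seg_model.py",
        "--model", str(model),
        "--gpu", str(gpu),
        "--epoch", str(epoch),
    ]
    if savemask:
        cmd.append("--savemask")  # save predicted masks to the "outputs" folder
    if crf:
        cmd.append("--crf")  # apply CRF as postprocessing
    return cmd


# Example (run from the repository root, not executed here):
# import subprocess
# subprocess.run(build_eval_command(savemask=True, crf=True), check=True)
```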
The provided dataset of web images includes 76,683 images retrieved from Bing. You can also use your own images, as long as they follow our naming convention.
The images should be named as:
ID_XXXXX.jpg
"ID" is the class ID in VOC. Since the background is "0", the valid IDs are {1, 2, ..., 20} in the case of PASCAL VOC. "XXXXX" can be anything.
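Before adding your own images, it may help to check the file names against this convention. The snippet below is a minimal sketch, not code from this repository: the function name `parse_class_id` is hypothetical, and it assumes that "XXXXX" is non-empty and that the extension is exactly ".jpg".

```python
import re

# Assumption: "<ID>_<anything non-empty>.jpg", where ID is a decimal integer.
_NAME_RE = re.compile(r"^(\d+)_(.+)\.jpg$")


def parse_class_id(filename, num_classes=20):
    """Return the class ID encoded in the file name, or None if it is invalid."""
    match = _NAME_RE.match(filename)
    if match is None:
        return None
    class_id = int(match.group(1))
    # ID 0 is the background, so valid IDs are 1..num_classes.
    if 1 <= class_id <= num_classes:
        return class_id
    return None
```

File names for which the function returns None (for example, a background ID of 0 or a missing underscore) would need to be renamed before use.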