This repository contains the code for a deep auto-encoder-decoder network for few-shot semantic segmentation, with state-of-the-art results on the FSS-1000 dataset and Pascal 5i. The method embeds information from different frequency bands in the CNN representation to counter texture bias, and applies bidirectional convolutional LSTM layers to fuse the k support shots in a non-linear, parametric way. If this code helps with your research, please consider citing the following paper:
R. Azad, Abdur R. Fayjie, Claude Kauffman, Ismail Ben Ayed, Marco Pedersoli and Jose Dolz, "On the Texture Bias for Few-Shot CNN Segmentation", arXiv preprint, 2020, download link.
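The frequency embedding mentioned above is labelled DoG (difference of Gaussians) in the result tables. As a rough illustration of the idea only, the sketch below computes a difference of Gaussians on a plain 2-D array with NumPy. The function names, the 3-sigma kernel radius and the edge padding are assumptions made for this example; this is not the layer used inside the network.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1-D Gaussian kernel normalised to sum to 1 (radius of about 3 sigma)."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image, sigma):
    """Separable Gaussian blur with edge padding; output keeps the input shape."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(np.asarray(image, dtype=np.float64), radius, mode="edge")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="same")
    both = np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")
    # Crop the padding so the result matches the input size.
    return both[radius:-radius, radius:-radius]


def difference_of_gaussians(image, sigma_small, sigma_large):
    """Band-pass response: narrow blur minus wide blur."""
    return gaussian_blur(image, sigma_small) - gaussian_blur(image, sigma_large)
```

A flat region gives a response close to zero, while edges and small structures give a non-zero response, which is why a filter of this kind emphasises shape-related frequency bands over uniform texture.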
- March 7, 2020: Implementation code is available now.
The code is implemented in Python using the Keras library with a TensorFlow backend and was tested on Ubuntu, though it should work in similar environments. The following environment and libraries are needed to run the code:
- Python 3
- Keras version 2.2.0
- tensorflow backend version 1.13.1
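
Since the code was tested against specific library versions, it can help to compare the installed versions with the ones listed above before training. The following is a minimal sketch; the helper names are made up for this example, and it assumes the libraries are importable as `keras` and `tensorflow` and expose `__version__`.

```python
import importlib

# Versions listed in this README, keyed by import name (assumed names).
EXPECTED = {"keras": "2.2.0", "tensorflow": "1.13.1"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_environment(expected=EXPECTED, lookup=installed_version):
    """Map each module name to a tuple (expected, found, matches)."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        report[name] = (wanted, found, found == wanted)
    return report
```

Calling `check_environment()` returns one entry per library; an entry whose last field is `False` points to a version that differs from the tested setup, which may or may not cause problems.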
The implementation is available in the Source Code folder.
1- Download the FSS1000 dataset from this link and extract the dataset.
2- Run Train_DOGLSTM.py
to train the Scale Space Encoder model with k-shot episodic training. The model is trained for 50 epochs, with 1000 episodes per epoch. The script saves the validation performance history and the weights that perform best on the validation set, and reports the mIoU on the test set. By default the model uses a VGG backbone that combines blocks 3, 4 and 5; other combinations can be selected when creating the model. Other backbones, such as ResNet or Inception, can also be used.
3- Run Train_weak.py
to train the Scale Space Encoder model with k-shot episodic training and evaluate it on the weakly annotated test set. At test time, this script uses weakly annotated bounding boxes as the labels for the support set.
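
The episodic schedule described in the steps above (50 epochs of 1000 episodes each) can be sketched as follows. This is only an illustration of 1-way k-shot episode sampling; `sample_episode`, `iterate_episodes` and the `samples_by_class` layout are hypothetical names and do not reflect how the training scripts are actually organised.

```python
import random


def sample_episode(samples_by_class, k_shot, rng):
    """Draw one 1-way k-shot episode: k support samples and one query sample.

    `samples_by_class` maps a class name to a list of sample identifiers
    (hypothetical layout, for illustration only).
    """
    eligible = [c for c, items in samples_by_class.items() if len(items) > k_shot]
    if not eligible:
        raise ValueError("no class has more than k_shot samples")
    chosen_class = rng.choice(sorted(eligible))
    # Sample without replacement so the query never appears in the support set.
    picked = rng.sample(samples_by_class[chosen_class], k_shot + 1)
    return chosen_class, picked[:k_shot], picked[k_shot]


def iterate_episodes(samples_by_class, k_shot, epochs=50, episodes_per_epoch=1000, seed=0):
    """Yield (epoch, episode_index, class, support, query) for every episode."""
    rng = random.Random(seed)
    for epoch in range(epochs):
        for episode in range(episodes_per_epoch):
            cls, support, query = sample_episode(samples_by_class, k_shot, rng)
            yield epoch, episode, cls, support, query
```

In a real training loop each yielded episode would be loaded as images and masks and passed to the model; that part is omitted here.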
Notice: parser_utils.py
can be used to set the hyperparameters and to define the dataset path, k-shot and n-way.
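For readers unfamiliar with this kind of settings module, the sketch below shows how such options are commonly exposed with `argparse`. Every flag name and default here is hypothetical and chosen for illustration; check parser_utils.py for the options the repository actually defines.

```python
import argparse


def build_parser():
    """Build an illustrative parser; flag names and defaults are hypothetical."""
    parser = argparse.ArgumentParser(description="Few-shot segmentation settings (sketch)")
    parser.add_argument("--data-path", default="./fewshot_data",
                        help="folder holding the extracted dataset (assumed name)")
    parser.add_argument("--k-shot", type=int, default=5,
                        help="number of support samples per episode")
    parser.add_argument("--n-way", type=int, default=1,
                        help="number of classes per episode")
    parser.add_argument("--epochs", type=int, default=50,
                        help="number of training epochs")
    parser.add_argument("--episodes", type=int, default=1000,
                        help="episodes per epoch")
    return parser
```

With this sketch, `build_parser().parse_args(["--k-shot", "1"])` yields a namespace whose `k_shot` is 1 and whose other fields keep their defaults.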
Visual representation of 21 classes from the 1000-class dataset with their masks and generated bounding boxes. Download link
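A bounding-box label of the kind shown here can be derived from a segmentation mask by filling the tight box around the foreground pixels. The snippet below is a minimal NumPy sketch of that idea; `mask_to_bbox_mask` is a hypothetical helper, and the repository may generate its boxes differently.

```python
import numpy as np


def mask_to_bbox_mask(mask):
    """Return a binary mask that fills the tight bounding box of `mask`."""
    mask = np.asarray(mask) > 0
    box = np.zeros(mask.shape, dtype=np.uint8)
    if not mask.any():
        # No foreground pixels: return an empty box mask.
        return box
    rows = np.flatnonzero(mask.any(axis=1))  # rows containing foreground
    cols = np.flatnonzero(mask.any(axis=0))  # columns containing foreground
    box[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1] = 1
    return box
```

The resulting box mask always covers the original mask, so it is a weaker (coarser) label than the pixel-level annotation.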
To evaluate the proposed method, two challenging few-shot semantic segmentation datasets were considered. The results of the proposed approach are given below.
To compare the proposed method with state-of-the-art approaches to few-shot semantic segmentation, we report results using the mean Intersection over Union (mIoU) metric in both the 1-shot and 5-shot settings.
Methods | Year | mIoU |
---|---|---|
Shaban et al. OSLSM | 2017 | 70.29% |
Rakelly et al. co-FCN | 2018 | 71.94% |
Wei et al. FSS-1000 | 2019 | 73.47% |
Hendryx et al. FOMAML | 2020 | 75.19% |
Azad et al. Proposed Method (Baseline) | 2020 | 74.19% |
Azad et al. Proposed Method (Baseline + DoG) | 2020 | 78.71% |
Azad et al. Proposed Method (Baseline + DoG + BConvLSTM) | 2020 | 80.83% |
Methods | Year | mIoU |
---|---|---|
Shaban et al. OSLSM | 2017 | 73.02% |
Rakelly et al. co-FCN | 2018 | 74.27% |
Wei et al. FSS-1000 | 2019 | 80.12% |
Hendryx et al. FOMAML + regularization | 2020 | 80.60% |
Hendryx et al. FOMAML + regularization | 2020 | 82.19% |
Azad et al. Proposed Method (Baseline + DoG + BConvLSTM) non-parametric fusion | 2020 | 81.65% |
Azad et al. Proposed Method (Baseline + DoG + BConvLSTM) parametric fusion | 2020 | 83.36% |
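
For reference, an mIoU score of the kind reported in the tables above can be computed along the following lines. This sketch averages the foreground IoU over prediction and ground-truth pairs, which is one common convention; the exact averaging used for the reported numbers is not specified here, and treating two empty masks as a perfect match is an assumption.

```python
import numpy as np


def iou(pred, target):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred) > 0
    target = np.asarray(target) > 0
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty: counted as a perfect match (a convention).
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def mean_iou(preds, targets):
    """Average the per-pair IoU over all (prediction, ground truth) pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

Multiplying the returned value by 100 gives a percentage comparable in form to the table entries.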
Sample 1-shot segmentation results on the FSS-1000 dataset
To compare the proposed method with state-of-the-art approaches to few-shot semantic segmentation, we report results using the mean Intersection over Union (mIoU) metric in both the 1-shot and 5-shot settings.
Table 1: Results of 1-way 1-shot & 5-shot segmentation on the Pascal 5i data set employing the mIoU metric.
Methods | 1-shot Fold 1 | 1-shot Fold 2 | 1-shot Fold 3 | 1-shot Fold 4 | 1-shot Mean | 5-shot Fold 1 | 5-shot Fold 2 | 5-shot Fold 3 | 5-shot Fold 4 | 5-shot Mean | 1-shot to 5-shot Improvement |
---|---|---|---|---|---|---|---|---|---|---|---|
Wei et al. FSS-1000 | - | - | - | - | - | 37.4 | 60.9 | 46.6 | 42.2 | 56.8 | - |
Shaban et al. OSLSM | 33.6 | 55.3 | 40.9 | 33.5 | 40.8 | 35.9 | 58.1 | 42.7 | 39.1 | 43.9 | 3.1 |
Rakelly et al. co-FCN | 36.7 | 50.6 | 44.9 | 32.4 | 41.1 | 37.5 | 50.0 | 44.1 | 33.9 | 41.4 | 0.3 |
Rakelly et al. SG-One | 40.2 | 58.4 | 48.4 | 38.4 | 46.3 | 41.9 | 58.6 | 48.6 | 39.4 | 47.1 | 0.8 |
Rakelly et al. AMP | 41.9 | 50.2 | 46.7 | 34.7 | 43.4 | 41.8 | 55.5 | 50.3 | 39.9 | 46.9 | 3.5 |
Rakelly et al. PANet | 42.3 | 58.0 | 51.1 | 41.2 | 48.1 | 51.8 | 64.6 | 59.8 | 46.5 | 55.7 | 7.6 |
Rakelly et al. Feat Weight | 51.3 | 64.5 | 56.7 | 52.2 | 56.2 | 54.9 | 67.4 | 62.2 | 55.3 | 59.9 | 3.7 |
Rakelly et al. Meta-Seg | 42.2 | 59.6 | 48.1 | 44.4 | 48.6 | 43.1 | 62.5 | 49.9 | 45.3 | 50.2 | 1.6 |
Rakelly et al. MDL | 39.7 | 58.3 | 46.7 | 36.3 | 45.3 | 40.6 | 58.5 | 47.7 | 36.6 | 45.9 | 0.6 |
Rakelly et al. OSAdv | 46.9 | 59.2 | 49.3 | 43.4 | 49.7 | 47.2 | 58.8 | 48.8 | 47.4 | 50.6 | 0.9 |
Rakelly et al. AMCG | - | - | - | - | 61.2 | - | - | - | - | 62.2 | 1.0 |
Rakelly et al. CANet | 52.5 | 65.9 | 51.3 | 51.9 | 55.4 | 55.5 | 67.8 | 51.9 | 53.2 | 57.1 | 1.7 |
Rakelly et al. LTM | 52.8 | 69.6 | 53.2 | 52.3 | 57.0 | 57.9 | 69.9 | 56.9 | 57.5 | 60.6 | 3.6 |
Rakelly et al. PGNet | 56.0 | 66.9 | 50.6 | 50.4 | 56.0 | 57.7 | 68.7 | 52.9 | 54.6 | 58.5 | 2.5 |
Azad et al. Proposed Method | 56.2 | 66.0 | 56.1 | 53.8 | 58.0 | 57.5 | 70.6 | 56.6 | 57.7 | 60.6 | 2.6 |
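
As a quick check of how the Mean and improvement columns relate to the fold scores, the snippet below recomputes them for the proposed method's row. It assumes the mean is the plain average of the four folds rounded to one decimal, and that the improvement is the 5-shot mean minus the 1-shot mean; `mean_and_improvement` is a helper written for this example.

```python
def mean_and_improvement(one_shot_folds, five_shot_folds):
    """Return (1-shot mean, 5-shot mean, improvement), each rounded to 0.1."""
    one_mean = round(sum(one_shot_folds) / len(one_shot_folds), 1)
    five_mean = round(sum(five_shot_folds) / len(five_shot_folds), 1)
    return one_mean, five_mean, round(five_mean - one_mean, 1)


# Fold scores of the proposed method, copied from the table above.
proposed = mean_and_improvement([56.2, 66.0, 56.1, 53.8], [57.5, 70.6, 56.6, 57.7])
print(proposed)
```

For this row the computed values agree with the table: 58.0 for the 1-shot mean, 60.6 for the 5-shot mean, and an improvement of 2.6.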
Sample 1-shot segmentation results on the Pascal 5i dataset
Visual representation of the proposed method performing segmentation on the 1000-class dataset with weak annotation (bounding box)
All implementation was done by Reza Azad. For any queries, please contact us.
rezazad68@gmail.com