Semantic Segmentation Algorithms

This repository contains a suite of semantic segmentation algorithms implemented from scratch with JAX and Flax. The main aim is to implement each algorithm as simply as possible while keeping training stable and training time short.

Implementation Details

The segmentation models are trained on RGB images from the Scene Parse 150 dataset [1], which contains 150 classes. Each model outputs a four-dimensional tensor of shape (B, H, W, C), where B is the batch size, H is the image height, W is the image width and C is the number of classes. Some changes were made to the original models to speed up training and make them more robust, such as adding Group Norm and Dropout layers. The Dice loss is used as the loss function for training and evaluation. The models are saved using the Orbax checkpointer and will be provided on Hugging Face once training has completed.
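As an illustration of how a Dice loss can be computed on outputs of shape (B, H, W, C), here is a minimal sketch in JAX. It is not taken from this repository's loss_functions.py: the function name `dice_loss`, the `eps` smoothing term and the choice to average the per-class scores are all assumptions made for the example.

```python
import jax
import jax.numpy as jnp


def dice_loss(logits, labels, num_classes, eps=1e-6):
    """Soft Dice loss averaged over classes (illustrative, hypothetical helper).

    logits: (B, H, W, C) raw model outputs.
    labels: (B, H, W) integer class ids in [0, num_classes).
    """
    probs = jax.nn.softmax(logits, axis=-1)           # (B, H, W, C)
    one_hot = jax.nn.one_hot(labels, num_classes)     # (B, H, W, C)
    # Sum over the batch and spatial axes, keeping one value per class.
    intersection = jnp.sum(probs * one_hot, axis=(0, 1, 2))
    cardinality = jnp.sum(probs + one_hot, axis=(0, 1, 2))
    dice = (2.0 * intersection + eps) / (cardinality + eps)
    return 1.0 - jnp.mean(dice)


# Random logits for a batch of two 8x8 images with 150 classes.
key = jax.random.PRNGKey(0)
logits = jax.random.normal(key, (2, 8, 8, 150))
labels = jnp.zeros((2, 8, 8), dtype=jnp.int32)
loss = dice_loss(logits, labels, num_classes=150)

# Logits that put almost all probability on the correct class.
perfect_logits = 100.0 * jax.nn.one_hot(labels, 150)
perfect_loss = dice_loss(perfect_logits, labels, num_classes=150)
```

The loss lies between 0 and 1 and approaches 0 as the predicted probabilities match the labels. The smoothing term keeps the ratio defined for classes that do not appear in the batch.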

Algorithms

U-Net

U-Net builds on the ideas of the Fully Convolutional Network (FCN) and improves upon them. The main idea is an encoder-decoder architecture with skip connections from the encoder layers to the corresponding decoder layers. The skip connections give the final segmentation layers both global and local information, which improves classification and localization in the predicted segmentation. U-Net has a symmetric architecture, giving it the U shape it was named after. It is simpler to implement than FCN and is also very fast, which has made U-Net one of the most popular segmentation models today.
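The skip connection can be sketched with plain JAX array operations. This example is a simplification under stated assumptions and is not the code in unet_model.py: average pooling and nearest-neighbour resizing stand in for the learned convolution blocks, and the helper names `downsample`, `upsample` and `skip_connect` are hypothetical.

```python
import jax
import jax.numpy as jnp


def downsample(x):
    """2x2 average pooling on a (B, H, W, C) tensor; H and W must be even."""
    b, h, w, c = x.shape
    return x.reshape(b, h // 2, 2, w // 2, 2, c).mean(axis=(2, 4))


def upsample(x):
    """Double the spatial resolution with nearest-neighbour interpolation."""
    b, h, w, c = x.shape
    return jax.image.resize(x, (b, h * 2, w * 2, c), method="nearest")


def skip_connect(encoder_features, decoder_features):
    """Upsample the decoder features and concatenate them with the
    encoder features of matching resolution along the channel axis."""
    upsampled = upsample(decoder_features)
    return jnp.concatenate([encoder_features, upsampled], axis=-1)


enc = jnp.ones((1, 64, 64, 16))          # high-resolution encoder features
bottleneck = downsample(enc)             # (1, 32, 32, 16)
merged = skip_connect(enc, bottleneck)   # (1, 64, 64, 32)
```

Concatenating along the channel axis lets the following decoder layers see the fine spatial detail from the encoder next to the coarser features coming up from the bottleneck.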

PSPNet

PSPNet was designed to address the lack of global scene understanding in FCN. It uses a pyramid pooling module (PPM) combined with a pretrained ResNet backbone to extract global context information. The PPM pools the feature map extracted by the backbone into feature maps at different scales. The scaled feature maps are then fused, upsampled, processed by a final convolutional module, and upsampled again to produce the segmentation mask. The lowest-resolution PPM feature maps contain the coarsest information, which is ideal for understanding global context, while the highest-resolution feature maps contain local information, which is ideal for localizing objects.
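The pooling and fusing steps of a pyramid pooling module can be sketched in JAX as follows. This is a simplified illustration, not the implementation in pspnet_model.py: the bin sizes (1, 2, 3, 6) follow the original PSPNet paper, the per-branch 1x1 convolutions are omitted, each pooled map is resized back to the input resolution before concatenation, and the helper names `adaptive_avg_pool` and `pyramid_pooling` are hypothetical. The pooling helper assumes H and W are divisible by every bin size.

```python
import jax
import jax.numpy as jnp


def adaptive_avg_pool(x, bins):
    """Average-pool a (B, H, W, C) tensor down to (B, bins, bins, C).

    Assumes H and W are divisible by `bins`.
    """
    b, h, w, c = x.shape
    return x.reshape(b, bins, h // bins, bins, w // bins, c).mean(axis=(2, 4))


def pyramid_pooling(features, bin_sizes=(1, 2, 3, 6)):
    """Pool the features at several scales, resize each pooled map back to
    the input resolution, and concatenate everything along the channels."""
    b, h, w, c = features.shape
    branches = [features]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(features, bins)   # (B, bins, bins, C)
        resized = jax.image.resize(pooled, (b, h, w, c), method="bilinear")
        branches.append(resized)
    return jnp.concatenate(branches, axis=-1)


features = jnp.ones((1, 12, 12, 8))   # stand-in for a backbone feature map
fused = pyramid_pooling(features)     # (1, 12, 12, 8 * 5)
```

The 1x1 bin summarises the whole image in a single vector per channel, while the 6x6 bin keeps more spatial layout, so the fused tensor carries context at several scales for every pixel.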

Installation Requirements

If you have a GPU, you can install JAX with CUDA support by running the following first:

pip install --upgrade "jax[cuda]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

All the requirements are provided below:

pip install datasets
pip install flax
pip install augmax
pip install -qq nest_asyncio
pip install matplotlib
pip install pandas
pip install jupyter
pip install scikit-learn
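After installation, a quick check like the one below can confirm which devices JAX detects. It is a small sanity check rather than part of the repository, and it assumes only that JAX is importable.

```python
import jax

# List the devices JAX can use; a CPU device is always present.
devices = jax.devices()
backend = jax.default_backend()

print(devices)
print(backend)
```

If the CUDA install worked, the device list should contain GPU devices; otherwise JAX falls back to the CPU.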

