huss-mo/semantic_segmentation_transfer_learning


This is an implementation of transfer learning for semantic segmentation.

The model implemented here is based on the one developed in the paper "The devil is in the labels: Semantic segmentation from sentences".
The authors claim that their method, while operating in a zero-shot setting, achieves results comparable to those of supervised learning; this is made possible by replacing the class labels with embeddings generated from sentence descriptions. Furthermore, fine-tuning the model on a given semantic segmentation dataset should further improve its performance.
The goal is to adapt the original model to a new dataset, in this case the CMP Facade Database.

Development Steps:

  • Copy the required functions from the original paper's code into the Google Colab notebook (updating them as needed)
  • Create an iterator for the new dataset to suit the implementation (a sketch of such a dataset class appears after this list)
  • Freeze the encoder while still updating gradients for the segmentation head (see the freezing sketch below)
  • Give meaningful explanations to the CMP Facade Database labels and use them to build embeddings. The explanations are taken from Merriam-Webster, while the embeddings are built with the CLIP-ViT model, the same model used in the paper (see the embedding sketch below)
  • Integrate Weights & Biases to track experiment results and show the relationship between hyper-parameters and model performance (see the logging sketch below)
  • Fine-tune the model on the CMP Facade Database
  • Evaluate the model's performance using the Mean IoU metric (a small implementation of the metric closes the sketches below)
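
The dataset iterator could look roughly like the sketch below. It assumes each CMP image is a .jpg whose annotation is a .png with the same stem, storing 1-based class indices as pixel values; the file layout, mask convention, and normalization are assumptions, not this repository's exact code.

```python
# Hedged sketch of a CMP Facade dataset iterator (PyTorch); file naming and
# the 1-based mask convention are assumptions about the dataset layout.
import glob
import os

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset


class CMPFacadeDataset(Dataset):
    def __init__(self, root, transform=None):
        # Every .jpg under root is an image; its annotation shares the stem.
        self.image_paths = sorted(glob.glob(os.path.join(root, "*.jpg")))
        self.transform = transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image_path = self.image_paths[idx]
        mask_path = os.path.splitext(image_path)[0] + ".png"

        image = np.array(Image.open(image_path).convert("RGB"))
        # The annotation PNG stores 1-based class indices; shift to 0-based.
        mask = np.array(Image.open(mask_path), dtype=np.int64) - 1

        if self.transform is not None:
            image, mask = self.transform(image, mask)

        image = torch.from_numpy(image).permute(2, 0, 1).float() / 255.0
        return image, torch.from_numpy(mask)
```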
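
Freezing the encoder amounts to disabling gradients on its parameters and building the optimizer over what remains. The `encoder` attribute name and the AdamW settings below are assumptions about the model class, not the paper's exact code.

```python
# Hedged sketch: freeze the encoder, train only the segmentation head.
# The `encoder` attribute name and optimizer settings are assumptions.
import torch


def freeze_encoder_build_optimizer(model, lr=1e-4):
    for param in model.encoder.parameters():
        param.requires_grad = False  # no gradients flow into the encoder
    # Only parameters that still require gradients (the head) are optimized.
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.AdamW(trainable, lr=lr)
```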
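
Building the class embeddings might look like the following, using the CLIP text encoder from Hugging Face transformers. The definitions here are placeholder dictionary-style sentences rather than the exact Merriam-Webster entries, and the checkpoint name is one common CLIP-ViT variant, not necessarily the one this repository uses.

```python
# Hedged sketch: encode one descriptive sentence per CMP Facade label with
# CLIP's text encoder. Definitions and checkpoint name are placeholders.
import torch
from transformers import CLIPModel, CLIPTokenizer

definitions = {
    "window": "An opening in the wall of a building that admits light and air.",
    "door": "A movable barrier by which an entry into a building is closed and opened.",
    # ... one sentence per remaining CMP Facade label ...
}

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

with torch.no_grad():
    inputs = tokenizer(list(definitions.values()), padding=True, return_tensors="pt")
    embeddings = model.get_text_features(**inputs)  # shape: (num_classes, dim)
    embeddings = embeddings / embeddings.norm(dim=-1, keepdim=True)  # unit norm
```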
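
The Weights & Biases integration reduces to initializing a run with the hyper-parameters and logging metrics each epoch. The project name, config keys, and the stub training/evaluation functions below are placeholders, not the notebook's actual values.

```python
# Hedged sketch of experiment tracking with Weights & Biases; the stubs
# stand in for the real training and evaluation steps.
import random

import wandb


def train_one_epoch():
    return random.random()  # placeholder for the real training loss


def evaluate():
    return random.random()  # placeholder for the real Mean IoU


wandb.init(
    project="ssiw-cmp-facade",  # project name is an assumption
    config={"learning_rate": 1e-4, "epochs": 20, "batch_size": 4},
)

for epoch in range(wandb.config.epochs):
    wandb.log({
        "epoch": epoch,
        "train_loss": train_one_epoch(),
        "mean_iou": evaluate(),
    })

wandb.finish()
```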
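
Finally, Mean IoU averages per-class intersection-over-union, skipping classes absent from both the prediction and the ground truth. A small, self-contained version (not necessarily the exact one in the notebook):

```python
# Mean IoU over integer label maps; num_classes is 12 for CMP Facade.
import numpy as np


def mean_iou(pred, target, num_classes=12):
    ious = []
    for cls in range(num_classes):
        pred_mask = pred == cls
        target_mask = target == cls
        union = np.logical_or(pred_mask, target_mask).sum()
        if union == 0:
            continue  # class absent from both maps; do not penalize
        intersection = np.logical_and(pred_mask, target_mask).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))
```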

How To Use: Copy the Google Colab notebook in ssiw_transfer_learning_cmp_implementation to your Google Drive, open it, adjust the required training parameters, and run the script.
A fine-tuned model is already available in the models directory.
Predictions from different trained models are available in the predictions directory.

Notes:

  • The CMP Facade Database consists of two datasets, base and extended. The base dataset is used for training, while the extended dataset is used for testing
  • GPU memory on Google Colab can occasionally fill up, which is why garbage collection is invoked at several points in the code (see the sketch after this list)
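
The cleanup the second note refers to is along these lines (a generic sketch, not the notebook's exact code):

```python
# Release memory between heavy steps: drop references, run Python's garbage
# collector, and return cached CUDA blocks to the driver.
import gc

import torch

big = torch.zeros(1024, 1024)  # stand-in for a large intermediate tensor
del big                        # drop the Python reference
gc.collect()                   # reclaim unreachable objects
if torch.cuda.is_available():
    torch.cuda.empty_cache()   # free cached GPU memory
```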

Links:

  • Paper: "The devil is in the labels: Semantic segmentation from sentences" (https://arxiv.org/abs/2202.02002)
  • Dataset: CMP Facade Database (https://cmp.felk.cvut.cz/~tylecr1/facade/)
