
MMC: Multiscale Memory

This repository is the official implementation of Multiscale Memory Comparator Transformer for Few-Shot Video Segmentation.

Environment Setup

Required Python and library versions:

  • Python 3.7
  • Torch 1.9.0
  • Torchvision 0.10.0
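
The versions above can be pinned in a requirements file (a minimal sketch; the repository may need additional dependencies beyond these two):

```
torch==1.9.0
torchvision==0.10.0
```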

Dataset

We evaluate on three publicly available datasets. Download each dataset by following its corresponding paper/project page. For MoCA we used the 88 videos, the preprocessing, and the evaluation scheme from the MotionGrouping paper.

Pre-trained Models

Download the pretrained models here. The checkpoint includes weights for the baseline multiscale query model and for our multiscale memory models, with VideoSwin and R101 backbones.

Names of models inside the checkpoint for VideoSwin:

  • Baseline - Multiscale Query: ms_qry_5frames
  • Multiscale Memory Bidirectional: ms_qry_memory_5frames
  • Multiscale Memory Stacked: swin-msqry-mem-nobidir
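
The mapping between model variants and checkpoint entry names can be kept in a small lookup table. This is a hypothetical sketch: only the entry-name strings come from the list above, and how the repository actually indexes the checkpoint file is an assumption.

```python
# Checkpoint entry names for the VideoSwin backbone, as listed above.
# The dictionary keys are illustrative variant labels, not repository identifiers.
VIDEOSWIN_CHECKPOINT_KEYS = {
    "baseline_multiscale_query": "ms_qry_5frames",
    "multiscale_memory_bidirectional": "ms_qry_memory_5frames",
    "multiscale_memory_stacked": "swin-msqry-mem-nobidir",
}

def checkpoint_key(variant):
    """Return the checkpoint entry name for a given model variant."""
    return VIDEOSWIN_CHECKPOINT_KEYS[variant]
```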

Evaluation

Use the "_adaptive" suffix only if the model does not fit in GPU memory; otherwise remove it so the decoder type is simply "multiscale_query_memory_nobidir". Results in the arXiv paper are reported with val_size 440 so that both the baseline and our method fit in GPU memory. With the adaptive technique added afterwards, we can evaluate at val_size 473, matching SOA methods.

  • YouTube-Objects

```shell
python inference.py --model_path CKPT_PATH --dataset ytbo --val_size 473 --output_dir OUT_DIR --decoder_type multiscale_query_memory_nobidir_adaptive
```

  • DAVIS16

```shell
python inference.py --model_path CKPT_PATH --dataset davis --val_size 473 --output_dir OUT_DIR --decoder_type multiscale_query_memory_nobidir_adaptive --aug
```

  • MoCA

```shell
python inference.py --model_path CKPT_PATH --dataset moca --val_size 473 --output_dir OUT_DIR --decoder_type multiscale_query_memory_nobidir_adaptive
```

Results

Our model achieves the following performance, in terms of mIoU, compared to the multiscale query transformer decoder baseline and other state-of-the-art (SOA) methods:

| Inference method | DAVIS'16 | MoCA | YouTube-Objects |
| --- | --- | --- | --- |
| AGS | 79.7 | - | 69.7 |
| COSNet | 80.5 | 50.7 | 70.5 |
| AGNN | 80.7 | - | 70.8 |
| MATNet | 82.4 | 64.2 | 69.0 |
| RTNet | 85.6 | 60.7 | 71.0 |
| Multiscale Query | 83.7 | 77.4 | 76.8 |
| Multiscale Memory (Ours) | 86.1 | 80.3 | 78.2 |
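
For reference, a minimal sketch of the mIoU metric reported in the table, computed over binary masks represented as nested lists of 0/1 values. This is illustrative only; the actual evaluation pipeline operates on full-resolution tensors.

```python
def iou(pred, gt):
    """Intersection-over-union between two binary masks of identical shape."""
    inter = 0
    union = 0
    for row_p, row_g in zip(pred, gt):
        for p, g in zip(row_p, row_g):
            inter += p & g  # pixel counted if foreground in both masks
            union += p | g  # pixel counted if foreground in either mask
    # Convention: two empty masks agree perfectly.
    return inter / union if union else 1.0

def mean_iou(preds, gts):
    """Average IoU over a sequence of (prediction, ground-truth) mask pairs."""
    scores = [iou(p, g) for p, g in zip(preds, gts)]
    return sum(scores) / len(scores)
```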

Acknowledgement

We would like to thank the authors of MED-VT for their open-source code, which was used to build the baseline for our project.
