Video Prediction Visualization

We provide visualizations of benchmark results for spatiotemporal predictive learning (STL) methods on popular video prediction datasets. More STL methods will be supported in the future. Issues and PRs are welcome! GIF visualizations are also released.

Table of Contents

Currently supported spatiotemporal prediction methods
Currently supported MetaFormer models for SimVP

We provide visualization figures of various video prediction methods on the benchmarks below. You can plot your own visualizations from tested results (e.g., work_dirs/exp_name/saved) with vis_video.py. Note that --vis_dirs visualizes all experiment folders under the given path, and --vis_channel selects the channel to visualize. For example, run the plotting script as follows:

python tools/visualizations/vis_video.py -d mmnist -w work_dirs/exp_name --index 0 --save_dirs fig_mmnist_vis
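
To make the roles of the sample index and the channel selection more concrete, here is a minimal NumPy sketch of how a ground-truth/prediction comparison image could be assembled from saved arrays. It is not the code of vis_video.py. The function names `make_frame_strip` and `compare_strips` are hypothetical, and the `(N, T, C, H, W)` array layout is an assumption about how the tested results are stored.

```python
import numpy as np


def make_frame_strip(frames, index=0, channel=None):
    """Tile the frames of one sample side by side.

    frames: array with the ASSUMED layout (N, T, C, H, W).
    index: which sample to pick (analogous to a sample index option).
    channel: optional single channel to keep (analogous to a channel option).
    Returns an array of shape (H, T * W, C').
    """
    sample = np.asarray(frames)[index]            # (T, C, H, W)
    if channel is not None:
        sample = sample[:, channel:channel + 1]   # keep one channel: (T, 1, H, W)
    t, c, h, w = sample.shape
    # (T, C, H, W) -> (H, T, W, C) -> (H, T * W, C): frame k fills columns k*W..(k+1)*W
    return sample.transpose(2, 0, 3, 1).reshape(h, t * w, c)


def compare_strips(trues, preds, index=0, channel=None):
    """Stack the ground-truth strip above the predicted strip."""
    top = make_frame_strip(trues, index, channel)
    bottom = make_frame_strip(preds, index, channel)
    return np.concatenate([top, bottom], axis=0)


if __name__ == "__main__":
    # Synthetic stand-ins; real arrays would be loaded from the saved results,
    # e.g. with np.load (file names are not specified here).
    rng = np.random.default_rng(0)
    trues = rng.random((2, 10, 3, 32, 32))
    preds = rng.random((2, 10, 3, 32, 32))
    grid = compare_strips(trues, preds, index=0, channel=1)
    print(grid.shape)
```

The resulting array can be passed to any image writer or plotting library. Keeping the tiling separate from the plotting makes the layout easy to check on synthetic data.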

Visualization of Moving MNIST Benchmarks

We provide benchmark results on the popular Moving MNIST dataset using the $10\rightarrow 10$ frame prediction setting in configs/mmnist.

ConvLSTM DMVFN
E3D-LSTM MAU
MIM PhyDNet
PredRNN PredRNN++
PredRNN-V2 SimVP-V1
SimVP-V2 TAU
SimVP-ConvMixer SimVP-ConvNeXt
SimVP-HorNet SimVP-MLPMixer
SimVP-MogaNet SimVP-Poolformer
SimVP-Swin SimVP-Uniformer
SimVP-VAN SimVP-ViT
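
The $10\rightarrow 10$ notation above means that a model observes 10 frames and predicts the following 10. As a rough illustration, and assuming each clip is stored as an array with the time axis first, the split could look like the sketch below. The helper `split_sequence` is hypothetical and is not part of the repository.

```python
import numpy as np


def split_sequence(clip, n_in=10, n_out=10):
    """Split a clip into observed frames and frames to predict.

    clip: array with time as the first axis, e.g. (T, C, H, W) (assumed layout).
    Returns (inputs, targets) with n_in and n_out frames respectively.
    """
    clip = np.asarray(clip)
    if clip.shape[0] < n_in + n_out:
        raise ValueError(
            f"clip has {clip.shape[0]} frames, need at least {n_in + n_out}"
        )
    inputs = clip[:n_in]                  # frames the model observes
    targets = clip[n_in:n_in + n_out]     # frames the model must predict
    return inputs, targets


if __name__ == "__main__":
    clip = np.zeros((20, 1, 64, 64), dtype=np.float32)
    x, y = split_sequence(clip, n_in=10, n_out=10)
    print(x.shape, y.shape)
```

The same helper covers the other settings in this document by changing `n_in` and `n_out`, for example `n_in=10, n_out=1` or `n_in=4, n_out=4`.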

Visualization of Moving FMNIST Benchmarks

Similar to Moving MNIST, we also provide benchmark results on Moving FMNIST (MFMNIST), a more advanced variant of Moving MNIST, using the $10\rightarrow 10$ frame prediction setting in configs/mfmnist.

ConvLSTM
E3D-LSTM MAU
MIM PhyDNet
PredRNN PredRNN++
PredRNN-V2 SimVP-V1
SimVP-V2 TAU
SimVP-ConvMixer SimVP-ConvNeXt
SimVP-HorNet SimVP-MLPMixer
SimVP-MogaNet SimVP-Poolformer
SimVP-Swin SimVP-Uniformer
SimVP-VAN SimVP-ViT

Visualization of Moving MNIST-CIFAR Benchmarks

Similar to Moving MNIST, we further design the MMNIST-CIFAR benchmark, a more advanced variant of Moving MNIST with complex backgrounds from CIFAR-10, using the $10\rightarrow 10$ frame prediction setting in configs/mmnist_cifar.

ConvLSTM
E3D-LSTM MAU
MIM PhyDNet
PredRNN PredRNN++
PredRNN-V2 SimVP-V1
SimVP-V2 TAU
SimVP-ConvMixer SimVP-ConvNeXt
SimVP-HorNet SimVP-MLPMixer
SimVP-MogaNet SimVP-Poolformer
SimVP-Swin SimVP-Uniformer
SimVP-VAN SimVP-ViT

Visualization of KittiCaltech Benchmarks

We provide benchmark results on the KittiCaltech Pedestrian dataset using the $10\rightarrow 1$ frame prediction setting in configs/kitticaltech.

ConvLSTM DMVFN
E3D-LSTM MAU
MIM PhyDNet
PredRNN PredRNN++
PredRNN-V2 SimVP-V1
SimVP-V2 TAU
SimVP-ConvMixer SimVP-ConvNeXt
SimVP-HorNet SimVP-MLPMixer
SimVP-MogaNet SimVP-Poolformer
SimVP-Swin SimVP-Uniformer
SimVP-VAN SimVP-ViT

Visualization of KTH Benchmarks

We provide long-term prediction benchmark results on the KTH Action dataset using the $10\rightarrow 20$ frame prediction setting in configs/kth.

ConvLSTM DMVFN
E3D-LSTM MAU
MIM PhyDNet
PredRNN PredRNN++
PredRNN-V2 SimVP-V1
SimVP-V2 TAU
SimVP-ConvMixer SimVP-ConvNeXt
SimVP-HorNet SimVP-MLPMixer
SimVP-MogaNet SimVP-Poolformer
SimVP-Swin SimVP-Uniformer
SimVP-VAN SimVP-ViT

Visualization of Human 3.6M Benchmarks

We further provide high-resolution benchmark results on the Human3.6M dataset using the $4\rightarrow 4$ frame prediction setting in configs/human.

ConvLSTM DMVFN
E3D-LSTM MAU
MIM PhyDNet
PredRNN PredRNN++
PredRNN-V2 SimVP-V1
SimVP-V2 TAU
SimVP-ConvMixer SimVP-ConvNeXt
SimVP-HorNet SimVP-MLPMixer
SimVP-MogaNet SimVP-Poolformer
SimVP-Swin SimVP-Uniformer
SimVP-VAN SimVP-ViT
