This repo is a PyTorch implementation of applying MogaNet to unsupervised video prediction with SimVP on Moving MNIST. The code is based on SimVPv2 (or its latest version, OpenSTL). It is worth noting that the Translator module in SimVP can be replaced by any MetaFormer block, which makes it possible to benchmark the video prediction performance of MetaFormers. For more details, see Efficient Multi-order Gated Aggregation Network (ICLR 2024).
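To make the swappable-Translator idea concrete, here is a minimal PyTorch sketch (hypothetical names, not this repo's actual API): a generic MetaFormer block whose token mixer can be exchanged to benchmark a different architecture as the Translator.

```python
import torch
import torch.nn as nn

class MetaFormerBlock(nn.Module):
    """Generic MetaFormer block: token mixer + channel MLP, both residual."""
    def __init__(self, dim, token_mixer):
        super().__init__()
        self.norm1 = nn.BatchNorm2d(dim)
        self.mixer = token_mixer          # e.g. a Moga / gSTA / ConvNeXt mixer
        self.norm2 = nn.BatchNorm2d(dim)
        self.mlp = nn.Sequential(
            nn.Conv2d(dim, 4 * dim, 1), nn.GELU(), nn.Conv2d(4 * dim, dim, 1))

    def forward(self, x):
        x = x + self.mixer(self.norm1(x))
        return x + self.mlp(self.norm2(x))

def make_translator(dim, depth, mixer_factory):
    # Swapping `mixer_factory` benchmarks a different MetaFormer as Translator.
    return nn.Sequential(*[MetaFormerBlock(dim, mixer_factory(dim))
                           for _ in range(depth)])

# A depthwise convolution as a simple stand-in token mixer; the input is the
# latent feature map produced by SimVP's encoder (exact reshaping differs
# across SimVP versions).
translator = make_translator(
    64, depth=8,
    mixer_factory=lambda d: nn.Conv2d(d, d, 7, padding=3, groups=d))
x = torch.randn(2, 64, 16, 16)
print(translator(x).shape)  # torch.Size([2, 64, 16, 16])
```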
Install SimVPv2 with pip as follows. It can also be installed with `environment.yml`:

```bash
python setup.py develop
```
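For the conda route, something like the following should work (the environment name below is an assumption; check the `name:` field inside `environment.yml`):

```bash
conda env create -f environment.yml
conda activate simvp   # environment name is an assumption
python setup.py develop
```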
Prepare the Moving MNIST dataset with the provided script according to the guidelines.
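For reference, the widely used Moving MNIST test split (`mnist_test_seq.npy` from the University of Toronto release) can be inspected with a few lines of NumPy; the file path below is an assumption, and SimVPv2's own dataloader handles this for you:

```python
import numpy as np

# mnist_test_seq.npy stores sequences as (T=20, N=10000, H=64, W=64)
data = np.load('data/moving_mnist/mnist_test_seq.npy')   # path: assumption
data = data.transpose(1, 0, 2, 3)[:, :, None] / 255.0    # (N, T, 1, 64, 64) in [0, 1]
inputs, targets = data[:, :10], data[:, 10:]             # predict 10 frames from 10
print(inputs.shape, targets.shape)  # (10000, 10, 1, 64, 64) twice
```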
Notes: All models are trained for 200 and 2000 epochs with the Adam optimizer and the OneCycle learning rate scheduler. The trained models can also be downloaded from Baidu Cloud (z8mf) at `MogaNet/MMNIST_VP`. The Params (M) and FLOPs (G) are measured by `non_dist_train.py` by setting `--fps`. Please refer to video_benchmarks for the full results.
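If you want to sanity-check the Params column against your own build of a model, a plain PyTorch count suffices (a generic sketch, not the repo's measuring code):

```python
import torch.nn as nn

def count_params(model: nn.Module) -> float:
    """Trainable parameter count in millions."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

# usage with any nn.Module:
# print(f"{count_params(model):.1f}M")
```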
| Architecture | Setting | Params | FLOPs | FPS | MSE | MAE | SSIM | PSNR | Download |
|---|---|---|---|---|---|---|---|---|---|
| IncepU (SimVPv1) | 200 epochs | 58.0M | 19.4G | 209 | 32.15 | 89.05 | 0.9268 | 21.84 | model \| log |
| gSTA (SimVPv2) | 200 epochs | 46.8M | 16.5G | 282 | 26.69 | 77.19 | 0.9402 | 22.78 | model \| log |
| ViT | 200 epochs | 46.1M | 16.9G | 290 | 35.15 | 95.87 | 0.9139 | 21.67 | model \| log |
| Swin Transformer | 200 epochs | 46.1M | 16.4G | 294 | 29.70 | 84.05 | 0.9331 | 22.22 | model \| log |
| Uniformer | 200 epochs | 44.8M | 16.5G | 296 | 30.38 | 85.87 | 0.9308 | 22.13 | model \| log |
| MLP-Mixer | 200 epochs | 38.2M | 14.7G | 334 | 29.52 | 83.36 | 0.9338 | 22.22 | model \| log |
| ConvMixer | 200 epochs | 3.9M | 5.5G | 658 | 32.09 | 88.93 | 0.9259 | 21.93 | model \| log |
| Poolformer | 200 epochs | 37.1M | 14.1G | 341 | 31.79 | 88.48 | 0.9271 | 22.03 | model \| log |
| ConvNeXt | 200 epochs | 37.3M | 14.1G | 344 | 26.94 | 77.23 | 0.9397 | 22.74 | model \| log |
| VAN | 200 epochs | 44.5M | 16.0G | 288 | 26.10 | 76.11 | 0.9417 | 22.89 | model \| log |
| HorNet | 200 epochs | 45.7M | 16.3G | 287 | 29.64 | 83.26 | 0.9331 | 22.26 | model \| log |
| MogaNet | 200 epochs | 46.8M | 16.5G | 255 | 25.57 | 75.19 | 0.9429 | 22.99 | model \| log |
| IncepU (SimVPv1) | 2000 epochs | 58.0M | 19.4G | 209 | 21.15 | 64.15 | 0.9536 | 23.99 | model \| log |
| gSTA (SimVPv2) | 2000 epochs | 46.8M | 16.5G | 282 | 15.05 | 49.80 | 0.9675 | 25.97 | model \| log |
| ViT | 2000 epochs | 46.1M | 16.9G | 290 | 19.74 | 61.65 | 0.9539 | 24.59 | model \| log |
| Swin Transformer | 2000 epochs | 46.1M | 16.4G | 294 | 19.11 | 59.84 | 0.9584 | 24.53 | model \| log |
| Uniformer | 2000 epochs | 44.8M | 16.5G | 296 | 18.01 | 57.52 | 0.9609 | 24.92 | model \| log |
| MLP-Mixer | 2000 epochs | 38.2M | 14.7G | 334 | 18.85 | 59.86 | 0.9589 | 24.58 | model \| log |
| ConvMixer | 2000 epochs | 3.9M | 5.5G | 658 | 22.30 | 67.37 | 0.9507 | 23.73 | model \| log |
| Poolformer | 2000 epochs | 37.1M | 14.1G | 341 | 20.96 | 64.31 | 0.9539 | 24.15 | model \| log |
| ConvNeXt | 2000 epochs | 37.3M | 14.1G | 344 | 17.58 | 55.76 | 0.9617 | 25.06 | model \| log |
| VAN | 2000 epochs | 44.5M | 16.0G | 288 | 16.21 | 53.57 | 0.9646 | 25.49 | model \| log |
| HorNet | 2000 epochs | 45.7M | 16.3G | 287 | 17.40 | 55.70 | 0.9624 | 25.14 | model \| log |
| MogaNet | 2000 epochs | 46.8M | 16.5G | 255 | 15.67 | 51.84 | 0.9661 | 25.70 | model \| log |
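For context on how the MSE/MAE/PSNR columns are typically computed on Moving MNIST, here is a hedged sketch; the exact conventions (e.g. per-frame sums versus per-pixel means) are defined by the benchmark code, not here:

```python
import numpy as np

def mse_mae_psnr(pred: np.ndarray, true: np.ndarray):
    """pred/true: (N, T, C, H, W) arrays scaled to [0, 1]."""
    diff = pred - true
    n_frames = diff.shape[0] * diff.shape[1]
    mse = (diff ** 2).sum() / n_frames          # sum over pixels, averaged per frame
    mae = np.abs(diff).sum() / n_frames
    psnr = 10 * np.log10(1.0 / (diff ** 2).mean())  # per-pixel MSE, peak value 1.0
    return mse, mae, psnr

# SSIM is usually computed per frame, e.g. with
# skimage.metrics.structural_similarity, then averaged.
```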
- Video Benchmarks and visualizations.
- Weather Benchmarks and visualizations.
- Traffic Benchmarks and visualizations.
We train the model on a single GPU by default (a batch size of 16 for SimVP). Start training with the bash script as follows:

```bash
python tools/non_dist_train.py -d mmnist -m SimVP --model_type moga -c configs/mmnist/simvp/SimVP_MogaNet.py --lr 1e-3 --ex_name mmnist_simvp_moga
```
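The config file is where the Translator choice and the training schedule are wired together. The following is only a schematic of what such a config typically contains; field names and values are illustrative assumptions, so consult the actual `configs/mmnist/simvp/SimVP_MogaNet.py` for the real ones:

```python
# schematic config -- field names and values are illustrative assumptions
method = 'SimVP'
model_type = 'moga'   # selects the MogaNet block as the Translator
# model
hid_S = 64            # spatial hidden channels (encoder/decoder)
hid_T = 512           # temporal hidden channels (Translator)
N_S = 4               # encoder/decoder depth
N_T = 8               # Translator depth
# training
lr = 1e-3
batch_size = 16
sched = 'onecycle'
```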
We test the trained model on a single GPU with the bash script as follows:

```bash
python tools/non_dist_test.py -d mmnist -m SimVP --model_type moga -c configs/mmnist/simvp/SimVP_MogaNet.py --ex_name /path/to/exp_name
```
If you find this repository helpful, please consider citing:
```bibtex
@inproceedings{iclr2024MogaNet,
  title={Efficient Multi-order Gated Aggregation Network},
  author={Siyuan Li and Zedong Wang and Zicheng Liu and Cheng Tan and Haitao Lin and Di Wu and Zhiyuan Chen and Jiangbin Zheng and Stan Z. Li},
  booktitle={International Conference on Learning Representations},
  year={2024}
}
```
Our video prediction implementation is mainly based on the following codebases. We gratefully thank the authors for their wonderful works.