[Feature] Add config file of FLAVR (open-mmlab#867)
* [Feature] Add config file of FLAVR

* Update
Yshuo-Li authored and wangruohui committed May 19, 2022
1 parent b77cc3e commit bbd7d95
Showing 4 changed files with 236 additions and 0 deletions.
39 changes: 39 additions & 0 deletions configs/video_interpolators/flavr/README.md
@@ -0,0 +1,39 @@
# FLAVR (arXiv'2020)

> [FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation](https://arxiv.org/pdf/2012.08512.pdf)
<!-- [ALGORITHM] -->

## Abstract

<!-- [ABSTRACT] -->

Most modern frame interpolation approaches rely on explicit bidirectional optical flows between adjacent frames, thus are sensitive to the accuracy of underlying flow estimation in handling occlusions while additionally introducing computational bottlenecks unsuitable for efficient deployment. In this work, we propose a flow-free approach that is completely end-to-end trainable for multi-frame video interpolation. Our method, FLAVR, is designed to reason about non-linear motion trajectories and complex occlusions implicitly from unlabeled videos and greatly simplifies the process of training, testing and deploying frame interpolation models. Furthermore, FLAVR delivers up to 6× speed up compared to the current state-of-the-art methods for multi-frame interpolation while consistently demonstrating superior qualitative and quantitative results compared with prior methods on popular benchmarks including Vimeo-90K, Adobe-240FPS, and GoPro. Finally, we show that frame interpolation is a competitive self-supervised pre-training task for videos via demonstrating various novel applications of FLAVR including action recognition, optical flow estimation, motion magnification, and video object tracking. Code and trained models are provided in the supplementary material.

<!-- [IMAGE] -->

<div align=center >
<img src="https://user-images.githubusercontent.com/56712176/169070212-52acdcea-d732-4441-9983-276e2e40b195.png" width="400"/>
</div >

## Results and models

Evaluated on RGB channels.
The metrics are `PSNR / SSIM`.

| Method | scale | Vimeo90k-triplet | Download |
| :------------------------------------------------------------------------------------------------------------------: | :---: | :---------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| [flavr_in4out1_g8b4_vimeo90k_septuplet](/configs/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septuplet.py) | x2 | 36.3340 / 0.96015 | [model](https://download.openmmlab.com/mmediting/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septupli-c2468995.pth) \| [log](https://download.openmmlab.com/mmediting/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septupli-c2468995.log.json) |

Note: FLAVR for the x8 VFI task will be supported in the future.
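
As a reminder of what the PSNR column measures, here is a minimal sketch of the standard PSNR definition in plain Python. The function name `psnr` and the flat-list input are illustrative assumptions; this is not the evaluation code used by the repository, which may differ in details such as data range and border cropping.

```python
import math


def psnr(pred, target, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences.

    `pred` and `target` are flat sequences of pixel values (hypothetical input
    format chosen for this sketch); `max_val` is the largest possible value.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values (all RGB channels flattened).
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Higher values mean the interpolated frame is closer to the ground-truth frame; identical inputs give infinity.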

## Citation

```bibtex
@article{kalluri2020flavr,
title={Flavr: Flow-agnostic video representations for fast frame interpolation},
author={Kalluri, Tarun and Pathak, Deepak and Chandraker, Manmohan and Tran, Du},
journal={arXiv preprint arXiv:2012.08512},
year={2020}
}
```
@@ -0,0 +1,174 @@
exp_name = 'flavr_in4out1_g8b4_vimeo90k_septuplet'

# model settings
model = dict(
    type='BasicInterpolator',
    generator=dict(
        type='FLAVRNet',
        num_input_frames=4,
        num_output_frames=1,
        mid_channels_list=[512, 256, 128, 64],
        encoder_layers_list=[2, 2, 2, 2],
        bias=False,
        norm_cfg=None,
        join_type='concat',
        up_mode='transpose'),
    pixel_loss=dict(type='L1Loss', loss_weight=1.0, reduction='mean'))
# model training and testing settings
train_cfg = None
test_cfg = dict(metrics=['PSNR', 'SSIM', 'MAE'], crop_border=0)

# dataset settings
train_dataset_type = 'VFIVimeo90K7FramesDataset'
val_dataset_type = 'VFIVimeo90K7FramesDataset'

train_pipeline = [
    dict(
        type='LoadImageFromFileList',
        io_backend='disk',
        key='inputs',
        channel_order='rgb',
        backend='pillow'),
    dict(
        type='LoadImageFromFileList',
        io_backend='disk',
        key='target',
        channel_order='rgb',
        backend='pillow'),
    dict(type='FixedCrop', keys=['inputs', 'target'], crop_size=(256, 256)),
    dict(
        type='Flip',
        keys=['inputs', 'target'],
        flip_ratio=0.5,
        direction='horizontal'),
    dict(
        type='Flip',
        keys=['inputs', 'target'],
        flip_ratio=0.5,
        direction='vertical'),
    dict(
        type='ColorJitter',
        keys=['inputs', 'target'],
        channel_order='rgb',
        brightness=0.05,
        contrast=0.05,
        saturation=0.05,
        hue=0.05),
    dict(type='TemporalReverse', keys=['inputs'], reverse_ratio=0.5),
    dict(type='RescaleToZeroOne', keys=['inputs', 'target']),
    dict(type='FramesToTensor', keys=['inputs', 'target']),
    dict(
        type='Collect',
        keys=['inputs', 'target'],
        meta_keys=['inputs_path', 'target_path', 'key'])
]

valid_pipeline = [
    dict(
        type='LoadImageFromFileList',
        io_backend='disk',
        key='inputs',
        channel_order='rgb',
        backend='pillow'),
    dict(
        type='LoadImageFromFileList',
        io_backend='disk',
        key='target',
        channel_order='rgb',
        backend='pillow'),
    dict(type='RescaleToZeroOne', keys=['inputs', 'target']),
    dict(type='FramesToTensor', keys=['inputs', 'target']),
    dict(
        type='Collect',
        keys=['inputs', 'target'],
        meta_keys=['inputs_path', 'target_path', 'key'])
]

demo_pipeline = [
    dict(
        type='LoadImageFromFileList',
        io_backend='disk',
        key='inputs',
        channel_order='rgb',
        backend='pillow'),
    dict(type='RescaleToZeroOne', keys=['inputs']),
    dict(type='FramesToTensor', keys=['inputs']),
    dict(type='Collect', keys=['inputs'], meta_keys=['inputs_path', 'key'])
]

root_dir = 'data/vimeo90k'
data = dict(
    workers_per_gpu=16,
    train_dataloader=dict(samples_per_gpu=4),  # 8 gpu
    val_dataloader=dict(samples_per_gpu=1),
    test_dataloader=dict(samples_per_gpu=1),

    # train
    train=dict(
        type=train_dataset_type,
        folder=f'{root_dir}/GT',
        ann_file=f'{root_dir}/sep_trainlist.txt',
        pipeline=train_pipeline,
        input_frames=[1, 3, 5, 7],
        target_frames=[4],
        test_mode=False),
    # val
    val=dict(
        type=train_dataset_type,
        folder=f'{root_dir}/GT',
        ann_file=f'{root_dir}/sep_testlist.txt',
        pipeline=valid_pipeline,
        input_frames=[1, 3, 5, 7],
        target_frames=[4],
        test_mode=True),
    # test
    test=dict(
        type=train_dataset_type,
        folder=f'{root_dir}/GT',
        ann_file=f'{root_dir}/sep_testlist.txt',
        pipeline=valid_pipeline,
        input_frames=[1, 3, 5, 7],
        target_frames=[4],
        test_mode=True),
)

# optimizer
optimizers = dict(generator=dict(type='Adam', lr=2e-4, betas=(0.9, 0.99)))

# learning policy
total_iters = 1000000 # >=200*64612/64
lr_config = dict(
    policy='Reduce',
    by_epoch=False,
    mode='max',
    val_metric='PSNR',
    epoch_base_valid=True,  # Support epoch base valid in iter base runner.
    factor=0.5,
    patience=10,
    cooldown=20,
    verbose=True)

checkpoint_config = dict(interval=2020, save_optimizer=True, by_epoch=False)

evaluation = dict(interval=2020, save_image=False, gpu_collect=True)
log_config = dict(
    interval=100,
    hooks=[
        dict(type='TextLoggerHook', by_epoch=False),
        dict(
            type='TensorboardLoggerHook',
            log_dir=f'work_dirs/{exp_name}/tb_log/',
            interval=100,
            ignore_last=False,
            reset_flag=False,
            by_epoch=False),
    ])
visual_config = None

# runtime settings
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = f'./work_dirs/{exp_name}'
load_from = None
resume_from = None
workflow = [('train', 1)]
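
The schedule numbers in this config can be cross-checked with a little arithmetic. The sketch below assumes 8 GPUs (read from the `g8` in `exp_name` and the `# 8 gpu` comment) and 64612 training clips (read from the `total_iters` comment). The helper `schedule_summary` is hypothetical and not part of the repository. Under these assumptions, one pass over the training list takes about 2020 iterations, which matches the `interval=2020` used for checkpointing and evaluation.

```python
import math


def schedule_summary(num_gpus=8, samples_per_gpu=4, num_train_clips=64612,
                     total_iters=1000000):
    """Relate the iteration-based settings to epochs (illustrative only)."""
    # Samples consumed per training iteration across all GPUs.
    effective_batch = num_gpus * samples_per_gpu
    # Iterations needed to see every training clip once.
    iters_per_epoch = math.ceil(num_train_clips / effective_batch)
    # Lower bound quoted in the config comment: >= 200 * 64612 / 64.
    commented_lower_bound = 200 * num_train_clips / 64
    return {
        'effective_batch': effective_batch,
        'iters_per_epoch': iters_per_epoch,
        'approx_epochs': total_iters / iters_per_epoch,
        'meets_commented_bound': total_iters >= commented_lower_bound,
    }


summary = schedule_summary()
```

Note that the comment on `total_iters` divides by 64 rather than 32; the sketch evaluates that bound exactly as written and does not try to explain the difference.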
22 changes: 22 additions & 0 deletions configs/video_interpolators/flavr/metafile.yml
@@ -0,0 +1,22 @@
Collections:
- Metadata:
    Architecture:
    - FLAVR
  Name: FLAVR
  Paper:
  - https://arxiv.org/pdf/2012.08512.pdf
  README: configs/video_interpolators/flavr/README.md
Models:
- Config: configs/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septuplet.py
  In Collection: FLAVR
  Metadata:
    Training Data: VIMEO90K
  Name: flavr_in4out1_g8b4_vimeo90k_septuplet
  Results:
  - Dataset: VIMEO90K
    Metrics:
      Vimeo90k-triplet:
        PSNR: 36.334
        SSIM: 0.96015
    Task: Video_interpolators
  Weights: https://download.openmmlab.com/mmediting/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septupli-c2468995.pth
1 change: 1 addition & 0 deletions model-index.yml
@@ -26,4 +26,5 @@ Import:
- configs/synthesizers/cyclegan/metafile.yml
- configs/synthesizers/pix2pix/metafile.yml
- configs/video_interpolators/cain/metafile.yml
- configs/video_interpolators/flavr/metafile.yml
- configs/video_interpolators/tof/metafile.yml
