Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Support loading annotation from file for video SR datasets #423

Merged
merged 2 commits into from
Jul 23, 2021

Conversation

ckkelvinchan
Copy link
Member

Motivation

As raised in #395, some datasets use glob or scandir when loading annotations. These functions are not compatible with ceph. As a result, it is unable to train some models when ceph is used.

Modifications

  1. For SRFolderMultipleGTDataset and SRFolderVideoDataset, an argument ann_file is added. If it is provided, annotations will be loaded from the file.
  2. For SRVimeo90KMultipleGTDataset, glob is removed, and the file paths are replaced by the standard Vimeo-90K file names.


with open(self.ann_file, 'r') as fin:
for line in fin:
key, max_frame_num = line.strip().split(' ')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use rsplit, and restrict the split num to 2.

This way, spaces in paths are supported.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If using .split(' ', 2), for path 1/000.png 10, it will become ['path', 1/000.png', '10'] Need to join it back if we want to support spaces in paths.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the number is num of splits. So 1 split splits to 2 items

note the "r" in rsplit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed rsplit. It should work then.

with open(self.ann_file, 'r') as fin:
for line in fin:
key, max_frame_num = line.strip().split(' ')
sequence = key.split('/')[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not hardcode path separator.

gt_paths = sorted(
glob.glob(osp.join(self.gt_folder, key, '*.png')))
lq_paths = [
osp.join(self.lq_folder, key, f'im{i}.png')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this pr, but the hardcoded png format is a headache also.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the official format of Vimeo-90K, I think it is reasonable to take this format (?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. Our codebase is for developers, who will use their own format (whatever their dataset happens uses)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is for Vimeo-90K only. SRFolderVideoDataset or SRFolderMultipleGTDataset should be used for general purposes.

Actually we could support start_idx and filename_tmpl. But this way cannot support general cases. For instance, it fails when the files are named im01.png, im03.png, im05.png, im07.png, im09.png, img11.png, img13.png.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay lets keep it as is for now.

@codecov
Copy link

codecov bot commented Jul 8, 2021

Codecov Report

Merging #423 (d305153) into master (912fad1) will increase coverage by 0.07%.
The diff coverage is 88.95%.

❗ Current head d305153 differs from pull request most recent head bd36131. Consider uploading reports for the commit bd36131 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #423      +/-   ##
==========================================
+ Coverage   80.64%   80.72%   +0.07%     
==========================================
  Files         188      189       +1     
  Lines       10107    10071      -36     
  Branches     1485     1495      +10     
==========================================
- Hits         8151     8130      -21     
+ Misses       1741     1723      -18     
- Partials      215      218       +3     
Flag Coverage Δ
unittests 80.72% <88.95%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ls/components/stylegan2/generator_discriminator.py 83.73% <ø> (-1.21%) ⬇️
mmedit/models/components/stylegan2/modules.py 63.92% <ø> (-0.49%) ⬇️
mmedit/datasets/builder.py 91.07% <66.66%> (+1.59%) ⬆️
mmedit/models/restorers/liif.py 58.10% <80.00%> (-6.98%) ⬇️
mmedit/models/backbones/sr_backbones/liif_net.py 87.59% <87.59%> (ø)
mmedit/datasets/sr_folder_video_dataset.py 98.55% <92.85%> (-1.45%) ⬇️
mmedit/core/misc.py 100.00% <100.00%> (ø)
mmedit/datasets/sr_folder_multiple_gt_dataset.py 100.00% <100.00%> (ø)
mmedit/datasets/sr_vimeo90k_multiple_gt_dataset.py 100.00% <100.00%> (ø)
mmedit/models/backbones/__init__.py 100.00% <100.00%> (ø)
... and 32 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 912fad1...bd36131. Read the comment docs.

data_infos = []

with open(self.ann_file, 'r') as fin:
for line in fin:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use mmcv.list_from_file

self.folders = {}
data_infos = []

with open(self.ann_file, 'r') as fin:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use list_from_file

]
gt_paths = [
osp.join(self.gt_folder, key, f'im{i}.png')
for i in range(1, 8)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 is a magic number. Can it be made into a variable at least?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it can be done. I will update this in the afternoon.

@innerlee innerlee merged commit 6f6ebd6 into open-mmlab:master Jul 23, 2021
Yshuo-Li pushed a commit to Yshuo-Li/mmediting that referenced this pull request Jul 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants