[Enhance] Add pipeline for data loading #430

Merged
merged 18 commits into open-mmlab:master from reuse_pipeline_dataset
Apr 19, 2021

Wuziyi616
Contributor

Fixes issue #412.

The dataset.show() and dataset.evaluate() functions may need to load points and ground truth. The current implementation reads them from disk with np.fromfile(), which does not work when the data is stored in Ceph. To solve this, we add an eval_pipeline to the configs and pass it as an argument to these functions. The pipeline consists purely of raw data loading operations (e.g. LoadImage, LoadPoints), so it is unaffected by data augmentation and works with whichever file client is configured.
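To illustrate the incompatibility described above: a path-based reader can only open local files, while a file client hands back raw bytes that are then decoded in memory. The sketch below is a minimal, standard-library illustration, not the mmdet3d implementation; `InMemoryClient`, `load_points`, the example key, and the four-floats-per-point layout are all assumptions. With NumPy, the analogous decoding step would be `np.frombuffer` on the returned bytes rather than `np.fromfile` on a path.

```python
from array import array


class InMemoryClient:
    """Hypothetical stand-in for a remote backend such as Ceph."""

    def __init__(self, blobs):
        self._blobs = dict(blobs)

    def get(self, path):
        # A real client would fetch the object over the network.
        return self._blobs[path]


def load_points(client, path, load_dim=4):
    """Decode C-float bytes from `client` into a list of point tuples."""
    flat = array('f')  # 'f' is a 4-byte float on common platforms
    flat.frombytes(client.get(path))
    if len(flat) % load_dim != 0:
        raise ValueError('byte length does not match load_dim')
    return [tuple(flat[i:i + load_dim]) for i in range(0, len(flat), load_dim)]


# Usage: two points with (x, y, z, intensity), stored under an assumed key.
blob = array('f', [1.0, 2.0, 3.0, 0.5, 4.0, 5.0, 6.0, 0.25]).tobytes()
client = InMemoryClient({'remote/000000.bin': blob})
points = load_points(client, 'remote/000000.bin')
```

Because the loader only ever sees bytes, swapping the client for a disk-backed or Ceph-backed one would not change the decoding code.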

@Wuziyi616
Contributor Author

Wuziyi616 commented Apr 9, 2021

The added eval_pipeline is passed as an argument through the eval_hook.

The scannet-seg dataset now also supports loading semantic segmentation masks.

I have carefully checked all the configs and added eval_pipeline to those with custom data pipelines.
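As a rough sketch of what such a config entry could look like: the fragment below uses the shorthand step names from the PR description (LoadPoints, LoadImage), which are not necessarily the registered transform names, and the `file_client_args` key and its `backend='disk'` value are assumptions for illustration. The helper `uses_only_loading` is hypothetical and only checks that no augmentation step slipped in.

```python
# Illustrative config fragment; step names and arguments are assumptions.
file_client_args = dict(backend='disk')  # could point at a Ceph backend instead

eval_pipeline = [
    dict(type='LoadPoints', file_client_args=file_client_args),
    dict(type='LoadImage', file_client_args=file_client_args),
]


def uses_only_loading(pipeline):
    """Return True if every step looks like a raw loading operation."""
    return all(step['type'].startswith('Load') for step in pipeline)


ok = uses_only_loading(eval_pipeline)
```

Keeping the file client settings in one shared dict means switching storage backends touches a single line of the config.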

@codecov

codecov bot commented Apr 9, 2021

Codecov Report

❗ No coverage uploaded for pull request base (master@2d9b97b).
The diff coverage is 67.85%.

❗ Current head a7b8d84 differs from pull request most recent head 27f05ad. Consider uploading reports for commit 27f05ad to get more accurate results.

@@            Coverage Diff            @@
##             master     #430   +/-   ##
=========================================
  Coverage          ?   50.80%           
=========================================
  Files             ?      184           
  Lines             ?    13425           
  Branches          ?     2160           
=========================================
  Hits              ?     6820           
  Misses            ?     6149           
  Partials          ?      456           
Flag        Coverage Δ
unittests   50.80% <67.85%> (?)

Impacted Files                          Coverage Δ
mmdet3d/datasets/waymo_dataset.py       10.36% <0.00%> (ø)
mmdet3d/datasets/custom_3d_seg.py       63.75% <18.18%> (ø)
mmdet3d/datasets/lyft_dataset.py        71.28% <70.00%> (ø)
mmdet3d/datasets/nuscenes_dataset.py    41.40% <70.00%> (ø)
mmdet3d/datasets/kitti_dataset.py       75.89% <72.22%> (ø)
mmdet3d/datasets/sunrgbd_dataset.py     76.04% <77.77%> (ø)
mmdet3d/datasets/custom_3d.py           72.80% <87.50%> (ø)
mmdet3d/datasets/scannet_dataset.py     92.13% <87.87%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2d9b97b...27f05ad.

@Wuziyi616
Contributor Author

Actually, the commit message shouldn't say "reuse", because I pass in a new pipeline. I used "reuse" only for ease of understanding, since the added eval_pipeline is similar to test_pipeline.

@Wuziyi616 Wuziyi616 requested a review from Tai-Wang April 13, 2021 07:04
@Wuziyi616
Contributor Author

Moved the tedious if/else conditions into dataset._extract_data(). Gave the pipeline argument a default value.

@Wuziyi616
Contributor Author

I tried tools/misc/visualize_results.py, which shows results by calling dataset.show(), and passed in config.eval_pipeline. The results are the same as before.

@Wuziyi616
Contributor Author

Basic logic now:

  • If a pipeline is passed to the show/evaluate function, use it directly.
  • If pipeline is None and self.pipeline is not None, use get_loading_pipeline(self.pipeline).
  • If pipeline and self.pipeline are both None, call _build_default_pipeline() for this dataset.
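The three-step fallback above can be sketched as follows. This is a toy illustration under stated assumptions, not the merged code: `ToyDataset` and `resolve_pipeline` are hypothetical names, and the bodies of `get_loading_pipeline` and `_build_default_pipeline` here are simplified stand-ins (the filter on step names starting with "Load" is an assumption about what counts as a loading step).

```python
def get_loading_pipeline(pipeline):
    """Stand-in: keep only steps whose type looks like a loading op."""
    return [step for step in pipeline if step['type'].startswith('Load')]


class ToyDataset:
    """Hypothetical dataset showing only the pipeline fallback logic."""

    def __init__(self, pipeline=None):
        self.pipeline = pipeline

    def _build_default_pipeline(self):
        # Stand-in default; a real dataset would define its own steps.
        return [dict(type='LoadPoints')]

    def resolve_pipeline(self, pipeline=None):
        # 1. An explicitly given pipeline is used directly.
        if pipeline is not None:
            return pipeline
        # 2. Otherwise derive the loading steps from the dataset's pipeline.
        if self.pipeline is not None:
            return get_loading_pipeline(self.pipeline)
        # 3. Otherwise fall back to the dataset's default pipeline.
        return self._build_default_pipeline()


train_like = [dict(type='LoadPoints'), dict(type='RandomFlip3D')]
dataset = ToyDataset(pipeline=train_like)
derived = dataset.resolve_pipeline()
```

Checking the explicit argument first lets callers such as an evaluation hook override whatever the dataset was constructed with.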

@Wuziyi616 Wuziyi616 requested a review from ZwwWayne April 16, 2021 03:27
Contributor Author

@Wuziyi616 Wuziyi616 left a comment

I have carefully checked all the config files, and five of them did not need to be modified.

@ZwwWayne
Collaborator

The PR can be merged after the conflicts are resolved.

@ZwwWayne ZwwWayne merged commit 78c29c3 into open-mmlab:master Apr 19, 2021
@Wuziyi616 Wuziyi616 deleted the reuse_pipeline_dataset branch April 19, 2021 15:17
Successfully merging this pull request may close these issues.

Data loading in show and evaluate functions of dataset class
3 participants