Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Kitti and Sintel datasets for optical flow #4845

Merged
merged 10 commits into from
Nov 4, 2021

Conversation

NicolasHug
Copy link
Member

@NicolasHug NicolasHug commented Nov 3, 2021

This PR Adds the Kitti and Sintel datasets for optical flow.

Here are a few design decisions, following our initial discussion (in this PR):

  • Unlike Sintel, Kitti has a built-in valid mask indicating which flow values are returned. As a resut Sintel returns img1, img2, flow while Kitti returns img1, img2, flow, valid.

  • For both these datasets, the targets aren't known if split="test" so we return img1, img2, None and img1, img2, None, None in these cases.

  • For both, flow is a numpy array of shape (2, H, W) and valid is a boolean numpy array of shape (H, W). The images are PIL images.

  • The transforms expect to receive img1, img2, flow, valid, even for Sintel, and even in test mode.

  • add tests

  • write docs

cc @pmeier

@facebook-github-bot
Copy link

facebook-github-bot commented Nov 3, 2021

💊 CI failures summary and remediations

As of commit ac4eaab (more details on the Dr. CI page):


  • 2/2 failures introduced in this PR

2 failures not recognized by patterns:

Job Step Action
CircleCI binary_linux_conda_py3.6_cu111 packaging/build_conda.sh 🔁 rerun
CircleCI binary_win_conda_py3.9_cu113 Build conda packages 🔁 rerun

1 job timed out:

  • binary_linux_conda_py3.6_cu111

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@NicolasHug NicolasHug marked this pull request as draft November 3, 2021 11:30
torchvision/datasets/_optical_flow.py Outdated Show resolved Hide resolved
torchvision/datasets/_optical_flow.py Show resolved Hide resolved
torchvision/datasets/_optical_flow.py Outdated Show resolved Hide resolved
torchvision/datasets/_optical_flow.py Outdated Show resolved Hide resolved
torchvision/datasets/__init__.py Show resolved Hide resolved
Copy link
Contributor

@datumbox datumbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few thoughts:

torchvision/datasets/_optical_flow.py Show resolved Hide resolved
torchvision/datasets/_optical_flow.py Outdated Show resolved Hide resolved
@NicolasHug NicolasHug marked this pull request as ready for review November 4, 2021 10:06
@NicolasHug
Copy link
Member Author

@datumbox @pmeier @fmassa thanks a lot for your very useful feedback.

I updated the PR accordingly (see original PR message above) and added tests and docs. This is now ready for a proper review.

@NicolasHug
Copy link
Member Author

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@vadimkantorov
Copy link

vadimkantorov commented Nov 4, 2021

There were problems with older datasets that returning tuple doesn't allow for easy adding extra fields in the future, e.g. #3608. I propose to return img, target_dict tuple (even better may be always returning a single dict for the cases of skipping actual image loading). And there is more unity when iterating between different related datasets.

@pmeier
Copy link
Collaborator

pmeier commented Nov 4, 2021

@vadimkantorov We are aware of that and the new prototype dataset API will always return a single dictionary per sample. There is no documentation yet, but if you want to you can have a look into torchvision/prototype/datasets to see how things will look like in the future. Plus, if you have feedback, that is very welcome!

@vadimkantorov
Copy link

vadimkantorov commented Nov 4, 2021

If you're committed to using only tuples, please also return some internal example id, image/video file name or something of this form. Then at least people can hack around and fetch additional information

@NicolasHug
Copy link
Member Author

Thanks for your input @vadimkantorov

I don't think any of the datasets return that kind of info so far (sadly), so returning a dict with extra info would make KittiFlow and Sintel outliers w.r.t. what we currently have. Returning the ids etc. is definitely useful, but such new design is better suited for a complete re-work of the datasets, which we are currently doing as @pmeier pointed out above.

Copy link
Contributor

@datumbox datumbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Definitely interesting to see how the ideas mentioned above can be incorporated on the new API but I think it's worth merging this PR to unblock your work on the flow models.

Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @NicolasHug!

@NicolasHug NicolasHug merged commit 50a3571 into pytorch:main Nov 4, 2021
@NicolasHug NicolasHug mentioned this pull request Nov 4, 2021
12 tasks
facebook-github-bot pushed a commit that referenced this pull request Nov 8, 2021
Reviewed By: kazhang

Differential Revision: D32216685

fbshipit-source-id: ec74c2a573eace36bd4a0cf9913ea1dc77fcf260
cyyever pushed a commit to cyyever/vision that referenced this pull request Nov 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants