Make UCF101 dataset loading more efficient #2475
Conversation
Now the dataset is not working properly because of this line of code:

```python
indices = [i for i in range(len(video_list)) if video_list[i][len(self.root) + 1:] in selected_files]
```

Slicing with `len(self.root) + 1` only makes sense if there is no trailing `/` on the root:

```python
>>> root = 'data/ucf-101/videos'
>>> video_path = 'data/ucf-101/videos/activity/video.avi'
>>> video_path[len(root):]
'/activity/video.avi'
>>> video_path[len(root) + 1:]
'activity/video.avi'
```

Appending the root path to the selected files as well is a simple solution and makes the dataset work both with and without a trailing slash.
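A minimal sketch of that fix (the helper name `select_indices` is hypothetical, not torchvision API): joining the root onto each annotation entry with `os.path.join` makes the comparison insensitive to a trailing slash on `root`.

```python
import os

def select_indices(video_list, root, selected_files):
    # Prepend the root to each annotation entry so the comparison works
    # whether or not `root` ends with a trailing slash; os.path.join
    # never produces a doubled separator.
    selected = {os.path.join(root, f) for f in selected_files}
    return [i for i, path in enumerate(video_list) if path in selected]

videos = ["data/ucf-101/videos/activity/video.avi"]

# Same result with and without a trailing slash on root:
assert select_indices(videos, "data/ucf-101/videos", ["activity/video.avi"]) == [0]
assert select_indices(videos, "data/ucf-101/videos/", ["activity/video.avi"]) == [0]
```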
Hi,
Thanks for the PR and sorry for taking so much time to get back to you.
The PR looks good, there are just a few linter errors that should be fixed.
But before merging this, I would prefer that we add some tests to UCF101 (that @andfoy will be working on), so that we double-check that we are not missing anything.
Codecov Report
```
@@            Coverage Diff             @@
##           master    #2475      +/-   ##
==========================================
+ Coverage   68.83%   71.58%    +2.74%
==========================================
  Files          94       94
  Lines        7920     8861      +941
  Branches     1249     1629      +380
==========================================
+ Hits         5452     6343      +891
+ Misses       2075     2052       -23
- Partials      393      466       +73
```
Continue to review full report at Codecov.
Very good idea to speed this up!
Thanks a lot for this PR!
Unfortunately we can't merge this as is because it will break backwards-compatibility with downstream projects that depend on torchvision.
While this PR is definitely an improvement compared to what we had before, we would need to check with the ClassyVision team cc @stephenyan1231 @vreis about this change, as it would require a change on the ClassyVision side as well.
```python
self.indices = self._select_fold(video_list, annotation_path, fold, train)
self.video_clips = video_clips.subset(self.indices)
self.transform = transform
self.video_clips_metadata = self.video_clips.metadata
```
This is unfortunately a BC-breaking change as other downstream projects rely on this behavior, see ClassyVision for example.
The original thinking behind this approach was that one could cache the metadata and re-use it over different dataset invocations, so that the creation time would be amortized.
It could indeed have been possible to create separate metadata for each fold from the beginning, but now that it's been done as is, we unfortunately have to keep it for backwards-compatibility reasons.
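The amortization idea mentioned above can be sketched generically (the helper below is hypothetical and not torchvision API; in practice the metadata comes from `VideoClips` and is expensive because it scans every video on disk):

```python
import os
import pickle
import tempfile

def load_or_compute_metadata(cache_path, compute_fn):
    """Return metadata from the on-disk cache if present; otherwise
    compute it once and cache it for later dataset invocations."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    metadata = compute_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(metadata, f)
    return metadata

calls = []
def expensive_scan():
    # Stand-in for indexing every video on disk.
    calls.append(1)
    return {"video_paths": ["a.avi", "b.avi"], "fps": [25, 30]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metadata.pkl")
    m1 = load_or_compute_metadata(path, expensive_scan)
    m2 = load_or_compute_metadata(path, expensive_scan)  # served from cache
    assert m1 == m2
    assert len(calls) == 1  # the expensive scan ran only once
```

With this pattern, the full-dataset metadata is computed once and re-used across train/test instantiations, which is what makes per-fold metadata a BC concern.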
@stephenyan1231: I'll defer to you on this one. Seems like a good change, and in general I don't mind breaking changes. But I don't know if we rely on this behavior in subtle ways.
Any news on this?
Thanks @Guillem96 for this contribution and sorry for taking so long to get back with an update. As you might have read, the dataset API is being reworked and datasets are being moved to the new API (as detailed here). I'm not yet sure what the best course of action is, but here are two possible paths:
In both paths, I would suggest having two separate PRs so that the work is easier to review. What works best for you @Guillem96? Feel free to ask if you have questions and/or other suggestions. Any other suggestions @pmeier on this?
We are currently trying to figure out the best way to do this in #5422 cc @bjuncek. The new design will not re-use the
@pmeier fine with me! Thanks for your comments 👌🏼
Now when creating a UCF101 dataset, you specify both the fold ({1, 2, 3}) and the split (train or test). Due to the current implementation, the video clip generation is agnostic to the selected fold and split, meaning that the `VideoClips` helper class processes all the videos first, and only afterwards are the clips filtered according to the selected split and fold. This is very inefficient. For example, if you want to load only the test set, you have to go through around 900 videos while the test set only contains around 350 videos.
To solve this I:
- run `make_dataset` independently from split and fold;
- create the `VideoClips` object only for the selected videos.
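The proposed reordering can be sketched like this (all names below — `select_fold`, `FakeVideoClips`, the paths, and the annotation format — are hypothetical stand-ins, not the PR's actual code): filter the video paths by fold/split first, then hand only the surviving files to the clip indexer.

```python
import os

def select_fold(video_list, annotation_lines, root):
    # UCF101-style split files list entries like "ApplyEyeMakeup/v1.avi 1";
    # join the root so matching is robust to a trailing slash.
    selected = {os.path.join(root, line.split()[0]) for line in annotation_lines}
    return [p for p in video_list if p in selected]

class FakeVideoClips:
    # Stand-in for torchvision's VideoClips: it only records which
    # videos it would have to scan and decode.
    def __init__(self, video_paths):
        self.video_paths = list(video_paths)

all_videos = [
    "data/videos/ApplyEyeMakeup/v1.avi",
    "data/videos/Archery/v2.avi",
    "data/videos/Basketball/v3.avi",
]
annotations = ["ApplyEyeMakeup/v1.avi 1", "Basketball/v3.avi 2"]

# Old order: index all 3 videos, then subset.
# New order: filter first, so only 2 videos are ever indexed.
fold_videos = select_fold(all_videos, annotations, "data/videos")
clips = FakeVideoClips(fold_videos)
assert len(clips.video_paths) == 2
```

The saving scales with the discarded fraction: for a test-only load, roughly 900 videos shrink to about 350 before any decoding happens.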