[feat] Addition of popular image retrieval benchmark datasets
#724
Hi,
this is a PR for issue #722. I've implemented four benchmark datasets: CUB-200, Cars196, INaturalist2018, and StanfordOnlineProducts. When any of these datasets is used, it is downloaded directly and saved to the root directory, similar to how PyTorch handles its datasets. Each of the implemented datasets inherits from torch.utils.data.Dataset and can be used with dataloaders seamlessly. I've also added docs for each of the implemented datasets, plus a short overview of what users need to implement if they want to add their own custom dataset. Tests for each of the datasets are also added. I've deliberately left out the `__init__.py` in tests/datasets, as each of the files has to be downloaded, and these can be pretty big (up to 130 GB).

@KevinMusgrave when you have time, please take a look and tell me if something requires changing.
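For reviewers who want a quick picture of the pattern described above, here is a minimal sketch of a map-style dataset that populates its root directory on first use. Everything in it is hypothetical and only for illustration: the class name `ToyRetrievalDataset`, the `labels.txt` file, and the `download` flag are assumptions, not the API added in this PR, and `_download` writes a few fake entries locally instead of fetching anything.

```python
import os
import tempfile

try:
    from torch.utils.data import Dataset
except ImportError:  # fallback so the sketch still runs without PyTorch installed
    Dataset = object


class ToyRetrievalDataset(Dataset):
    """Hypothetical dataset for illustration; not one of the classes in this PR."""

    FILENAME = "labels.txt"  # hypothetical file name

    def __init__(self, root, download=False):
        self.root = root
        path = os.path.join(root, self.FILENAME)
        # Only "download" when asked to and when the file is not already in root.
        if download and not os.path.exists(path):
            self._download(path)
        if not os.path.exists(path):
            raise FileNotFoundError(f"{path} not found; pass download=True")
        with open(path) as f:
            # Each line holds "<image name> <integer class label>".
            self.samples = [
                (name, int(label)) for name, label in (line.split() for line in f)
            ]

    def _download(self, path):
        # Stand-in for a real download: writes three fake entries to root.
        os.makedirs(self.root, exist_ok=True)
        with open(path, "w") as f:
            f.write("img_0.jpg 0\nimg_1.jpg 0\nimg_2.jpg 1\n")

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Returns an (image name, label) pair; a real dataset would load the image.
        return self.samples[idx]


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    dataset = ToyRetrievalDataset(root, download=True)
    print(len(dataset), dataset[0])
```

Because the class only relies on `__len__` and `__getitem__`, it should be usable with `torch.utils.data.DataLoader` when PyTorch is installed; the real datasets would return loaded images rather than file names.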