How can I download only the train and test split for full numbers using load_dataset()? #4101

Open
Nakkhatra opened this issue Apr 5, 2022 · 1 comment · May be fixed by #6832
Labels
enhancement New feature or request

Comments

@Nakkhatra

How can I download only the train and test split for full numbers using load_dataset()?

I do not need the extra split, and the download alone takes about 40 minutes in Colab. I am very short on time. Please help.

Nakkhatra added the enhancement label on Apr 5, 2022
@mariosasko
Collaborator

Hi! Can you please specify the full name of the dataset? IIRC, full_numbers is one of the configs of the svhn dataset, and generating it is slow because the data is stored in binary MATLAB files. Even if you request a specific split, datasets currently downloads all of them, but we plan to fix that soon so that only the requested split is downloaded.

If you are in a hurry, download the svhn script here, remove this code, and run:

from datasets import load_dataset
dset = load_dataset("path/to/your/local/script.py", "full_numbers")

And to make loading easier in Colab, you can create a dataset repo on the Hub and upload the script there. Or push the script to Google Drive and mount the drive in Colab.
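
Once the script loads, the returned splits can be limited to train and test with the `split` argument of `load_dataset`. Below is a minimal sketch, assuming a `datasets` version that still supports local loading scripts. The helper name `load_train_test` and its `loader` parameter are hypothetical, added only for illustration. As noted above, this selects which splits are returned; by itself it does not reduce what is downloaded.

```python
def load_train_test(path, config="full_numbers", loader=None):
    """Return only the train and test splits as a dict.

    `loader` is a hypothetical injection point; it defaults to
    datasets.load_dataset and exists so the helper can be tried
    without network access.
    """
    if loader is None:
        # Imported lazily so the module loads even without `datasets`.
        from datasets import load_dataset
        loader = load_dataset
    # Passing a list of split names returns a list of datasets
    # in the same order.
    train, test = loader(path, config, split=["train", "test"])
    return {"train": train, "test": test}
```

It could then be called as `load_train_test("path/to/your/local/script.py")`, with the path pointing at the edited script.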
