Consider hosting dataset on Huggingface & source.coop #15
Comments
I will try to add it at https://huggingface.co/allenai/satlas-pretrain but it may take some time due to the large size of the dataset.
Hi @favyen2, I see you got a couple of files up, which is great. Can I request that you prioritise the following data? I have been attempting to download it since the start of the week, and the download is still going.
For the explorer dataset, it took most of the week to download the tar and most of the weekend to untar it. On reviewing the labelled datasets:
The 3 small datasets could be uploaded as individual datasets. HF has a 40GB limit (TBC) per zip/tar, so these should be fine. This would be a much faster experience for people who only care about one of those.
The dataset is now available on Hugging Face. The hand-labeled datasets for individual tasks are updated regularly and we are still deciding how to release those on an ongoing basis.
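For anyone who only needs part of the dataset, here is a minimal sketch of a selective download using `snapshot_download` from the `huggingface_hub` package. It rests on several assumptions. The repo id comes from the URL posted earlier in this thread. `repo_type="model"` is a guess based on that URL having no `/datasets/` prefix. The file names and patterns are hypothetical, since the actual file layout of the repo is not described here. `select_files` and `download_subset` are illustrative helper names, not part of any library.

```python
from fnmatch import fnmatch


def select_files(filenames, patterns):
    """Return the filenames that match at least one glob-style pattern.

    Useful for previewing locally which files a pattern would pick up
    before starting a long download.
    """
    return [
        name for name in filenames
        if any(fnmatch(name, pattern) for pattern in patterns)
    ]


def download_subset(patterns, local_dir="satlas-pretrain"):
    """Download only the repo files whose paths match `patterns`."""
    # Imported lazily so select_files works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="allenai/satlas-pretrain",  # taken from the URL in this thread
        repo_type="model",  # assumption: the URL has no /datasets/ prefix
        allow_patterns=patterns,  # glob-style filters on file paths
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Hypothetical file names, used only to preview what a pattern selects.
    listing = ["README.md", "example_a.tar", "labels/example_b.tar"]
    print(select_files(listing, ["*.tar"]))
```

Calling `download_subset(["*.tar"])` would then fetch only the matching files into `local_dir`, which avoids pulling the full dataset when a single task's data is enough.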
I'm noting very slow download times for the dataset (my connection is fast):
I've experienced very rapid downloads from Hugging Face and suggest it as an additional location to host and distribute the dataset.
Additionally, https://beta.source.coop/ would be a relevant portal.