Lots of links are not working? #4

Open
yxchng opened this issue Oct 2, 2022 · 3 comments

Comments

@yxchng

yxchng commented Oct 2, 2022

16it [21:34, 19.91s/it]
worker  - success: 0.244 - failed to download: 0.752 - failed to resize: 0.003 - images per sec: 8 - count: 10000
total   - success: 0.241 - failed to download: 0.755 - failed to resize: 0.003 - images per sec: 124 - count: 160000
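
For a rough sense of scale, the fractions in the `total` line can be turned into approximate image counts. This is a minimal sketch, assuming the status line keeps the `name: value` layout shown above; `LOG_LINE` and `parse_stats` are illustrative names, not part of any tool, and the logged fractions are rounded, so the counts are estimates.

```python
import re

# The "total" status line copied from the log above.
LOG_LINE = (
    "total   - success: 0.241 - failed to download: 0.755 "
    "- failed to resize: 0.003 - images per sec: 124 - count: 160000"
)


def parse_stats(line):
    """Collect the 'name: value' pairs of a status line into a dict of floats."""
    pairs = re.findall(r"([a-z ]+): ([0-9.]+)", line)
    return {name.strip(): float(value) for name, value in pairs}


stats = parse_stats(LOG_LINE)

# Fractions are relative to "count", the number of URLs attempted so far.
downloaded = round(stats["success"] * stats["count"])
failed = round(stats["failed to download"] * stats["count"])

print(f"approx. downloaded: {downloaded} of {int(stats['count'])}")
print(f"approx. failed to download: {failed}")
```

On these numbers, only about a quarter of the attempted URLs produced an image.
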
@joliver1981

Yeah, the vast majority of the links don't appear to work anymore. I think you can download the dataset as TFRecords from Kaggle: search for "cc12m" on Kaggle Datasets. I wonder if anyone knows whether there is an updated weights file available for download that was trained on these images.

@yxchng
Author

yxchng commented Mar 20, 2023

@joliver1981 Have you tried downloading Kaggle's cc12m? How many images does it contain?

@Suhail-BW

Since the vast majority of the links here no longer work, I suggest the copy on Hugging Face Datasets, which offers an older version of cc12m that had fewer broken links. Consider using it as an alternative: https://huggingface.co/datasets/pixparse/cc12m-wds
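
One possible way to look at that copy without downloading every shard is to stream it with the `datasets` library. This is a sketch only: the `train` split name is an assumption, and `peek` and `stream_cc12m` are illustrative helper names. The sample fields are not assumed here, so the example just prints whatever keys each sample carries.

```python
from itertools import islice


def peek(samples, n=3):
    """Return the first n items of any iterable of samples."""
    return list(islice(samples, n))


def stream_cc12m(split="train"):
    """Open the Hugging Face copy of cc12m in streaming mode.

    The split name "train" is an assumption; check the dataset page.
    """
    # Imported lazily so peek() works without the library installed.
    from datasets import load_dataset  # pip install datasets

    # streaming=True yields samples on demand instead of fetching all shards first.
    return load_dataset("pixparse/cc12m-wds", split=split, streaming=True)


if __name__ == "__main__":
    # Needs network access; prints the field names of the first few samples.
    for sample in peek(stream_cc12m()):
        print(sorted(sample.keys()))
```
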
