Places365 #5347

Open
Tracked by #5336
pmeier opened this issue Feb 3, 2022 · 9 comments

Comments

@pmeier
Collaborator

pmeier commented Feb 3, 2022

cc @pmeier @bjuncek

@Amapocho

Amapocho commented Feb 5, 2022

Can I take this up?

@pmeier
Collaborator Author

pmeier commented Feb 5, 2022

Sure, go ahead.

@Amapocho

Amapocho commented Feb 5, 2022

The Places365-Challenge dataset is way too big for me to download just to compute the sha256 checksums. Is there any other way to find them?

@pmeier
Collaborator Author

pmeier commented Feb 5, 2022

I think I have the files on disk. Send the PR without the checksums. I'll fill them in.
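For reference, a checksum for a large archive that is already on disk can be computed without loading the whole file into memory. The following is only a minimal sketch using Python's standard `hashlib`. The function name `sha256_of_file` and the chunk size are assumptions made for illustration, not something taken from torchvision.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a multi-GB archive is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file` with the path of a downloaded archive returns the digest as a hex string, which can then be pasted into the dataset's resource definition.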

@Amapocho

Amapocho commented Feb 6, 2022

I'm sorry but I just realized I will need to download the entire dataset to build the data pipe and I cannot do that for such a huge dataset. Please do unassign this issue, I will look into the other datasets that I will be able to download and try to go forward with them.

Regret any inconvenience caused.

Should I make a PR with resources that I have updated or should I just let it be?

@pmeier
Collaborator Author

pmeier commented Feb 7, 2022

> I'm sorry but I just realized I will need to download the entire dataset to build the data pipe and I cannot do that for such a huge dataset. Please do unassign this issue, I will look into the other datasets that I will be able to download and try to go forward with them.
>
> Regret any inconvenience caused.

No worries, I could have told you before you got started, so this is on me. Still, before I unassign you, would you be willing to implement this for the small variant of the dataset? This limits the download size for all files to about 30GB. IIRC, they have exactly the same structure as their larger "sisters". So the only thing I would have to do after your PR is check whether the implementation still works with the other files.

> Should I make a PR with resources that I have updated or should I just let it be?

Depends. Do you have more than just the skeleton? If yes, go ahead and send the PR. Otherwise I don't think it will be of much help for someone taking over.

@Amapocho

Amapocho commented Feb 7, 2022

I only have 25GB on the drive which has PyTorch so I won't be able to run it on my device sadly. I have added all the links for the resource download, image type, and all the other links so I'll add a PR as it'll help start off whoever picks up the issue next.

In the meanwhile, I have looked into the RenderedSST2 and I would be able to take that up. Is there any quirk to that dataset I should be wary of before commenting on the issue?

@pmeier
Collaborator Author

pmeier commented Feb 7, 2022

> In the meanwhile, I have looked into the RenderedSST2 and I would be able to take that up. Is there any quirk to that dataset I should be wary of before commenting on the issue?

No, I don't think so. Go ahead, I can assign you there.

@pmeier
Collaborator Author

pmeier commented Feb 7, 2022

For anyone who wants to pick this up: you can find the skeleton for the implementation in #5383.
