Dataset Size Clarifications #13

Open

tyler-hayes opened this issue Feb 27, 2023 · 0 comments

tyler-hayes commented Feb 27, 2023

Hello! Thank you for the interesting work! I have a few clarification questions regarding dataset sizes:

1. The MetaShift paper indicates that the original GQA dataset contains 113,018 distinct images. However, when I download GQA using `wget -c https://nlp.stanford.edu/data/gqa/images.zip` and extract the files, there are 148,854 images. Is this the correct GQA file?
2. After parsing the MetaShift pickle file `full-candidate-subsets.pkl`, I find only 72,596 unique images. Is this the correct size for the final MetaShift dataset?
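
For reference, counts like the two above could be reproduced with something along the lines of the sketch below. It makes two assumptions that should be checked against the actual files: that the pickle deserializes to a dict mapping each subset name to a collection of image IDs, and that the extracted GQA images are `.jpg` files somewhere under a single directory. The function names and the paths in the usage comment are illustrative only and do not come from the MetaShift repository.

```python
import pickle
from pathlib import Path


def load_subsets(pkl_path):
    """Load the pickle file; assumed to hold a dict of subset name -> image IDs."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)


def count_unique_images(subsets):
    """Count distinct image IDs across all subsets.

    An image that appears in several subsets is counted once.
    """
    unique_ids = set()
    for image_ids in subsets.values():
        unique_ids.update(image_ids)
    return len(unique_ids)


def count_image_files(image_dir, suffix=".jpg"):
    """Count files with the given suffix anywhere under image_dir."""
    return sum(1 for p in Path(image_dir).rglob(f"*{suffix}") if p.is_file())


# Hypothetical usage (paths are placeholders, adjust to the local layout):
#   subsets = load_subsets("full-candidate-subsets.pkl")
#   print(count_unique_images(subsets))
#   print(count_image_files("images"))
```

If the pickle turns out to have a different structure (for example nested dicts), the loop in `count_unique_images` would need to be adapted accordingly.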

Thank you in advance for any clarifications you can provide!
