Dataset Size Clarifications #13

Open
tyler-hayes opened this issue Feb 27, 2023 · 0 comments

Hello! Thank you for the interesting work! I have a few clarification questions regarding dataset sizes:

  1. The MetaShift paper indicates that the original GQA dataset contains 113,018 distinct images. However, when I download GQA using wget -c https://nlp.stanford.edu/data/gqa/images.zip and extract the files, there are 148,854 images. Is this the correct GQA file?
  2. After parsing the MetaShift pickle file full-candidate-subsets.pkl, I find only 72,596 unique images. Is this the correct size for the final MetaShift dataset?
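
For reference, here is a minimal sketch of how counts like the two above could be reproduced. It rests on assumptions that are not confirmed by the repository: that the extracted GQA images sit under a single directory, and that the pickle holds a mapping from subset name to a collection of image IDs. The function names are hypothetical helpers, not part of MetaShift or GQA.

```python
import pickle
from pathlib import Path


def count_images(image_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Count files under image_dir (recursively) whose suffix looks like an image."""
    return sum(
        1
        for path in Path(image_dir).rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )


def load_subsets(pickle_path):
    """Load the pickled object. Only unpickle files from a source you trust."""
    with open(pickle_path, "rb") as handle:
        return pickle.load(handle)


def count_unique_ids(subsets):
    """Count distinct image IDs across all subsets.

    Assumption: `subsets` maps a subset name to an iterable of image IDs.
    An ID that appears in several subsets is counted once.
    """
    unique_ids = set()
    for image_ids in subsets.values():
        unique_ids.update(image_ids)
    return len(unique_ids)
```

With the extracted archive in a directory such as images/ (a placeholder path), count_images("images") would give the GQA file count, and count_unique_ids(load_subsets("full-candidate-subsets.pkl")) would give the number of distinct images referenced by the pickle, provided the file really has the assumed structure.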

Thank you in advance for any clarifications you can provide!
