
Dataset split ids? #7

Open
cjrd opened this issue Aug 17, 2020 · 4 comments

Comments

cjrd commented Aug 17, 2020

Would it be possible to provide the dataset split ids you used for the paper, i.e. train/val/test?

akolesnikoff (Collaborator) commented:

Splits are uniquely defined in our data folder through the tfds subsplit API: https://www.tensorflow.org/datasets/splits.

The easiest solution would be to use our code to load the data (which will produce the exact splits from the paper).
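For readers unfamiliar with the tfds subsplit API: percent slices map deterministically to example index ranges, which is why the splits are uniquely defined by the split strings alone. A minimal sketch of that idea (a hypothetical helper using simple floor rounding, not tfds's own implementation; tfds documents its exact rounding behaviour separately):

```python
def percent_slice(num_examples, lo_pct, hi_pct):
    """Map a tfds-style percent slice (e.g. train[80%:90%]) to absolute
    index bounds. Floor rounding is used here purely as an illustration;
    the real tfds API has its own rounding options."""
    lo = num_examples * lo_pct // 100
    hi = num_examples * hi_pct // 100
    return lo, hi

# e.g. a 90/10 train/val carve-out of a 50000-example train set
train_bounds = percent_slice(50000, 0, 90)    # (0, 45000)
val_bounds = percent_slice(50000, 90, 100)    # (45000, 50000)
```

Because the bounds depend only on the dataset size and the split string, anyone loading the same tfds version reproduces the same examples.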


cjrd commented Aug 26, 2020

Thanks for your response; I've been able to load the data and output the train/val/test splits.
Is there a particular way to output the train splits for the 1000-example training setup?
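For anyone tracing these splits by hand, tfds split strings compose named splits with absolute or percent slices, e.g. something like `train[:800]+val[:200]` for a 1000-example subset (the exact string VTAB uses is defined in its code, so treat that one as illustrative). A hypothetical parser sketch for absolute-index split strings, not tfds's own parser:

```python
import re

def parse_split(spec):
    """Parse a tfds-style split string like 'train[:800]+val[:200]'
    into (split_name, start, stop) tuples. Handles only absolute
    indices; a sketch, not the real tfds grammar."""
    parts = []
    for piece in spec.split("+"):
        m = re.fullmatch(r"(\w+)(?:\[(\d*):(\d*)\])?", piece.strip())
        if not m:
            raise ValueError(f"bad split piece: {piece!r}")
        name, lo, hi = m.groups()
        parts.append((name, int(lo) if lo else None, int(hi) if hi else None))
    return parts

# e.g. parse_split("train[:800]+val[:200]")
# -> [("train", None, 800), ("val", None, 200)]
```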


frkl commented Nov 9, 2020

Dear VTAB team,

I’m Xiao Lin from SRI. We’ve been working on cross-domain few-shot learning solutions and find your VTAB-1000 benchmark very exciting. It’s the large-scale, fixed-split benchmark we need, compared to the existing small 5-way k-shot problems and the random-way, random-shot Meta-Dataset, so we hope to try it out.

But I ran into some difficulties downloading the dataset. After installing the pip requirements and trying to run the dataset preparation scripts, TF1.5 tells me that "the version of dataset you want to download requires TF2". When I try installing TF2 instead of TF1.5, another error pops up:
"Exporting/importing meta graphs is not supported when eager execution is enabled. No graph exists when eager execution is enabled", which looks like a code-compatibility issue. I see that you are still actively making commits to add TF2 support, so keep up the good work.

On the other hand, I mainly use PyTorch and I’m not very familiar with TensorFlow. I think a good common ground would be to share the images, image names, and your custom labels in addition to the benchmarking code. Your train/val/test protocol sounds very clear, so people would be able to reproduce it across platforms. The exceptions are the Res50v2 model architecture/weights and the fine-tuning procedure, but both of these are actively being improved in your BigTransfer work. In case there’s a follow-up challenge, would it be possible for the benchmark side to run Docker containers for some cross-platform love?

Best,
Xiao Lin

dukleryoni commented:

Hi,

Would it be possible to upload a split_ids file giving the ids of the original dataset samples?
