This repository has been archived by the owner on Mar 15, 2024. It is now read-only.
Hi, this is wonderful work.
I am trying to replicate the paper's fine-tuning results. As you know, it often takes a long time to tune hyper-parameters to get well-trained transformer-based models, so it would be really helpful if you could share all the settings.
I saw some settings in #105 and #45, but they don't cover the other model sizes and datasets, such as DeiT-Ti and iNat.
Could you please summarize the settings for every model size on each of the fine-tuning datasets mentioned in the paper?
A table would make this very clear, for example:

| model type | pretrained dataset | finetuned dataset | lr | bs | wd | sched | epochs | warmup | ... |
|---|---|---|---|---|---|---|---|---|---|
| ViT-B |  |  |  |  |  |  |  |  |  |
| ViT-L |  |  |  |  |  |  |  |  |  |
| ... |  |  |  |  |  |  |  |  |  |
| DeiT-Ti |  |  |  |  |  |  |  |  |  |
| DeiT-S |  |  |  |  |  |  |  |  |  |
| ... |  |  |  |  |  |  |  |  |  |
I really appreciate it.
Hi @lostsword,
Thank you for your suggestion.
As soon as I have some time, I'll complete this table and add it to the README.
I'll keep you informed.
Best,
Hugo