Replies: 8 comments
-
Hello, is this issue helpful: #18? TL;DR: use …
-
Hi, when using a pretrained CLIP model, should we take care to unfreeze only some particular layers? Should we fine-tune under …
-
That depends on what your aim is: what exactly are you fine-tuning on? We have another repository for fine-tuning pre-trained CLIP on datasets like ImageNet/CIFAR: https://github.com/mlfoundations/wise-ft, with an associated preprint: https://arxiv.org/abs/2109.01903.
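For context, the core idea in that preprint (WiSE-FT) is weight-space ensembling: linearly interpolating between the zero-shot and fine-tuned weights. A minimal sketch of the interpolation, assuming both state dicts come from the same architecture (checkpoint paths in the usage comment are placeholders):

```python
import torch

def wise_ft(zeroshot_sd, finetuned_sd, alpha=0.5):
    """Linearly interpolate two state dicts in weight space (WiSE-FT).

    alpha=0.0 recovers the zero-shot model, alpha=1.0 the fine-tuned one;
    intermediate values trade fine-tuning accuracy against robustness,
    per the preprint linked above.
    """
    assert zeroshot_sd.keys() == finetuned_sd.keys()
    return {
        k: (1.0 - alpha) * zeroshot_sd[k] + alpha * finetuned_sd[k]
        for k in zeroshot_sd
    }

# Hypothetical usage (paths and dict keys are placeholders):
# zs = torch.load("zeroshot.pt")["state_dict"]
# ft = torch.load("finetuned.pt")["state_dict"]
# model.load_state_dict(wise_ft(zs, ft, alpha=0.5))
```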
-
I would like to fine-tune CLIP on specific datasets (for example animals, objects, city monuments, ...) in order to get better encodings for images and captions. I'm not adding any new layers, just keeping the original CLIP architecture. In this case, do you think it is a good idea to unfreeze all layers (…)?
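For reference, freezing or unfreezing layers in PyTorch comes down to toggling `requires_grad`. A minimal sketch, assuming `model` is an OpenAI-style CLIP model whose image tower is exposed as `model.visual` (attribute names vary across implementations):

```python
# Assumes an OpenAI-style CLIP model with a `model.visual` image tower.
def freeze_all_but_image_tower(model):
    for p in model.parameters():
        p.requires_grad = False   # freeze everything
    for p in model.visual.parameters():
        p.requires_grad = True    # then unfreeze the image tower only

# Full fine-tuning (the case asked about) leaves everything trainable:
def unfreeze_all(model):
    for p in model.parameters():
        p.requires_grad = True

# Either way, pass only trainable parameters to the optimizer, e.g.:
# optimizer = torch.optim.AdamW(
#     [p for p in model.parameters() if p.requires_grad], lr=1e-5)
```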
-
Unfortunately we haven't tried that at this time, so we don't have a good answer for you.
-
Hello, I would like to fine-tune CLIP on my own specific dataset (approx. 50k image-text pairs). I used the provided ViT-B/32 checkpoint as the initial model, but accuracy starts at 1% and after 32 epochs reaches only around 30%. (I tried various weight decay and LR combinations; the best of them is weight decay=0.001 and LR=5e-4.) Have you tried fine-tuning CLIP on a small specific dataset, and if so, how was the performance? @milmin
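For concreteness, a sketch of a training setup matching those numbers, using AdamW and the symmetric contrastive loss from the CLIP paper (the function name and the assumption that `model` is already loaded are illustrative, not this repo's API):

```python
import torch
import torch.nn.functional as F

def clip_loss(image_features, text_features, logit_scale):
    """Symmetric contrastive (InfoNCE) loss, as in the CLIP paper."""
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)
    logits = logit_scale * image_features @ text_features.t()
    labels = torch.arange(logits.size(0), device=logits.device)
    return (F.cross_entropy(logits, labels)
            + F.cross_entropy(logits.t(), labels)) / 2

# The best combination reported above; `model` is assumed to be a
# loaded CLIP model.
# optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4,
#                               weight_decay=1e-3)
```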
-
I have not tried this, but those hyperparameters seem like they should be good. Is there any reason to use our checkpoints and not OpenAI's via `--openai-pretrained`? To clarify, is the 1% accuracy on your new task, or zero-shot performance on ImageNet?
-
Actually, I set the parameters of --openai-pretrained and --model to True and ViT-B/32, respectively. This way, I believe, I use the official ViT-B/32 weights (is that true?), which is why I described it that way.
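For what it's worth, one way to sanity-check against the official weights is to load them directly with OpenAI's `clip` package, which downloads and returns the released checkpoint:

```python
import torch
import clip  # OpenAI's package: https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
# Downloads and loads the official ViT-B/32 weights.
model, preprocess = clip.load("ViT-B/32", device=device)
```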
-
a. How did the fine-tuning turn out? Could you provide a set of fine-tuned parameters?
b. For fine-tuning, what suggestions do you have for parameter settings or training techniques?