
fix coca training #710

Closed · wants to merge 1 commit
Conversation

@gpucce (Contributor) commented Oct 26, 2023

@rwightman I was too hasty with the embed_cls thing; I think the model would not train as it is now.
This should make it work with the tokenizer and train OK; fixes #715.

@rom1504 (Collaborator) commented Oct 28, 2023 via email

@gpucce (Contributor, Author) commented Oct 28, 2023

I think coca is in the training tests already, but this does not produce an error: the tokens are not shifted by one, so the model learns to copy the current token instead of predicting the next one.
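
For illustration, a minimal PyTorch sketch of the shift being described here; the function and variable names are hypothetical, not open_clip's actual code:

```python
import torch
import torch.nn.functional as F

def caption_loss(logits: torch.Tensor, text: torch.Tensor, pad_id: int = 0) -> torch.Tensor:
    """Next-token loss sketch: logits [B, L, V] from the decoder, text [B, L] token ids."""
    # Shift so that position i is supervised with token i+1:
    # drop the last logit (nothing follows it) and the first label token.
    shifted_logits = logits[:, :-1, :]
    shifted_labels = text[:, 1:]
    return F.cross_entropy(
        shifted_logits.reshape(-1, shifted_logits.size(-1)),
        shifted_labels.reshape(-1),
        ignore_index=pad_id,
    )

# Without the shift (labels == input tokens, positions aligned), the objective becomes
# "output the token you were just given", which the model can satisfy by copying.
```

A bug like this raises no error, so shape-level training tests still pass while the loss optimizes the wrong objective.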

@rwightman (Collaborator) commented:

So, I've been thinking about this one. I really don't like the is_training flag; it's not done this way elsewhere. The label shift is standard, but why do we need to truncate the text encoder output like that only for training?

@gpucce (Contributor, Author) commented Nov 1, 2023

So, I've been thinking about this one. I really don't like the is_training flag; it's not done this way elsewhere. The label shift is standard, but why do we need to truncate the text encoder output like that only for training?

@rwightman

After shortening the labels, one needs to drop the last token from the encoder output, otherwise there is a length mismatch; besides, the last token does not have a next token to use as a label. In generation, however, one wants to keep the last token too, otherwise decoding cannot move forward.

One could also do something like this: self.encode_text(text[:, :-1]) (sketched below).

About is_training, maybe we could also go back to embed_cls, though maybe that wasn't the best option either.
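
As a rough sketch of the two placements of the truncation being weighed in this comment (hypothetical helper names; not the actual open_clip CoCa forward):

```python
import torch

def forward_with_flag(model, text: torch.Tensor, is_training: bool):
    # Option in this PR: truncate inside the forward, only during training,
    # so generation still sees the full sequence.
    text_in = text[:, :-1] if is_training else text
    return model.encode_text(text_in)  # hypothetical encoder call

def forward_plain(model, text: torch.Tensor):
    # The 'normal' approach: the caller truncates before encoding,
    # i.e. self.encode_text(text[:, :-1]); generation code passes the full text.
    return model.encode_text(text[:, :-1])
```

The trade-off discussed in the following comments is whether truncating before encode_text also changes the sequence the contrastive latent is computed from.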

@rwightman (Collaborator) commented:

Yeah, I don't like embed_cls either. Truncating the text input first, outside of the forward, a la self.encode_text(text[:, :-1]), is the 'normal' approach, but I wasn't sure whether that would impact the contrastive latent.

@gpucce (Contributor, Author) commented Nov 1, 2023

The reason for this is that it was meant to keep the behaviour identical to how it was before (assuming I did it right): since, compared to before, the tokenizer now has a hidden text[:, :-1], this way the change would not show up in the contrastive latent but would still be there in the generative logits.

However, it probably makes very little difference and the 'normal' way is better.

@rwightman (Collaborator) commented:

merged through #877 with minor changes

@rwightman closed this on May 9, 2024

Successfully merging this pull request may close these issues.

coca training doesn't work