fix coca training #710
Conversation
Would be good to add coca to the training test. That way we'll catch any future issues.
Did this current issue make training fail completely, or just make it wrong? If it made it fail completely, all the test has to do is run the training, which is what the current test does; we'd just need to include coca there.
I think coca is in the training tests already, but this does not error. The tokens are not shifted by one, so the model learns to copy the current token instead of predicting the next one.
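(A minimal sketch of the difference being described, not the actual open_clip loss code; the names and shapes are assumptions.)

```python
import torch
import torch.nn.functional as F

def caption_loss(logits: torch.Tensor, tokens: torch.Tensor, shift: bool = True) -> torch.Tensor:
    # logits: [batch, seq_len, vocab_size] decoder outputs
    # tokens: [batch, seq_len] input token ids (long)
    if shift:
        # Next-token objective: position t is scored against token t + 1.
        logits = logits[:, :-1]
        labels = tokens[:, 1:]
    else:
        # Unshifted: position t is scored against token t itself,
        # so the model only learns to copy the current token.
        labels = tokens
    return F.cross_entropy(logits.reshape(-1, logits.shape[-1]), labels.reshape(-1))
```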
So, I've been thinking about this one, and I really don't like the `is_training` flag; it's not done this way elsewhere. The label shift is standard, but why do we need to truncate the text encoder output like that only for training?
After shortening the labels one needs to drop the last token in the encoder, otherwise there will be a length mismatch, and also the last token does have a
One could also do it something like this
About
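(A rough sketch of the length bookkeeping being described here, with hypothetical names rather than the actual `coca_model.py` code: once the labels are shortened by one, the decoder output has to lose its last position too, or the two no longer line up.)

```python
# tokens: [batch, seq_len] token ids; text_decoder / token_embs are hypothetical names
labels = tokens[:, 1:]              # drop the first token  -> [batch, seq_len - 1]
logits = text_decoder(token_embs)   # [batch, seq_len, vocab_size]
logits = logits[:, :-1]             # drop the last position -> [batch, seq_len - 1, vocab_size]
assert logits.shape[1] == labels.shape[1]   # lengths now match for the caption loss
```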
Yeah, I don't like embed_cls either. Truncating the text input first, outside of the forward, à la
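(A sketch of what truncating outside the forward could look like in the training step; the model call here is hypothetical, not the real CoCa interface.)

```python
import torch.nn.functional as F

# text: [batch, seq_len] token ids
text_input = text[:, :-1]    # feed everything except the last token to the model
labels = text[:, 1:]         # targets: everything except the first token

logits = model(images, text_input)   # hypothetical call returning [B, L-1, vocab] caption logits
loss = F.cross_entropy(logits.permute(0, 2, 1), labels)  # [B, L-1, V] -> [B, V, L-1]
```

Done this way, the model itself needs no `is_training` flag or internal truncation.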
The reason for this is that it was meant to keep it identical to how it was before (assuming I did it right), and since compared to before the tokenizer has a hidden
However, it probably makes very little difference and the 'normal' way is better.
Merged through #877 with minor changes.
@rwightman I was too hasty with the `embed_cls` thing, I think now the model would not train. This should make it work with the tokenizer and train ok, fix #715