DPR pooler weights not loading correctly #19111
Comments
@ArthurZucker could you take a look here (happy to answer questions about the model) |
on it! |
Hi @ArthurZucker , do you have any updates on this by chance? I'm getting the same issue, but the results I get when benchmarking do not suggest random initialization of the weights. |
Hey, it seems that the issue comes from the following line, where it is hardcoded that |
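Hey! So as mentioned here in a previous PR (95eaf44), the optional pooling layer was removed as no checkpoints use it. My first question would then be: do you need to have the BERTPoolerLayer? It is a bit confusing indeed that the pooling output does not come from the BertPoolerLayer. Have a look at #14486, I think it explains pretty well what's going on here.
We have two ways to go about this:
1. We add an argument in the config of DPR, and take care of updating the online config to have no breaking changes.
2. If you don't need it, then we just add a warning/update the online weights by doing from_pretrained then push_to_hub, and the checkpoints will then not include the pooler weights 😄
|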
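To make the distinction in this thread concrete, here is a minimal sketch of how checkpoint parameter names can be compared against the names a model actually defines. It assumes the message in this issue is about checkpoint weights that the model no longer has a layer for. The helper `compare_keys` and all parameter names below are hypothetical and are not taken from the DPR checkpoints or from the transformers code base.

```python
from typing import Dict, Iterable, List


def compare_keys(checkpoint_keys: Iterable[str], model_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Split parameter names into those only in the checkpoint and those only in the model."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        # Present in the checkpoint but absent from the model: nothing to load them into.
        "unused_in_checkpoint": sorted(ckpt - model),
        # Expected by the model but absent from the checkpoint: these would be newly initialized.
        "missing_from_checkpoint": sorted(model - ckpt),
    }


# Hypothetical parameter names, for illustration only.
checkpoint_keys = [
    "encoder.layer.0.weight",
    "encoder.layer.0.bias",
    "pooler.dense.weight",
    "pooler.dense.bias",
]
model_keys = ["encoder.layer.0.weight", "encoder.layer.0.bias"]

report = compare_keys(checkpoint_keys, model_keys)
print(report)
```

If only the first list is non-empty, every layer the model defines still receives its trained weights, which would be consistent with benchmark results that do not look randomly initialized.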
Hi Arthur, I think I understand. Thanks for getting back so quickly! Its
performance suggests that the model is loading correctly so that must be
it! Thanks!
On Thu, Oct 20, 2022 at 3:42 AM Arthur ***@***.***> wrote:
Hey! So as mentioned here in a previous PR (95eaf44),
the optional pooling layer was removed as no checkpoints use it.
My first question would then be: do you need to have the BERTPoolerLayer?
It is a bit confusing indeed that the pooling output does not come from the
BertPoolerLayer. Have a look at #14486, I think it
explains pretty well what's going on here.
We have two ways to go about this:
1. We add an argument in the config of DPR, and take care of
updating the online config to have no breaking changes.
2. If you don't need it, then we just add a warning/update the online
weights by doing from_pretrained then push_to_hub, and the checkpoints
will then not include the pooler weights 😄
|
System Info
tested on multiple versions
transformers version: 4.12.3
another environment
transformers version: 4.16.2
Who can help?
@patrickvonplaten @lhoestq
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
results in the following message
Expected behavior
Model loads successfully without re-initializing weights