
How to finetune with ANCE? #10

Open
Victoriaheiheihei opened this issue Feb 7, 2023 · 4 comments

@Victoriaheiheihei

Hello,
Is the code published in this repo for fine-tuning the model with DPR? Where can I find the code for fine-tuning with ANCE?
There's another question that confuses me: is the code in this repo used to distill the retriever with a teacher model, with the result corresponding to (0.416/0.709/0.927/0.988)?

@staoxiao
Owner

staoxiao commented Feb 7, 2023

Thanks for your interest in RetroMAE!

We fine-tune the model with hard negatives by changing the argument neg_file (note that we don't change the hard negatives dynamically during training the way the original ANCE does). You can use our provided hard_negs.txt to reproduce the results, or generate new hard negatives following our commands.
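To make the neg_file mechanism concrete, here is a minimal sketch of how a static hard-negatives file could be loaded and sampled from when building training batches. The file layout (`qid<TAB>neg_id1,neg_id2,...` per line) and both function names are assumptions for illustration; check the hard_negs.txt shipped with the repo for the exact format.

```python
import random


def load_hard_negatives(path):
    """Parse a hard-negatives file into {query_id: [negative_passage_ids]}.

    Assumed layout (hypothetical): one query per line, 'qid<TAB>id1,id2,...'.
    Because the file is read once up front, the negatives stay fixed for the
    whole run, unlike the original ANCE, which refreshes them during training.
    """
    negs = {}
    with open(path) as f:
        for line in f:
            qid, ids = line.rstrip("\n").split("\t")
            negs[qid] = ids.split(",")
    return negs


def sample_negatives(negs, qid, k, rng=random):
    """Pick up to k hard negatives for a query; the training loop would pair
    these with the query's positive passage when assembling each batch."""
    pool = negs[qid]
    return rng.sample(pool, min(k, len(pool)))
```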

The cross-encoder example will fine-tune a teacher model, whose prediction scores will be used in distillation. To distill the retriever, you need to generate the teacher_score_files by cross-encoder and add the argument to the training command in the bi_encoder example.
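For intuition on what the teacher_score_files feed into, here is a sketch of a standard distillation objective for a retriever: the KL divergence between the teacher's (cross-encoder) score distribution and the student's (bi-encoder) score distribution over a query's candidate passages. This is a generic illustration in plain Python, not the repo's actual loss implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def distill_kl_loss(student_scores, teacher_scores):
    """KL(teacher || student) over one query's candidate passages.

    student_scores: bi-encoder similarity scores (the retriever being trained)
    teacher_scores: cross-encoder prediction scores for the same passages
    The loss is 0 when the two distributions match and grows as they diverge.
    """
    p = softmax(teacher_scores)  # target distribution from the teacher
    q = softmax(student_scores)  # current distribution from the student
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

In practice the scores would come from batched model forward passes and the loss would be minimized with a framework like PyTorch, but the quantity being optimized is the same.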

@Victoriaheiheihei
Author

Thank you for your reply!
I tried to fine-tune the ANCE model (using hard_negs.txt on Shitao/RetroMAE_MSMARCO), but the results are lower than the reported ones. Did you fine-tune the ANCE model on Shitao/RetroMAE_MSMARCO, or on the model fine-tuned by DPR? I trained with batch_size 32 and 8 epochs as mentioned in the paper (but the batch_size and epochs in the script are 16 and 4; I'm not sure which set of parameters was used for fine-tuning).

@staoxiao
Owner

staoxiao commented Feb 7, 2023

For ANCE, we fine-tune the Shitao/RetroMAE_MSMARCO model. Please use the hyper-parameters in our script, which we found to work better.

@Victoriaheiheihei
Author

Thank you. I will try again.
