Where is "[LSEP]" located? #3
Comments
Hi, I used the '[unused5]' token to represent '[LSEP]' in the BERT dictionary, hence the id of '[LSEP]' should be 5 in the model. This modification doesn't require any change to the model architecture; I directly modified the data files to construct the training data. However, the position of the '[LSEP]' token does need to be distinguished during the inference stage; that part of the code is in the file "src/models/predictor.py".
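The idea above can be sketched as follows. This is a minimal, hypothetical illustration (not the repo's actual data-construction code): it assumes the '[unused5]' slot in the BERT vocabulary has id 5, as the author states, and shows how sentence-level token-id lists could be joined with that id when building training data.

```python
# Hypothetical sketch: repurpose a reserved BERT token as a sentence
# separator without touching the model architecture.
LSEP_ID = 5  # id of '[unused5]' in the BERT vocab, per the author


def insert_lsep(sentence_ids, lsep_id=LSEP_ID):
    """Join tokenized sentences with the [LSEP] id when constructing data.

    sentence_ids: list of lists of token ids, one list per sentence.
    Returns a single flat list with lsep_id between sentences.
    """
    out = []
    for i, sent in enumerate(sentence_ids):
        if i > 0:
            out.append(lsep_id)  # separator goes between sentences only
        out.extend(sent)
    return out
```

Since the separator is an ordinary vocabulary id, the model consumes it like any other token; only the inference code (in "src/models/predictor.py") needs to treat its positions specially.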
Thank you sincerely for your response!
Hi, thanks for asking! My implementation is pretty personal and perhaps not elegant enough. For the NCLS+MS scenario, since Presumm uses the sharded_compute_loss trick and it's hard to modify, I directly concatenate the two outputs of the monolingual decoder and the cross-lingual decoder into one tensor. That code is at L291 of model_builder.py (producing the monolingual output) and L271 of trainer.py (concatenating the outputs). For the MCLAS loss, the output is S^A + S^B and the ground-truth label is also S^A + S^B, so computing the standard NLL loss between them is just fine. Hope this helps!
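The concatenation trick described above can be sketched like this. This is a hypothetical simplification (the function name `mclas_loss`, the tensor shapes, and the padding id are assumptions, not the repo's code): the two decoder outputs over S^A and S^B are concatenated along the length axis so that one tensor, and one NLL loss, covers S^A + S^B, which is compatible with a sharded loss computation that expects a single output tensor.

```python
import torch
import torch.nn.functional as F


def mclas_loss(mono_out, cross_out, tgt_a, tgt_b, pad_id=0):
    """Hypothetical sketch of the MCLAS loss over S^A + S^B.

    mono_out:  (batch, len_A, vocab) logits from the monolingual decoder
    cross_out: (batch, len_B, vocab) logits from the cross-lingual decoder
    tgt_a, tgt_b: gold token ids for S^A and S^B, (batch, len_A/len_B)
    """
    # Concatenate along the time axis: one tensor covering S^A + S^B.
    output = torch.cat([mono_out, cross_out], dim=1)   # (B, len_A+len_B, V)
    target = torch.cat([tgt_a, tgt_b], dim=1)          # (B, len_A+len_B)

    # Standard NLL loss between the joint output and the joint label;
    # nll_loss expects class scores in dim 1, hence the transpose.
    log_probs = F.log_softmax(output, dim=-1)
    return F.nll_loss(log_probs.transpose(1, 2), target, ignore_index=pad_id)
```

Because the labels are concatenated in the same order as the outputs, no per-decoder bookkeeping is needed inside the loss itself.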
When running the "Model Evaluation" command, my generated result only contains "