Reproducing results for IWSLT En-De #6
Comments
Hi, thank you for your interest in the paper. There are a few possible reasons.

For correct experiments, I expect to achieve performance at least 1 BLEU higher than the Transformer baseline when compared with a consistent BLEU implementation. I suggest you simply reimplement the model part of this codebase in your own training and evaluation pipeline with the latest SacreBLEU implementation for your work. If you observe loss divergence or significantly lower performance, then there is likely a bug in the code or an incorrect setup. Hope this helps. Sorry for the inconvenience.
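For anyone following this thread, here is a minimal sketch of the kind of consistent SacreBLEU evaluation suggested above, assuming a stock fairseq setup. The data-bin path, checkpoint name, and output file names are placeholders rather than the exact ones used in this repository, and the generated text may still need detokenization before scoring:

```bash
# Generate translations with a trained checkpoint (paths are placeholders).
fairseq-generate data-bin/iwslt14.tokenized.de-en \
    --path checkpoints/checkpoint_best.pt \
    --beam 5 --remove-bpe --batch-size 128 > gen.out

# Pull out hypotheses and references in sentence order.
grep ^H gen.out | LC_ALL=C sort -V | cut -f3- > gen.out.sys
grep ^T gen.out | LC_ALL=C sort -V | cut -f2- > gen.out.ref

# Score with SacreBLEU so the number is comparable across papers.
sacrebleu gen.out.ref -i gen.out.sys -m bleu
```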
Hi @nxphi47, thank you for your response. I am running the code in this repo and following the instructions provided in the README file. The only modifications I have made are to fix the places where the code breaks, but I could not reproduce the results for IWSLT En-De. Let me recheck the settings and see if the above-mentioned points are taken care of; I will get back to you with the results. Thanks again for your suggestions, and I appreciate the quick response!
Hi @nxphi47,

I exported the model provided in this repository and ran it with fairseq v0.12.3. I used the model

For training, I used a batch size of 4096 tokens and ran the training for 61,000 steps with the other parameters as mentioned in the README file, keeping the parameters the same between the Transformer and the Tree Transformer. I also tried five different seeds to ensure the random initialization was not an issue.

Evaluating the best checkpoint resulted in a BLEU score of

I am using the config mentioned in the README file, and I am not sure whether I am missing any other configuration you used for the results in the paper. Any suggestions or input will be appreciated. Thank you!
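For reference, here is a hedged sketch of a training command matching the setup described above (4096-token batches, 61,000 updates), written against stock fairseq. The data-bin path, save directory, and `--arch` value follow the standard fairseq IWSLT recipe and are assumptions, not necessarily the architecture name or hyperparameters used by this repository:

```bash
# Baseline Transformer training on IWSLT-style binarized data (placeholder paths/arch).
fairseq-train data-bin/iwslt14.tokenized.de-en \
    --arch transformer_iwslt_de_en --share-decoder-input-output-embed \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr 5e-4 --lr-scheduler inverse_sqrt --warmup-updates 4000 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --dropout 0.3 --weight-decay 0.0001 \
    --max-tokens 4096 --max-update 61000 \
    --seed 1 --save-dir checkpoints/transformer_baseline
```

Rerunning the same command with a different `--seed` (and, for the tree model, the repository's own `--arch` plus any extra flags from its README) keeps the comparison between the two models consistent.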
Hi,
I am trying to reproduce the results on IWSLT En-De. I followed the instructions in the README file but was not able to achieve the BLEU score reported in the paper. To run the code, I made some fixes:
Changes: https://github.com/neerajgangwar/tree_transformer/tree/fixes
Diff: #5
It would be great if you could help me reproduce the results mentioned in the paper.
Thank you!