Use right range and threshold for showing "bad" words/sentences #370
Conversation
If these are the correct thresholds, we can also lose the "these thresholds are just examples" comments.
I just want to confirm the threshold with @mfomicheva @abarbosa94 @felipesantosk once more; I can remove the comment after their confirmation. Thanks for pointing this out 👍🏾
I responded on Slack.
Just documenting what @mfomicheva shared:
- For the supervised models that were fitted on annotated data (En-Es, En-Cs, and En-Et language pairs), you should use the threshold that corresponds to the log of 0.5, which is around -0.6931 (here).
- For the unsupervised case, where the returned value is just the average log-prob coming directly from the MT model, I think you should still start with the same threshold and experiment further with it.
- A score in the range [-0.6931, 0] means better quality.

I will modify the PR to reflect these changes.
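For reference, a minimal sketch of how this threshold could be applied when flagging "bad" words/sentences. The function and constant names below are illustrative assumptions, not the project's actual API:

```python
import math

# Threshold discussed above: log(0.5) ≈ -0.6931.
# QE scores are log-probabilities in (-inf, 0]; scores in [log(0.5), 0]
# indicate better quality, so anything below log(0.5) gets flagged.
BAD_QUALITY_THRESHOLD = math.log(0.5)

def is_bad(qe_score: float, threshold: float = BAD_QUALITY_THRESHOLD) -> bool:
    """Flag a word/sentence as "bad" when its QE score falls below the threshold."""
    return qe_score < threshold

# Example: -0.9 is below log(0.5) and would be highlighted as "bad";
# -0.2 sits in [log(0.5), 0] and would not.
print(is_bad(-0.9))  # True
print(is_bad(-0.2))  # False
```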
Higher QE scores mean better quality. Changed the threshold from `-0.5` to `ln(0.5)` (≈ -0.6931) as per discussions in QE meetings.

@mfomicheva @abarbosa94 @felipesantosk Please let me know if any of the above is wrong 👍🏾
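To illustrate the kind of change described (a hedged sketch, not the actual PR diff; the constant name is an assumption):

```python
import math

# Before: an example threshold that was hardcoded.
# BAD_QUALITY_THRESHOLD = -0.5

# After: derive the threshold from log(0.5) so the intent is explicit.
BAD_QUALITY_THRESHOLD = math.log(0.5)  # ≈ -0.6931
```

Deriving the constant from `math.log(0.5)` keeps it self-documenting, which also makes the "these thresholds are just examples" comment unnecessary.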