BLEU score from torchmetrics gives different results compared to NLTK #1074
Comments
Hi! Thanks for your contribution, great first issue!
cc: @stancld is probably the best to answer this, but from our code I can tell this is the comparison function we use for our implementation: `from nltk.translate.bleu_score import corpus_bleu`
Hello @icedpanda, thanks for raising this issue. The difference lies in the fact that

BLEU from torchmetrics: `tensor(0.4595)`
BLEU from nltk: `0.45946931172542343`

Question for @Borda & @SkafteNicki -> Don't we want to allow setting the weights manually, with the default behaviour following the current implementation?
@stancld I would be fine with adding a new argument: `weights: Optional[List[float]] = None`, where …
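To illustrate the proposal above, here is a minimal sketch of how an optional `weights` argument could fall back to the current uniform-weights behaviour. The helper name `resolve_weights` is hypothetical, not part of the torchmetrics API:

```python
from typing import List, Optional


def resolve_weights(n_gram: int, weights: Optional[List[float]] = None) -> List[float]:
    """Hypothetical helper: return user-supplied n-gram weights, or fall back
    to uniform weights (the current default behaviour) when none are given."""
    if weights is None:
        return [1.0 / n_gram] * n_gram
    if len(weights) != n_gram:
        raise ValueError(f"expected {n_gram} weights, got {len(weights)}")
    return weights
```

With `n_gram=4` and no explicit weights this yields `[0.25, 0.25, 0.25, 0.25]`, matching NLTK's default for `corpus_bleu`.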
Thanks for the prompt reply, makes sense now
@SkafteNicki I'll send a PR |
🐛 Bug
To Reproduce
The BLEU score from `torchmetrics` and `nltk` is different. I can only get the same result if `k` = 1; otherwise, it returns a different BLEU score.

Code sample
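The reporter's code sample did not survive extraction. As a stand-in, here is a simplified pure-Python sketch of sentence-level BLEU, showing how the `weights` tuple discussed in the comments enters the score as a weighted geometric mean of clipped n-gram precisions. This is an illustration only (single reference, no smoothing), not either library's implementation:

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(reference, hypothesis, weights=(0.25, 0.25, 0.25, 0.25)):
    """Simplified sentence-level BLEU: clipped n-gram precisions combined
    as a weighted geometric mean, times a brevity penalty. No smoothing."""
    precisions = []
    for n, _ in enumerate(weights, start=1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0  # any zero precision collapses the geometric mean
    log_score = sum(w * math.log(p) for w, p in zip(weights, precisions))
    brevity_penalty = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return brevity_penalty * math.exp(log_score)
```

A perfect match scores 1.0 regardless of the weights; changing the number or values of the weights changes the score for partial matches, which is why differing weight defaults between implementations produce differing BLEU values.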
Expected behavior
I would expect the same BLEU score from `torchmetrics` and `nltk`.
Environment
Additional context