Add BLEU max_ngram_order to signature #251

Open
BramVanroy opened this issue Nov 27, 2023 · 2 comments

Comments

@BramVanroy
Contributor

Currently, when you calculate BLEU with different max_ngram_order values and everything else the same, the results get the same signature from bleu_metric.get_signature().format(short=True), something like #:1|c:mixed|e:no|tok:13a|s:exp|v:2.3.1. Should a field be added to the signature to specify the max n-gram order, as with chrF, where both nc and nw are specified?

If you agree I can do a PR.

@martinpopel
Collaborator

The original chrF papers report results with different n-gram orders, other researchers have also tried (and reported) chrF with different orders, and I think no order was actually selected as the default in the papers, so it is natural that nc and nw are part of the signature. (The very first chrF paper mentions that "The best correlations are obtained for 6-gram", but the correlations are not shown.)
However, the original BLEU paper reports BLEU scores only with max n-gram order N=4, which has been considered the default/standard value for BLEU ever since. (The paper reports n-gram precisions for N=1...4 in Figure 2, but not the final BLEU score or its correlation with human judgments; those are reported only for N=4.)
So I would suggest continuing to omit max_ngram_order from the BLEU signature when the value is the default N=4. That said, I have nothing against adding max_ngram_order to the signature when the value is different. What do you think, @mjpost and @ozancaglayan?

@ozancaglayan
Collaborator

I think if the value is different, it could be added as you suggested. That way, if the value is not changed, the signatures at least remain backwards-compatible.
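
For illustration, the behaviour discussed above could look roughly like the sketch below. It is standalone Python, not sacreBLEU's actual code: the function short_signature, the constant DEFAULT_MAX_NGRAM_ORDER, and the field key n are hypothetical names chosen for this example, and placing the new field just before the version field is likewise an assumption.

```python
DEFAULT_MAX_NGRAM_ORDER = 4  # the standard BLEU order discussed above


def short_signature(fields, max_ngram_order=DEFAULT_MAX_NGRAM_ORDER):
    """Join signature fields as 'key:value' pairs separated by '|'.

    The hypothetical 'n' field is added only for a non-default order,
    so signatures for the default N=4 stay exactly as they are today.
    """
    parts = [f"{key}:{value}" for key, value in fields.items()]
    if max_ngram_order != DEFAULT_MAX_NGRAM_ORDER:
        # Assumption: the last field is the version, so insert just before it.
        parts.insert(-1, f"n:{max_ngram_order}")
    return "|".join(parts)


# Fields taken from the example signature quoted in the issue.
fields = {"#": 1, "c": "mixed", "e": "no", "tok": "13a", "s": "exp", "v": "2.3.1"}

print(short_signature(fields))                     # default order: unchanged signature
print(short_signature(fields, max_ngram_order=2))  # non-default order: extra field
```

With the default order the output is identical to the signature quoted in the issue, so existing signatures remain comparable; only runs with a non-default order gain the extra field.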
