
Use right range and threshold for showing "bad" words/sentences #370

Merged · 2 commits · Mar 3, 2022

Conversation

@abhi-agg (Contributor) commented Mar 3, 2022

Higher QE scores mean better quality.
Changed the threshold from -0.5 to ln(0.5) ≈ -0.6931, as per discussions in QE meetings.

@mfomicheva @abarbosa94 @felipesantosk Please let me know if any of the above is wrong 👍🏾

@abhi-agg abhi-agg requested a review from jelmervdl March 3, 2022 12:49
@jelmervdl (Member) commented Mar 3, 2022

If these are the correct thresholds, we can also lose the "these thresholds are just examples" comments.

@abhi-agg (Contributor, Author) commented Mar 3, 2022

I just want to confirm with @mfomicheva @abarbosa94 @felipesantosk once more that the threshold of -0.5 is a good one as a starting point for all the language pairs irrespective of whether the quality scores are returned using translation models or supervised QE models under the hood.

I can remove the comment after their confirmation. Thanks for pointing it out 👍🏾

@mfomicheva commented
> I just want to confirm with @mfomicheva @abarbosa94 @felipesantosk once more that the threshold of -0.5 is a good one as a starting point for all the language pairs irrespective of whether the quality scores are returned using translation models or supervised QE models under the hood.
>
> I can remove the comment after their confirmation. Thanks for pointing it out 👍🏾

I responded on slack

@abhi-agg (Contributor, Author) commented Mar 3, 2022

Just documenting what @mfomicheva shared:

> For the supervised models that were fitted on annotated data (En-Es, En-Cs and En-Et language pairs), you should use the threshold that corresponds to the log of 0.5, which is around -0.6931 (here log means ln).
>
> For the unsupervised case where the returned value is just the average log-prob coming directly from the MT model, I think you should still start with the same threshold and experiment further with it.
>
> The range [-0.6931, 0] means better quality.

I will modify the PR to reflect these changes, and update the PR description as well.
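The threshold rule above can be sketched as follows. This is a minimal illustration, not the PR's actual code; the constant and function names are hypothetical. It assumes QE scores are (average) log-probabilities, so they fall in (-∞, 0], and scores in [ln(0.5), 0] count as acceptable quality:

```javascript
// Quality scores are log-probabilities: higher (closer to 0) is better.
// ln(0.5) ≈ -0.6931 is the agreed starting threshold for all language
// pairs, for both supervised QE models and raw MT-model log-probs.
const BAD_QUALITY_THRESHOLD = Math.log(0.5); // ≈ -0.6931

// True when a word/sentence score should be highlighted as "bad"
// (hypothetical helper name, for illustration only).
function isBadQuality(score) {
  return score < BAD_QUALITY_THRESHOLD;
}

console.log(isBadQuality(-0.2)); // score in [-0.6931, 0]: not highlighted
console.log(isBadQuality(-1.5)); // score below threshold: highlighted
```

The key point is the direction of the comparison: earlier code treating lower scores as better would invert this check.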
