
Fix bug in BERTScore calculation: pred/target misalignment #2347

Merged: 17 commits merged into Lightning-AI:master on Jul 19, 2024

Conversation

@gxy-gxy (Contributor) commented on Feb 3, 2024

Fixes a bug in the BERTScore calculation.

This pull request addresses a bug in the TextDataset class in src/torchmetrics/functional/text/helper_embedding_metric.py. The class has a preprocess function that automatically sorts input text by length to make batch encoding more efficient. However, predictions (preds) and targets (targets) are initialized in separate datasets, so each is sorted independently and the original pairing of predictions and targets is lost. Because BERTScore is computed pairwise, this mismatched ordering is a problem: the datasets must be re-aligned to their original order before the scores are computed. The proposed fix restores the prediction and target embeddings to the original input order, preserving their pairing throughout the calculation.

Here is the fixed code:

# Undo the length-based sorting so the embeddings line up pairwise again
preds_embeddings = preds_embeddings[preds_loader.dataset.sorting_indices]
target_embeddings = target_embeddings[target_loader.dataset.sorting_indices]

This change is essential for preserving the integrity of the BertScore evaluation, ensuring that each prediction is accurately compared against its corresponding target.
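
To make the failure mode concrete, here is a minimal, self-contained sketch of the mechanism (toy code with illustrative names such as length_sort, not the actual torchmetrics internals):

# Each dataset sorts its own texts by length, so pair i of the sorted
# lists is no longer pair i of the original inputs.
preds = ["a fairly long prediction sentence", "short pred"]
targets = ["short target", "a fairly long target sentence"]

def length_sort(texts):
    """Sort texts by length; also return the inverse permutation."""
    order = sorted(range(len(texts)), key=lambda i: len(texts[i]))
    inverse = [0] * len(order)
    for new_pos, old_pos in enumerate(order):
        inverse[old_pos] = new_pos
    return [texts[i] for i in order], inverse

sorted_preds, preds_inv = length_sort(preds)
sorted_targets, targets_inv = length_sort(targets)

# The pairing is broken: ("short pred", "short target") was never an input pair.
assert (sorted_preds[0], sorted_targets[0]) != (preds[0], targets[0])

# The fix: index each sorted sequence with its stored indices so the
# original order (and hence the pairing) is restored before scoring.
restored_preds = [sorted_preds[i] for i in preds_inv]
restored_targets = [sorted_targets[i] for i in targets_inv]
assert restored_preds == preds
assert restored_targets == targets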


📚 Documentation preview 📚: https://torchmetrics--2347.org.readthedocs.build/en/2347/

@gxy-gxy (Contributor, Author) commented on Feb 3, 2024

Here is the test code:

from torchmetrics.text.bert import BERTScore

score_model = BERTScore(model_name_or_path='roberta-large', batch_size=2)

# Inputs are paired element-wise as (text1[i], text2[i]);
# the first pair consists of two identical strings.
text1 = ["Claim A from machine", "Claim A from machine"]
text2 = ["Claim A from machine", "Claim B"]
similarities = score_model(text1, text2)
print(similarities)
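
With correctly aligned pairs, the first pair (two identical strings) should receive a near-perfect score and the second pair a clearly lower one; before the fix, the length-based sorting could attach these scores to the wrong pairs.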

@Borda added the bug / fix label on Feb 14, 2024
@Borda added this to the v1.3.x milestone on Feb 19, 2024
@Borda (Member) commented on Mar 15, 2024

@stancld could you help here, pls?

@baskrahmer (Collaborator) left a comment

Good catch! I can reproduce the bug using your test code snippet. The actual metric values appear to be correct, but their ordering is not always valid.

It might be nice to integrate this case into the current test suite, e.g. an assertion that reversing the targets/preds also reverses the scores; see the sketch below.
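
A possible shape for that test (a hedged sketch assuming the BERTScore usage from the snippet above, and that the returned dict maps metric names to tensors):

import torch
from torchmetrics.text.bert import BERTScore

def test_bertscore_pair_order():
    score_model = BERTScore(model_name_or_path='roberta-large', batch_size=2)
    preds = ["Claim A from machine", "Claim A from machine"]
    target = ["Claim A from machine", "Claim B"]

    forward = score_model(preds, target)
    backward = score_model(preds[::-1], target[::-1])

    # Reversing the input pairs should simply reverse the per-pair scores.
    for key in forward:
        assert torch.allclose(forward[key], backward[key].flip(0), atol=1e-4)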

(Review comment on src/torchmetrics/functional/text/bert.py, marked resolved.)
@Borda (Member) commented on Jul 16, 2024

Here is the test code:

from torchmetrics.text.bert import BERTScore

score_model = BERTScore(model_name_or_path='roberta-large', batch_size=2)

text1 = ["Claim A from machine", "Claim A from machine"]
text2 = ["Claim A from machine", "Claim B"]
similarities = score_model(text1, text2)
print(similarities)

@gxy-gxy can you pls add it as a test?

codecov bot commented on Jul 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 39%. Comparing base (30b3fd5) to head (dc3b758).

❗ There is a different number of reports uploaded between BASE (30b3fd5) and HEAD (dc3b758): HEAD has 7 fewer uploads than BASE.

Flag            BASE (30b3fd5)   HEAD (dc3b758)
Windows         4                2
cpu             22               21
torch2.4.0+cpu  2                1
python3.11      7                5
torch2.3.0+cpu  3                2
Additional details and impacted files
@@           Coverage Diff            @@
##           master   #2347     +/-   ##
========================================
- Coverage      69%     39%    -30%     
========================================
  Files         316     316             
  Lines       17878   17874      -4     
========================================
- Hits        12329    7030   -5299     
- Misses       5549   10844   +5295     

@Borda requested a review from baskrahmer on Jul 16, 2024

@Borda (Member) left a comment

adding test - #2347 (comment)

@baskrahmer (Collaborator) left a comment

LGTM! Thanks for the addition. I can quickly add the test.

@Borda enabled auto-merge (squash) on Jul 19, 2024

@Borda (Member) commented on Jul 19, 2024

@baskrahmer, would you mind also adding an entry to the changelog?

mergify bot added the ready label on Jul 19, 2024
@Borda merged commit 75c33ea into Lightning-AI:master on Jul 19, 2024
65 checks passed
Borda pushed a commit that referenced this pull request on Aug 2, 2024
* fix pred target misalignment
* add test

---------

Co-authored-by: Xinyan Guan <xinyan@xinyan.local>
Co-authored-by: Bas Krahmer <baskrahmer@gmail.com>
(cherry picked from commit 75c33ea)
Labels: bug / fix (Something isn't working), ready, topic: Text