Update wer to standard definition #63

ChristianGeng · 2024-12-04T12:29:10Z

closes #62

Probably it is debatable whether the test organization is ok: I have split into two separate tests depending on the value of symmetric. Probably one could have have a single test that incorporates the value of symmetric into the parametrization. However I find that the long parameterizations are also hard to read. So I think this is justified.

Summary by Sourcery

Update the word_error_rate function to support symmetric normalization and add corresponding test cases. Introduce a temporary wrapper for WER calculation using the jiwer library.

New Features:

Introduce a new parameter 'symmetric' to the word_error_rate function, allowing for symmetric normalization by the maximum length of truth and prediction.

Enhancements:

Add a temporary wrapper function _wer_jiwer to implement Word Error Rate (WER) using the jiwer library.

Tests:

Add new test cases for the word_error_rate function to validate both symmetric and asymmetric normalization scenarios.

sourcery-ai · 2024-12-04T12:29:15Z

Reviewer's Guide by Sourcery

This PR updates the Word Error Rate (WER) implementation to align with the standard definition by introducing an asymmetric normalization option. The changes modify the core WER calculation logic and add comprehensive test cases for both symmetric and asymmetric calculations.

No diagrams generated as the changes look simple and do not need a visual representation.

File-Level Changes

Change	Details	Files
Updated Word Error Rate (WER) calculation to support both symmetric and asymmetric normalization	Added symmetric parameter with default value False for backward compatibility Modified normalization factor to use reference length for asymmetric mode Updated documentation with detailed mathematical explanations Added example showing asymmetric normalization behavior	`audmetric/core/api.py`
Enhanced test coverage for Word Error Rate calculations	Split existing test into symmetric-specific test case Added new test cases for asymmetric WER calculation Added temporary jiwer WER implementation wrapper for reference Included edge cases with empty sequences and None values	`tests/test_api.py`

Possibly linked issues

Add initial version #1: PR updates word error rate to match standard definition as requested in the issue.

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time. You can also use
this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

codecov · 2024-12-04T12:30:02Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.0%. Comparing base (49f0e4c) to head (5016444).
Report is 3 commits behind head on main.

Additional details and impacted files

Files with missing lines	Coverage Δ
audmetric/core/api.py	`100.0% <100.0%> (ø)`

sourcery-ai

Hey @ChristianGeng - I've reviewed your changes - here's some feedback:

Overall Comments:

The _wer_jiwer() function appears to be unused. Consider either removing it or adding tests that utilize it.

Here's what I looked at during the review

🟢 General issues: all looks good
🟢 Security: all looks good
🟢 Testing: all looks good
🟢 Complexity: all looks good
🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

ChristianGeng · 2024-12-04T12:34:21Z

Hey @ChristianGeng - I've reviewed your changes - here's some feedback:

Overall Comments:

The _wer_jiwer() function appears to be unused. Consider either removing it or adding tests that utilize it.

Here's what I looked at during the review
Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

I has been removed.

hagenw

I reviewed the documentation part of the pull request. The actual implementation is better review by @ureichel. I will assign him as well.

The current update sets symmetric=False as default, which will change the current behavior of audmetric.word_error_rate(). But as this is in line with the default definition in Wikipedia, it seems to make sense to me.

audmetric/core/api.py

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

reichelu · 2024-12-04T14:08:39Z

hi @hagenw @ChristianGeng

thanks a lot for these updates! I'd have 2 comments:

meaning of "mean edit distance" in docstring

In the docstring it says:

    The mean edit distance of each (truth, prediction)-pair is computed
    as an average of the edit distance over a normalization factor n.

To my understanding I would not speak of "mean edit distance" for single sequence pairs, but rather of "normalized edit distance". This normalized edit distance can be seen as the mean editing cost per sequence item. Thus the mean is calculated for edit costs rather than for the edit distance, which is a single value only.

"symmetric" argument

Further it says:

    The value of n depends on the symmetric parameter.
    If ``symmetric`` is ``False`` (default):
    n is set to the reference (truth) length, following the Wikipedia formulation
    where n is the number of words in the reference (N=S+D+C). This means WER can
    be greater than 1 if the prediction sequence is longer than the reference:

Here I have to apologize: if I remember correctly the choice of this name is based on my suggestion, which on second thought was not well chosen. Actually, all non zero-substitution edit costs been equal, the WER is always symmetric, regardless of the normalization. Thus "symmetric" as an argument name is a bit misleading.

Don't yet have a good alternative naming, maybe something like

maxlennorm=False
norm="reference" | "longest"

or similar.

hagenw · 2024-12-04T14:32:08Z

norm="reference" | "longest"

looks good to me.

reichelu · 2024-12-04T14:35:36Z

norm="reference" | "longest"

looks good to me.

cool. Evtl. even better norm = "truth" | "longest" (default "truth") since you are using "truth" as an argument name.

hagenw · 2024-12-04T14:38:37Z

Yes, truth seems also better to me, as we use the name already.

ChristianGeng · 2024-12-05T12:38:46Z

Good question, I thought word_error_rate() would use event_error_rate() under the hood, but this is not the case.

Looking at the previous implementation, the only difference seems to be that word_error_rate() checks which words are unique before, whereas event_error_rate() does not. But I couldn't find an example yet, where they differ.

Should this become a follow up issue that deals with this clarification.

There is one final question: the changes are api-breaking (the default truth now follows the wikipedia = jiwer conventions). How is that taken care of?

hagenw · 2024-12-05T12:52:49Z

There is one final question: the changes are api-breaking (the default truth now follows the wikipedia = jiwer conventions). How is that taken care of?

The normal way would be to start with norm="longest" and add a warning that the default will change in a future version of audmetric. But as we expected word_error_rate() would follow the standard definition, I would say the current behavior is a bug, and I would vote to directly swtich to norm="truth" as you did here. In the changelog I would then write that this is fixed, but users could get the old behavior with using norm="longest".

hagenw · 2024-12-05T12:53:51Z

Good question, I thought word_error_rate() would use event_error_rate() under the hood, but this is not the case.

Looking at the previous implementation, the only difference seems to be that word_error_rate() checks which words are unique before, whereas event_error_rate() does not. But I couldn't find an example yet, where they differ.

Should this become a follow up issue that deals with this clarification.

Yes, please open an issue on this.

hagenw

I added two comments suggesting to remove (default) from the docstring as it is better to have a single source of truth (the function signature), t indicate default values.

In addition, I made a few suggestions to use semantic line breaks.

Besides those, everything looks good to me now.

audmetric/core/api.py

tests/test_api.py

audmetric/core/api.py

reichelu · 2024-12-09T10:08:49Z

I reviewed the documentation part of the pull request. The actual implementation is better review by @ureichel. I will assign him as well.

the implementation looks good! Tested it against jiwer.wer and got the same results for norm=truth. All fine!

reichelu

the implementation looks good! Tested it against jiwer.wer and got the same results for norm=truth. All fine here!
Only thing I'd still change is to skip or explain (N=S+D+C) from the word_error_rate docstring. But I don't insist.

audmetric/core/api.py

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

reichelu · 2024-12-10T15:31:45Z

btw, this fellow ureichel of whom we are awaiting requested review, could be removed from the list of reviewers. I cannot do it it seems.

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

ChristianGeng · 2024-12-11T09:10:20Z

btw, this fellow ureichel of whom we are awaiting requested review, could be removed from the list of reviewers. I cannot do it it seems.

I have removed him. This guy is entirely unresponsive anyway.

audmetric/core/api.py

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

audmetric/core/api.py

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

hagenw · 2024-12-17T08:12:39Z

@ChristianGeng I know you wanted to add error handling, but I would propose to first merge this pull request and address error handling in another pull request.

ChristianGeng · 2024-12-19T12:05:32Z

@ChristianGeng I know you wanted to add error handling, but I would propose to first merge this pull request and address error handling in another pull request.

D'accord. I didn't receive a notification that this is approved. So I will merge first.

ChristianGeng added 13 commits December 2, 2024 11:08

add temporary jwer wer implementation.

abbf9ce

Add initial implementation of symmetric argument.

95b0e77

Add working tests for word error rate.

140269f

Update Expected results for failing tests.

043a66e

Polish implementation.

168511f

Fix polarity in tests.

144fdb1

change default.

4765c75

symmetric back to default to False.

e05d80a

Format assertion.

d21ff06

Remove temporary comments.

62db819

docstring initial version.

4bcc1f0

Add a doctest.

40625f7

Add reference to wikipedia.

e9f8baf

sourcery-ai bot reviewed Dec 4, 2024

View reviewed changes

ChristianGeng added 2 commits December 4, 2024 13:31

remove jiwer function.

ff230ab

fix ruff error.

c72188f

ChristianGeng requested a review from hagenw December 4, 2024 12:38

ChristianGeng self-assigned this Dec 4, 2024

hagenw reviewed Dec 4, 2024

View reviewed changes

audmetric/core/api.py Outdated Show resolved Hide resolved

audmetric/core/api.py Outdated Show resolved Hide resolved

audmetric/core/api.py Outdated Show resolved Hide resolved

hagenw requested a review from ureichel December 4, 2024 13:20

ChristianGeng and others added 3 commits December 4, 2024 14:41

Update audmetric/core/api.py

89ea83f

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

Update audmetric/core/api.py

72add1d

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

Update audmetric/core/api.py

5ae5ad9

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

Try to fix long line issues.

bbfbfcc

ChristianGeng mentioned this pull request Dec 5, 2024

Compare event_error_rate and word_error_rate #64

Open

ChristianGeng requested a review from hagenw December 9, 2024 08:19

hagenw reviewed Dec 9, 2024

View reviewed changes

reichelu approved these changes Dec 9, 2024

View reviewed changes

audmetric/core/api.py Outdated Show resolved Hide resolved

ChristianGeng and others added 4 commits December 9, 2024 13:57

Update audmetric/core/api.py

87b7e5e

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

Update audmetric/core/api.py

fc1f27e

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

Update audmetric/core/api.py

a7e8e70

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

Update audmetric/core/api.py

8657bf1

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

ChristianGeng and others added 3 commits December 11, 2024 09:51

Update tests/test_api.py

5c53cb8

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

Update audmetric/core/api.py

b7e6ac8

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

Omitting some mathematical notation. Docstring conform to SLB.

a718185

ChristianGeng removed the request for review from ureichel December 11, 2024 09:09

hagenw reviewed Dec 11, 2024

View reviewed changes

audmetric/core/api.py Outdated Show resolved Hide resolved

hagenw reviewed Dec 11, 2024

View reviewed changes

audmetric/core/api.py Outdated Show resolved Hide resolved

Update audmetric/core/api.py

90564b7

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

hagenw reviewed Dec 11, 2024

View reviewed changes

audmetric/core/api.py Outdated Show resolved Hide resolved

ChristianGeng and others added 2 commits December 11, 2024 10:34

Update audmetric/core/api.py

f3ea704

Co-authored-by: Hagen Wierstorf <hwierstorf@audeering.com>

pep-format code.

5016444

hagenw approved these changes Dec 11, 2024

View reviewed changes

ChristianGeng merged commit 2b09acb into main Dec 19, 2024
10 checks passed

ChristianGeng deleted the update-wer-to-standard-definition branch December 19, 2024 12:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update wer to standard definition #63

Update wer to standard definition #63

ChristianGeng commented Dec 4, 2024 •

edited

Loading

sourcery-ai bot commented Dec 4, 2024 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

codecov bot commented Dec 4, 2024 •

edited

Loading

sourcery-ai bot left a comment

ChristianGeng commented Dec 4, 2024

hagenw left a comment

reichelu commented Dec 4, 2024 •

edited

Loading

hagenw commented Dec 4, 2024

reichelu commented Dec 4, 2024

hagenw commented Dec 4, 2024

ChristianGeng commented Dec 5, 2024

hagenw commented Dec 5, 2024

hagenw commented Dec 5, 2024

hagenw left a comment

reichelu commented Dec 9, 2024

reichelu left a comment

reichelu commented Dec 10, 2024

ChristianGeng commented Dec 11, 2024

hagenw commented Dec 17, 2024

ChristianGeng commented Dec 19, 2024

Update wer to standard definition #63

Update wer to standard definition #63

Conversation

ChristianGeng commented Dec 4, 2024 • edited Loading

Summary by Sourcery

sourcery-ai bot commented Dec 4, 2024 • edited Loading

Reviewer's Guide by Sourcery

File-Level Changes

Possibly linked issues

Interacting with Sourcery

Customizing Your Experience

Getting Help

codecov bot commented Dec 4, 2024 • edited Loading

Codecov Report

sourcery-ai bot left a comment

Choose a reason for hiding this comment

ChristianGeng commented Dec 4, 2024

hagenw left a comment

Choose a reason for hiding this comment

reichelu commented Dec 4, 2024 • edited Loading

meaning of "mean edit distance" in docstring

"symmetric" argument

hagenw commented Dec 4, 2024

reichelu commented Dec 4, 2024

hagenw commented Dec 4, 2024

ChristianGeng commented Dec 5, 2024

hagenw commented Dec 5, 2024

hagenw commented Dec 5, 2024

hagenw left a comment

Choose a reason for hiding this comment

reichelu commented Dec 9, 2024

reichelu left a comment

Choose a reason for hiding this comment

reichelu commented Dec 10, 2024

ChristianGeng commented Dec 11, 2024

hagenw commented Dec 17, 2024

ChristianGeng commented Dec 19, 2024

ChristianGeng commented Dec 4, 2024 •

edited

Loading

sourcery-ai bot commented Dec 4, 2024 •

edited

Loading

codecov bot commented Dec 4, 2024 •

edited

Loading

reichelu commented Dec 4, 2024 •

edited

Loading