Subtitle score is not correlating with matching results #821

ronaldheft · 2020-02-14T23:40:19Z

Describe the bug

I’ve been having an ongoing issue where the subtitle selected by Bazarr is not the ideal subtitle. Often it grabs a subtitle that matches a different release, even when the exact matching release is available.

This often occurs when multiple subtitles show a 100% score. It appears that Bazarr isn’t attempting to match all of the metadata fields. For example, sometimes the scenename is available to match on the release_group, but it doesn’t appear to be used.

I’ve documented this all below with screenshots.

To Reproduce

Manually search for a subtitle, where the scenename is available.
Notice multiple results with a 100% score.
Notice the release_group is not being used to match, as the release_group is showing a false match.
When pulling the subtitles automatically, the incorrect result is chosen, as multiple results have a 100% score rating.

Expected behavior

All available metadata is used to calculate the score.

Screenshots

Software (please complete the following information):
Bazarr Version: 0.8.4.1
Sonarr Version: 2.0.0.5338
Radarr Version: 0.2.0.1450
Operating System: Linux-4.4.59+-x86_64-with (Docker)

morpheus65535 · 2020-02-14T23:43:22Z

If the hash matches, we don't look at other criteria (except hearing impaired). That's the expected behavior and it's the same with Sub-Zero (we share base code).

ronaldheft · 2020-02-14T23:44:45Z

So is that bad data on the subtitle provider? Some of these subtitles definitely do not match the release and are out of sync. Selecting the version with the correct release group returns subtitles in sync.

morpheus65535 · 2020-02-14T23:47:56Z

Unfortunately, some subtitle uploaders add a hash even if it doesn't match. We have no control over this.

ronaldheft · 2020-02-14T23:53:10Z

That’s understandable. Could logic be added so that, if multiple results return a matching hash, additional metadata fields are used instead of selecting the first subtitle result?

ronaldheft · 2020-02-14T23:54:26Z

Essentially, calculate the score again on the subset of results matching the hash, but ignore the hash and calculate from metadata only?

morpheus65535 · 2020-02-15T01:29:27Z

@pannal something that could be done?

GermanG · 2020-02-15T08:18:17Z

@morpheus65535 Didn't look at the actual code, but looks like the other matches impact sorting, I'll play with bsplayer (which also has hash matching) and I'll let you know.
EDIT: I can reproduce it with bsplayer.

GermanG · 2020-02-15T09:16:28Z

I've ~~stealed~~ taken inspiration from subdivx for matching, and modified subliminal_patch.score with:

--- a/libs/subliminal_patch/score.py
+++ b/libs/subliminal_patch/score.py
@@ -81,7 +81,7 @@ def compute_score(matches, subtitle, video, hearing_impaired=None):
                     matches -= {"hash"}
     elif 'hash' in matches:
         logger.debug('%r: Hash not verifiable for this provider. Keeping it', subtitle)
-        matches &= {'hash'}
+        matches |= {'hash'}
 
     # handle equivalent matches
     if is_episode:
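
For readers unfamiliar with the set operators in the diff, here is a minimal sketch of why swapping `&=` for `|=` changes both the ordering and the totals. The weights are illustrative assumptions (chosen so the hash weight equals the sum of the other fields), not Bazarr's actual score table, and `total` is a hypothetical helper.

```python
# Illustrative weights only; NOT Bazarr's real score table.
# Assumption: the hash weight equals the sum of all other weights.
WEIGHTS = {
    "hash": 100,
    "series": 50,
    "season": 15,
    "episode": 15,
    "release_group": 12,
    "format": 8,
}


def total(matches):
    """Sum the weight of every matched field (hypothetical helper)."""
    return sum(WEIGHTS[name] for name in matches)


found = {"hash", "series", "season", "episode", "release_group"}

# Original line: intersection keeps only the hash, so every
# hash-matched subtitle collapses to the same score.
intersect = set(found)
intersect &= {"hash"}

# Patched line: union is a no-op when the hash is already present,
# so the other matches survive and are added on top of the hash weight.
union = set(found)
union |= {"hash"}

score_and = total(intersect)  # hash weight only
score_or = total(union)       # hash weight plus every other matched field
```

With these assumed weights, `score_and` is 100 while `score_or` is 192, which would explain totals above the nominal maximum.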

Now I have the right preference, but with crazy scores:

GermanG · 2020-02-15T13:39:59Z

Yup, confusing.
Well, there are many alternatives:

Keep the current approach and treat this as a known bug.
Use my brutal approach and deal with >100% scoring when hashes are matched (as a known new bug)
Same as previous but 'Disguise' the >100% as 100% in the UI
Return hash matching as an attribute and not part of the scoring, then ordering by (hash, score) descending.

EDIT: @morpheus65535 I'll leave the decision up to you, let me know if it's not the first one, so I can give it a try coding it.
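
The last alternative (hash as an attribute, then ordering by (hash, score) descending) could look roughly like the sketch below. `Candidate`, its fields, and `rank` are hypothetical names used only for illustration; nothing here is taken from the Bazarr code base.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    hash_matched: bool  # kept as a separate attribute, not part of the score
    score: int          # metadata-only score, never above 100%


def rank(candidates):
    """Order by (hash_matched, score), both descending."""
    return sorted(
        candidates,
        key=lambda c: (c.hash_matched, c.score),
        reverse=True,
    )


candidates = [
    Candidate("wrong-group", True, 60),
    Candidate("right-group", True, 92),
    Candidate("no-hash", False, 95),
]
ranked = [c.name for c in rank(candidates)]
```

Here a hash match always outranks a non-hash result, and the metadata score only decides between results that share the same hash status.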

morpheus65535 · 2020-02-15T14:26:30Z

What about making hash optional? Something like use scenename?

pannal · 2020-02-15T14:40:09Z

Wait, there is already code in place to counter this, because OpenSubtitles had the same issue YEARS ago: https://github.com/pannal/Sub-Zero.bundle/blob/master/Contents/Libraries/Shared/subliminal_patch/score.py#L60

If the provider has the necessary metadata to support hash checking ("series", "season", "episode", "format" for TV, "video_codec", "format" for movies), just enable the hash_verifiable flag for that provider and the subtitle class, and this gets fixed automatically.
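
A simplified sketch of the check described above, assuming it works as a subset test on the matched fields. The two field sets are quoted from this comment; `verify_hash` and its arguments are hypothetical, and the real function in subliminal_patch may differ in detail.

```python
# Field sets quoted from the comment above.
EPISODE_HASH_VALID_IF = {"series", "season", "episode", "format"}
MOVIE_HASH_VALID_IF = {"video_codec", "format"}


def verify_hash(matches, is_episode, hash_verifiable):
    """Return the match set after the hash sanity check (hypothetical helper)."""
    matches = set(matches)
    if "hash" not in matches:
        return matches
    if not hash_verifiable:
        # Provider cannot be checked: trust the hash and drop everything else.
        return matches & {"hash"}
    required = EPISODE_HASH_VALID_IF if is_episode else MOVIE_HASH_VALID_IF
    if required <= matches:
        # The supporting metadata agrees, so the hash is considered valid.
        return matches & {"hash"}
    # The supporting metadata disagrees, so the hash is discarded.
    return matches - {"hash"}


valid = verify_hash(
    {"hash", "series", "season", "episode", "format", "release_group"},
    is_episode=True,
    hash_verifiable=True,
)
invalid = verify_hash({"hash", "series", "format"}, is_episode=True, hash_verifiable=True)
unchecked = verify_hash({"hash", "series"}, is_episode=True, hash_verifiable=False)
```

Note that in this sketch a valid hash still collapses the match set to `{"hash"}`, so the check filters out bad hashes but does not by itself break ties between several valid ones.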

GermanG · 2020-02-15T16:13:11Z

@pannal that might be the case for bsplayer, but the OP is about OpenSubtitles, and it looks like {"series", "season", "episode", "format"} matches but won't pick the desired subtitle.

pannal · 2020-02-15T16:22:53Z

That's something to look into, then.
The scoring might not be ideal for such cases. Maybe we should ultimately revise it, but that's not an easy feat.

Edit: Well, when two subtitles have the same score, Bazarr could prioritize the one that matches the most metadata, which would be quite simple.

GermanG · 2020-02-15T16:25:12Z

@pannal but it's dropped when

matches &= {'hash'}

EDIT: ignore this comment, I think I got what you mean.

pannal · 2020-02-16T04:55:58Z

I've added a secondary scoring method to latest bazarr development, that changes the sorting of subtitles based on (score_with_hash, score_without_hash). This might fix the issue.
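
Roughly, the idea is a two-part sort key. The sketch below is not the committed code: the weights are illustrative assumptions and `score` and `sort_key` are hypothetical helpers.

```python
# Illustrative weights only; NOT Bazarr's real score table.
WEIGHTS = {
    "hash": 100,
    "series": 50,
    "season": 15,
    "episode": 15,
    "release_group": 12,
    "format": 8,
}


def score(matches):
    """Sum the weights of the matched fields (hypothetical helper)."""
    return sum(WEIGHTS[name] for name in matches)


def sort_key(matches):
    """Return (score_with_hash, score_without_hash) for one subtitle."""
    # Primary score: a hash match counts as the hash weight alone,
    # which keeps the displayed score at 100%.
    with_hash = score({"hash"}) if "hash" in matches else score(matches)
    # Secondary score: metadata only, used to break ties.
    without_hash = score(matches - {"hash"})
    return (with_hash, without_hash)


subtitles = {
    "a": {"hash", "series", "season", "episode", "format"},
    "b": {"hash", "series", "season", "episode", "format", "release_group"},
    "c": {"series", "season", "episode"},
}
order = sorted(subtitles, key=lambda name: sort_key(subtitles[name]), reverse=True)
```

Both "a" and "b" keep a primary score of 100, but "b" sorts first because its release_group match raises the secondary score.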

ronaldheft · 2020-02-16T19:52:43Z

Just pulled down the latest development release, and my results are way better! I'm now seeing the correct subtitle selected if there is an exact match.

I like the approach of doing a secondary sort and keeping the UI at 100% score. If you're considering a hash match a 100% match, then yeah, it makes sense to keep the score at 100% and then from there just pick the best of the bunch.

Thanks for the quick resolution!

rigas40 · 2020-02-23T00:57:26Z

Also, we could add subsync: running it when a subtitle has a low score would help, or it could be used to check whether the subs are good.

morpheus65535 added the help wanted label Feb 15, 2020

pannal added a commit to pannal/Sub-Zero.bundle that referenced this issue Feb 16, 2020

core: scoring: reorder subtitles based on second non-hash-score if main hash score is the same; morpheus65535/bazarr#821

1f0a713

morpheus65535 closed this as completed Feb 16, 2020

GermanG mentioned this issue Feb 22, 2020

BSPLAYER always results in 100% match #829

Closed
