-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Add "iw" variant of Hebrew language code #7752
Conversation
Also reformats the map for cleaner future diffs.
Incremental code coverage: 100.00% |
['heb', 'he'], | ||
['heb', 'iw'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a map after all, so only heb -> iw
mapping will be used. Is that what we want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope! Not at all. Good catch.
I'll restructure this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, wait a sec, I see the root of my confusion now.
In Shaka Player, we only ever map from 3-letter codes to 2-letter codes, to canonicalize inputs for comparison. So this extra mapping is not needed.
In Shaka Packager, we initially copied the map from Player. But Packager uses it in both directions, because to create an mp4, you need the 3-letter code in all cases. So one input with "he" and another with "iw" both need to map to "heb" when Packager creates the MP4. But then when Packager creates the MPD, it needs to follow BCP-47 and map down to the shortest possible tag.
There's an edge case here for the player, where one stream says "he" and another says "iw", and the player would currently treat them as different. This could also happen for other languages with multiple 2-letter codes. However, only one 2-letter code is canonical. In this case, "he" is the official ISO-639-2 code for Hebrew, whereas "iw" was used before standardization and still appears in some places. So I would argue that we could ignore this edge case, and leave it to packaging software to create DASH manifests with canonical language codes.
With that, I'm closing this PR.
We already mapped Hebrew's "heb" to "he", but not to the alternative 2-letter code "iw".
Also reformats the map for cleaner future diffs.