The letter ß (Eszett) should be normalized to ss #137

Closed
bmarotta opened this issue Sep 15, 2023 · 2 comments · Fixed by #138 or #140

Comments

@bmarotta

Description

The letter ß should be normalized to "ss", not "s". See https://en.wikipedia.org/wiki/%C3%9F

Expected outcome

normalizeSync("Muße") => "musse"

Actual outcome

normalizeSync("Muße") => "muse"

Versions of affected system

  • 4.0.0
motss added a commit that referenced this issue Sep 25, 2023
Signed-off-by: Rong Sen Ng (motss) <wes.ngrongsen@gmail.com>
@motss motss mentioned this issue Sep 25, 2023
motss added a commit that referenced this issue Sep 25, 2023
* fix: fix #137

---------

Signed-off-by: Rong Sen Ng (motss) <wes.ngrongsen@gmail.com>
@bmarotta
Author

Actually, your implementation is the opposite of what was expected. ß should be converted to "ss", not "s". You only needed to remove \u00DF from the "s" diacritics and leave it under "ss".

See these articles and discussions on the topic:
https://www.lumenvox.com/knowledgebase/index.php?/article/AA-01895/0/TTS1-German-Text-Normalization.html
https://en.wikipedia.org/wiki/%C3%9F
https://groups.google.com/g/public-dns-discuss/c/G2Mk9Jn08FE?pli=1
https://www.denic.de/en/faqs/faqs-about-idns-ss#code-365
https://www.loc.gov/aba/pcc/naco/normrule-2.html
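
The suggested change can be illustrated with a minimal sketch. The table shape, the `DiacriticEntry` type, and the `normalizeWith` helper below are hypothetical and only approximate how a replacement table might work; they are not taken from the library's source. The sketch assumes entries are applied in order, so a ß listed under "s" is consumed before the "ss" entry ever sees it.

```typescript
// Hypothetical, simplified mapping table; the library's real data may be shaped differently.
interface DiacriticEntry {
  letter: string;     // replacement text
  diacritics: RegExp; // characters that are replaced by `letter`
}

// Assumed state before the fix: ß (\u00DF) is also listed under "s".
const buggyTable: DiacriticEntry[] = [
  { letter: "s", diacritics: /[\u00DF\u015B\u0161]/g }, // ß, ś, š
  { letter: "ss", diacritics: /[\u00DF]/g },
];

// Suggested state: \u00DF removed from "s" and kept only under "ss".
const fixedTable: DiacriticEntry[] = [
  { letter: "s", diacritics: /[\u015B\u0161]/g }, // ś, š
  { letter: "ss", diacritics: /[\u00DF]/g },      // ß
];

// Apply every entry in table order; letter case is left untouched in this sketch.
function normalizeWith(table: DiacriticEntry[], input: string): string {
  return table.reduce(
    (text, { letter, diacritics }) => text.replace(diacritics, letter),
    input,
  );
}

console.log(normalizeWith(buggyTable, "Muße")); // "Muse"
console.log(normalizeWith(fixedTable, "Muße")); // "Musse"
```

With the first table, the "s" entry rewrites ß before the "ss" entry runs, which matches the single "s" in the reported output. Removing \u00DF from the "s" entry lets the "ss" entry take effect.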

@motss
Owner

motss commented Sep 25, 2023

@bmarotta Thanks for pointing out the mistake and for all the references 😄. Let me fix that real quick.

@motss motss reopened this Sep 25, 2023
motss added a commit that referenced this issue Sep 25, 2023
Signed-off-by: Rong Sen Ng (motss) <wes.ngrongsen@gmail.com>
@motss motss mentioned this issue Sep 25, 2023
motss added a commit that referenced this issue Sep 25, 2023
Signed-off-by: Rong Sen Ng (motss) <wes.ngrongsen@gmail.com>
2 participants