Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Properly replace umlauts and Eszett for German lang #2616

Merged
merged 5 commits into from
Feb 26, 2024

Conversation

floze
Copy link
Contributor

@floze floze commented Jan 7, 2024

I propose this change to properly replace umlauts and Eszett for German language strings. It replaces ß with ss and any combining diaeresis ◌̈ with e, i.e. ä becomes ae, ö oe and ü ue, which appears to be the SEO-friendliest replacement. This might be incorrect for other languages, as e.g. Finnish seems to replace ä with a and ö with o, so it might be sensible to add i18n to this function.

I propose this change to properly replace umlauts and Eszett for German language strings. It replaces "ß" with "ss" and any combining diaeresis "◌̈" with "e", i.e. "ä" becomes "ae", "ö" "oe" and "ü" "ue", which appears to be the SEO-friendliest replacement. This might be incorrect for other languages, as e.g. Finnish seems to replace "ä" with "a" and "ö" with "o", so it might be sensible to add i18n to this function.
Copy link

netlify bot commented Jan 7, 2024

Deploy Preview for effervescent-donut-4977b2 canceled.

Name Link
🔨 Latest commit 50cc6f0
🔍 Latest deploy log https://app.netlify.com/sites/effervescent-donut-4977b2/deploys/65d62347a3af7d0008260fda

@michaelbromley
Copy link
Member

I agree with the intent of this but I wonder if there could be an implementation that is more general. The issue you raise with the differing conventions between German and Finnish are a perfect example - and I'd bet there are similar subtle issues with other diacritics shared between different languages.

Ideally something like the Intl API would be best where we can rely on a correct reference of i18n data to always get the correct result for the given language. But on a brief search I could not find such a built-in web API. Otherwise we'd have to maintain this mapping data ourselves.

Maybe that's OK, since your solution probably covers quite a large proportion of use-cases based on what I know of the Vendure user base.

@floze
Copy link
Contributor Author

floze commented Jan 9, 2024

Yes you are right of course, a more general approach would be very much desirable. simov/slugify looks like a nice example, it seems to be pretty slim, extendable, and the only language-code aware js slugifier I have found so far. It might be worth investigating: https://github.com/simov/slugify

@michaelbromley
Copy link
Member

Thanks for doing some further research on this. That lib does indeed seem like it would account for locale-specific replacements. There's one major issue with this I just realized: the normalizeString function is used in both the browser and in node.js. In the browser we could use the navigator.language API to determine the current browser locale, but this API does not exist in Node.

Furthermore, even the browser locale might not be what we want because the admin can set any locale they like in the Admin UI language settings.

For these reasons I think we should just go with your solution and handle the majority case.

Could you please add tests to the normalize-string.spec.ts file to test this new behaviour?

@michaelbromley
Copy link
Member

Thanks for the changes & tests. Looks like the final thing is that this e2e test file needs to be updated because right now it is expecting "Zubehör" to be normalized to "zubehor" so there are a few tests failing. If you replace the assertions that expect "zubehor" with "zubehoer" then we should be good 👍

@floze
Copy link
Contributor Author

floze commented Feb 21, 2024

Looks like the tests are OK now, not sure about that mariadb fail though, seems to be a test setup / concurrency problem...

@Draykee
Copy link
Contributor

Draykee commented Feb 23, 2024

@floze did you check https://www.npmjs.com/package/limax ?

@michaelbromley michaelbromley merged commit 84ba64f into vendure-ecommerce:master Feb 26, 2024
17 of 18 checks passed
@floze floze deleted the patch-3 branch March 2, 2024 15:50
@floze
Copy link
Contributor Author

floze commented Mar 20, 2024

@floze did you check https://www.npmjs.com/package/limax ?

This looks very nice and comprehensive indeed, haven't come across it yet. As with the problem described in this thread and the implications pointed out by @michaelbromley, my pragmatism sensors kind of tingle though...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants