🐛 [Bug] Wrong German umlaut handling for usernames at account creation #3504

Closed
Tirokk opened this issue Oct 5, 2020 · 10 comments · Fixed by #4104
Labels: bug, Github Import, good first issue, service: backend

Comments

@Tirokk
Member

Tirokk commented Oct 5, 2020

sushidave Authored by sushidave
Apr 25, 2020


🐛 Bugreport

At account creation, usernames are not encoded correctly if the profile name contains one or more German umlauts:
Ä/ä -> a
Ö/ö -> o
Ü/ü -> u

Examples:

  • Sandra Märtens -> sandra-martens
  • Oliver Stöckel -> oliver-stockel
  • Günter Hürst -> gunter-hurst

Steps to reproduce the behavior

  1. Create a new account using one or more German umlauts in the profile name.
  2. Check the username.

Expected behavior

German umlauts should be encoded like this:
Ä/ä -> ae
Ö/ö -> oe
Ü/ü -> ue

Examples:

  • Sandra Märtens -> sandra-maertens
  • Oliver Stöckel -> oliver-stoeckel
  • Günter Hürst -> guenter-huerst
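
The expected mapping above could be sketched roughly as follows. This is only an illustration, not the project's actual slug code; the names `GERMAN_UMLAUTS` and `slugifyGerman` are hypothetical.

```javascript
// Hypothetical helper, not taken from the project's code base.
const GERMAN_UMLAUTS = {
  'ä': 'ae', 'ö': 'oe', 'ü': 'ue',
  'Ä': 'Ae', 'Ö': 'Oe', 'Ü': 'Ue',
};

function slugifyGerman(name) {
  return name
    .normalize('NFC')                                  // compose "u" + U+0308 into a single "ü"
    .replace(/[äöüÄÖÜ]/g, (ch) => GERMAN_UMLAUTS[ch])  // transliterate the umlauts
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')                       // collapse every other run of characters into one dash
    .replace(/^-+|-+$/g, '');                          // trim leading and trailing dashes
}

console.log(slugifyGerman('Sandra Märtens')); // sandra-maertens
console.log(slugifyGerman('Oliver Stöckel')); // oliver-stoeckel
console.log(slugifyGerman('Günter Hürst'));   // guenter-huerst
```

The umlauts have to be replaced before any generic diacritic stripping runs, otherwise ü has already become u by the time the table is consulted.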

Version & Environment

Type: [any]

  • OS: [any]
  • Browser: [any]
  • Version [any]
  • Device: [any]
@Tirokk
Member Author

Tirokk commented Oct 7, 2020

ATOMktn Authored by ATOMktn
Jun 18, 2020


Hi! I'm new to open source and would like to take a crack at this!

@Tirokk
Member Author

Tirokk commented Oct 7, 2020

sushidave Authored by sushidave
Jun 18, 2020


Hello @ATOMktn and welcome on board! Thank you for your interest in this project. @Tirokk Please provide ATOMktn with further info. Thanks!

@Tirokk
Member Author

Tirokk commented Oct 7, 2020

Tirokk Authored by Tirokk
Jun 22, 2020


Hello @ATOMktn ,

I have invited you to our repository as a volunteer via an e-mail invitation, which you can accept.
After you have accepted, please push your feature branch directly to this repo and create a PR using master as the base branch.

I assigned you to this issue …

Our conversation takes place on our Discord channels:
http://hc.world/discord
Another way to get there:
https://discord.com/invite/DFSjPaX

On Discord we hold a Daily Standup almost every day at 09:30 UTC (11:30 CEST) in the Conference Room.

@ulfgebhardt
Member

#3676

@Brandon-G-Tripp
Contributor

@Tirokk Hey Wolle. I want to discuss taking on this issue. In discord you said we should chat about how to go about this. I have the app up and running again. Somehow it got corrupted and I was missing some modules so I cloned the app down again.

@Tirokk
Member Author

Tirokk commented Dec 12, 2020

Hey @Brandon-G-Tripp ,

we already have PR #3676 for this issue, done by @ATOMktn, which you should take over.
But for me this solution was too specific to German.

I would like to have a very general regex solution for all languages and all characters. I mean a regex where every non-ASCII character gets converted to an ASCII string.

Please investigate briefly, and if you don’t find anything, let me know. I think I saw a solution somewhere last year.
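
One common way to approximate such a general conversion is Unicode decomposition followed by removal of combining marks. The sketch below is an assumption about what that could look like, not the solution referred to above; `stripDiacritics` and `slugifyGeneric` are hypothetical names.

```javascript
// Hypothetical helpers illustrating a language-agnostic approach.
function stripDiacritics(text) {
  return text
    .normalize('NFD')          // decompose, e.g. "ü" becomes "u" + U+0308
    .replace(/\p{M}/gu, '');   // remove all combining marks
}

function slugifyGeneric(text) {
  return stripDiacritics(text)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')  // anything still outside a-z0-9 becomes a dash
    .replace(/^-+|-+$/g, '');     // trim leading and trailing dashes
}

console.log(slugifyGeneric('ççç ñññ'));      // ccc-nnn
console.log(slugifyGeneric('Günter Hürst')); // gunter-hurst
```

This handles accented letters from many languages, but it yields u rather than ue for ü, and characters without a decomposition (such as ß) simply fall through to the dash replacement. On its own it therefore does not produce the expected behavior from the report.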

@Brandon-G-Tripp
Contributor

@Tirokk I found some ways of replacing characters that carry diacritics with their plain keyboard equivalents, but none that change ü -> ue and so on. I will take a look at @ATOMktn's pull request and see if I can combine the two, so that we have the correct handling of the umlauts and then just remove the diacritics for other languages.

@ulfgebhardt ulfgebhardt added this to the 20/12 December milestone Dec 14, 2020
@Tirokk
Member Author

Tirokk commented Dec 14, 2020

I have investigated a bit and couldn’t find what I thought I had seen.
Yeah, just have a table of umlauts and similar (Unicode) characters to replace, and delete the rest or replace them with underscores or dashes. @Brandon-G-Tripp 👍🏼

PS: In case this could be of some help:
https://javascript.info/regexp-unicode
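
A minimal sketch of the table approach described in this comment, under the assumption that unmapped characters first lose their diacritics and whatever remains is replaced by dashes. The table contents and the name `slugify` are hypothetical and would need to be extended for other languages.

```javascript
// Hypothetical replacement table; only a few entries are shown.
const REPLACEMENTS = new Map([
  ['ä', 'ae'], ['ö', 'oe'], ['ü', 'ue'], ['ß', 'ss'],
]);

function slugify(text) {
  // 1. Apply the table, character by character, on the composed lowercase form.
  const mapped = Array.from(text.normalize('NFC').toLowerCase())
    .map((ch) => REPLACEMENTS.get(ch) || ch)
    .join('');

  // 2. Strip remaining diacritics, then replace everything else with dashes.
  return mapped
    .normalize('NFD')
    .replace(/\p{M}/gu, '')
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

console.log(slugify('Günter Hürst')); // guenter-huerst
console.log(slugify('ççç ñññ'));      // ccc-nnn
```

Running the table before the generic stripping is the design choice that matters here: it keeps the German-specific output while other accented characters still degrade to their base letters.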

@Brandon-G-Tripp
Contributor

@Tirokk Hey I see that the previous work on this issue was just doing a charmap for the characters. Would you prefer this in regex or should I just expand the work there?

@Mogge
Contributor

Mogge commented Jan 11, 2021

I have this post title Tösting ße ßlüg ççç ñññ ÄÖÜ and I get the slug: tosting-sse-sslug-ccc-nnn-aou

Brandon-G-Tripp added a commit that referenced this issue Jan 21, 2021
…ut-slug

[WIP] fix: 🍰 Issue #3504 umlaut encoding #3676