Char.toUpper('ß') returns two characters as Char #1001
Comments
Comparing to other implementations, in Haskell |
It seems that German speakers had and solved the same problem.
I think Elm can |
That would be incompatible with the Unicode standard, so if Elm wants to follow the standard it can't do this. Also, ß isn't the only Unicode character for which case conversion produces more than one character; there are well over 100 such characters. "Fixing" this one case in a way that is not compatible with Unicode would be wrong in my opinion, as it wouldn't fix the whole problem and would also make Elm incompatible with Unicode. (I updated the OP with a comment that ß isn't the only such character.) |
Then it seems that we need to accept that casing cannot be
|
If it shouldn't be |
It might be best to keep the existing |
Char is defined to be a single Unicode character, and Char.toUpper is defined as returning a single Char.

But Char.toUpper('ß') returns two Unicode characters as a single Char. While the returned value is correct according to the Unicode specification ('ß' is uppercased to two characters), it is not correct according to the Elm specification of Char and Char.toUpper.

UPDATE: There are also other such characters where case conversion results in a different number of characters. For example, Char.toUpper('\u{FB02}') returns 'FL' : Char. The full list seems to be available at ftp://ftp.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt
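To make the Unicode side of this concrete, here is a minimal sketch in Python rather than Elm, since Python's built-in str.upper applies the full Unicode case mappings. It lists the code points whose uppercase form is longer than one code point, and shows one possible way to keep a single-character contract: return nothing when the mapping expands. The names expanding_uppercase and to_upper_single are hypothetical, introduced only for illustration. They are not part of Elm or of any proposal in this thread.

```python
import sys
from typing import Dict, Optional


def expanding_uppercase(limit: int = sys.maxunicode + 1) -> Dict[str, str]:
    """Collect code points below `limit` whose uppercase form has more than one code point."""
    result = {}
    for cp in range(limit):
        ch = chr(cp)
        upper = ch.upper()  # full Unicode case mapping, may return several code points
        if len(upper) > 1:
            result[ch] = upper
    return result


def to_upper_single(ch: str) -> Optional[str]:
    """Hypothetical helper: uppercase `ch` only if the result is still one code point."""
    upper = ch.upper()
    return upper if len(upper) == 1 else None


expansions = expanding_uppercase()
print(expansions["ß"])       # SS
print(expansions["\ufb02"])  # FL
print(to_upper_single("a"))  # A
print(to_upper_single("ß"))  # None
```

The size of the expansions table depends on the Unicode version bundled with the interpreter, so the sketch does not hard-code a count. Whether a single-character function should return nothing, return the input unchanged, or be replaced by a string-returning function is a design decision the sketch leaves open.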