Correct code point format in Base/Char/show function #33291
Two minor changes (both on line 307) to conform to the Unicode Standard.

Unicode code points currently display with:

1. Lowercase letters, a - f, when present
2. A leading 0 for 5-digit code point values (i.e. 10000 - 9ffff)

However, the Unicode Standard specifies that when using the "U+" notation, you should use:

1. Uppercase letters
2. Leading zeros only when the code point would have fewer than four digits (i.e. 0000 - 0FFF)

For reference, the Unicode Standard (two versions to show consistency over time)

* [(Version 12.1, 2019) Appendix A: Notational Conventions ⇒ Code Points](http://www.unicode.org/versions/Unicode12.0.0/appA.pdf)
* [(Version 4.0.0, 2003) Preface: Notational Conventions ⇒ Code Points](http://www.unicode.org/versions/Unicode4.0.0/Preface.pdf)

states:

> In running text, an individual Unicode code point is expressed as U+n, where n is four to six hexadecimal digits, using the digits 0–9 and uppercase letters A–F (for 10 through 15, respectively). Leading zeros are omitted, unless the code point would have fewer than four hexadecimal digits—for example, U+0001, U+0012, U+0123, U+1234, U+12345, U+102345.
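
As a rough illustration of the rule quoted above, here is a minimal sketch in Julia. The helper name `format_codepoint` is hypothetical and is not the code in Base touched by this PR; it only shows one way to get uppercase hex digits padded to at least four places.

```julia
# Hypothetical helper (not part of Base): formats a character's code point
# in "U+" notation as described in the Unicode Standard quote above.
function format_codepoint(c::Char)
    u = codepoint(c)                      # numeric code point as UInt32
    hex = string(u, base = 16, pad = 4)   # lowercase hex, zero-padded to 4 digits
    return "U+" * uppercase(hex)          # the Standard asks for uppercase A-F
end

format_codepoint('α')        # "U+03B1"
format_codepoint('🐨')       # "U+1F428"
format_codepoint('\U102345') # "U+102345"
```

Padding to a minimum of four digits, rather than a fixed six, is what avoids the extra leading 0 on 5-digit values.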
Looks good to me, but could use a test.
Hi @stevengj. Fair enough, though I am not sure how formal of a test you are requesting. I do not currently have the ability to compile Julia. However, I did verify the syntax using the Julia command-line (i.e. julia.exe) as shown below. Does this suffice for a trivial change such as this one?

Old Syntax

New Syntax
Basically we would want something like

    @test repr("text/plain", 'α') == "'α': Unicode U+03B1 (category Ll: Letter, lowercase)"
    @test repr("text/plain", '🐨') == "'🐨': Unicode U+1F428 (category So: Symbol, other)"

in You can just edit
Hi @stevengj. Test file has been updated and added to this PR as requested. Please let me know if there is anything I need to change regarding the tests. I just added a new testset to the end of that file.
This could use a NEWS entry since it's an observable behavior change.
Fixed in 64d8ca4