Clarify documentation on char::MAX #93392

Merged: 2 commits, Jan 31, 2022

@GKFX (Contributor) commented Jan 27, 2022

As mentioned in #91836 (comment), the documentation on char::MAX is not quite correct – USVs are not "only ones within a certain range", they are code points outside a certain range. I have corrected this and given the actual numbers as there is no reason to hide them.

@rust-highfive (Collaborator)

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @scottmcm (or someone else) soon.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 27, 2022

@scottmcm (Member) left a comment

Hmm, I think part of the problem is that this is the wrong place to be having a conversation about USVs/UCPs. How about just removing that from the documentation of this const, since it's already there at the top of the documentation for char https://doc.rust-lang.org/std/primitive.char.html?

Here's a first stab towards that direction, feel free to take directly or make your own:

/// The highest possible value a `char` can have, `'\u{10FFFF}'`.
///
/// # Examples
///
/// ```
/// # fn something_which_returns_char() -> char { 'a' }
/// let c: char = something_which_returns_char();
/// assert_eq!(c.max(char::MAX), char::MAX);
///
/// let value_after_max = char::MAX as u32 + 1;
/// assert_eq!(char::from_u32(value_after_max), None);
/// ```

And then maybe another PR could refactor the documentation on the primitive to have more about USV vs UCP vs grapheme vs whatever, as well as mention validity invariants, exhaustive matching on char, ...
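
The "exhaustive matching on char" point can be illustrated with a small sketch. The function name `is_above_surrogates` is hypothetical and not part of this PR; the only claim is that the two ranges below cover every `char`, so the compiler accepts the match without a wildcard arm.

```rust
/// Hypothetical helper: reports which of the two valid ranges `c` falls in.
fn is_above_surrogates(c: char) -> bool {
    // No `_` arm is needed: these two ranges cover every valid `char`,
    // because the surrogate gap U+D800..=U+DFFF can never be a `char`.
    match c {
        '\0'..='\u{D7FF}' => false,
        '\u{E000}'..='\u{10FFFF}' => true,
    }
}
```

Removing either arm should make the match non-exhaustive and fail to compile, which is one way to see that the validity invariant is known to the compiler.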

@GKFX (Contributor, Author) commented Jan 27, 2022

It's an unusual constant because, for all numeric types, every value in the range type::MIN..=type::MAX is a valid value of that type. char breaks that expectation, which I think is what brings the conversation in there. Certainly I think you could drop code points from this section and cut it to just stating the permitted range of values with:

The highest possible value a char can have. As a Unicode Scalar Value, a char may be any value in the ranges '\0' to '\u{D7FF}' and '\u{E000}' to '\u{10FFFF}' inclusive.

I like your examples although I would phrase the first assertion as assert!(c <= char::MAX); which means the same thing but is shorter.
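
A minimal sketch of the gap described in this comment, using only `char::from_u32` and a range over `char` from the standard library. All helper names are made up for illustration and do not appear in the PR.

```rust
/// Hypothetical helper: true if `v` is a valid `char` (a Unicode scalar value).
fn is_scalar_value(v: u32) -> bool {
    char::from_u32(v).is_some()
}

/// Hypothetical helper: how many `char` values exist from '\0' to `char::MAX`.
fn count_chars() -> usize {
    ('\0'..=char::MAX).count()
}

/// Hypothetical helper: how many `u32` values lie in `0..=char::MAX as u32`.
fn count_u32_up_to_max() -> usize {
    char::MAX as u32 as usize + 1
}

/// Checks the boundaries of the two valid ranges quoted above.
fn check_boundaries() {
    assert!(is_scalar_value(0x0000));
    assert!(is_scalar_value(0xD7FF));
    assert!(!is_scalar_value(0xD800)); // first surrogate
    assert!(!is_scalar_value(0xDFFF)); // last surrogate
    assert!(is_scalar_value(0xE000));
    assert!(is_scalar_value(0x10FFFF));
    assert!(!is_scalar_value(0x110000)); // one past char::MAX

    // The shorter assertion suggested above holds for any char.
    assert!('a' <= char::MAX);

    // Unlike a numeric type, the char range skips the 0x800 surrogate values.
    assert_eq!(count_u32_up_to_max() - count_chars(), 0x800);
}
```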

@scottmcm (Member)

> a char may be any value in the ranges

I just think it's better to have that be a section on char, rather than saying it here, on char::from_u32, etc.

> which means the same thing but is shorter

Good point; that's much better.

@GKFX (Contributor, Author) commented Jan 30, 2022

I've updated the documentation pretty much as in your version; I'll make a new PR for the documentation directly on char.

@scottmcm (Member)

Thanks for making these PRs!

@bors r+ rollup=always

@bors (Contributor) commented Jan 31, 2022

📌 Commit 9aaf52b has been approved by scottmcm

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 31, 2022
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 31, 2022
…askrgr

Rollup of 8 pull requests

Successful merges:

 - rust-lang#90277 (Improve terminology around "after typeck")
 - rust-lang#92918 (Allow eliding GATs in expression position)
 - rust-lang#93039 (Don't suggest inaccessible fields)
 - rust-lang#93155 (Switch pretty printer to block-based indentation)
 - rust-lang#93214 (Respect doc(hidden) when suggesting available fields)
 - rust-lang#93347 (Make `char::DecodeUtf16::size_hist` more precise)
 - rust-lang#93392 (Clarify documentation on char::MAX)
 - rust-lang#93444 (Fix some CSS warnings and errors from VS Code)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit c03bf54 into rust-lang:master Jan 31, 2022
@rustbot rustbot added this to the 1.60.0 milestone Jan 31, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 2, 2022
Document valid values of the char type

As discussed at rust-lang#93392, the current documentation on what constitutes a valid char isn't very detailed and is partly on the MAX constant rather than the type itself.

This PR expands on that information, stating the actual numerical range, giving examples of what won't work, and mentioning how a `char` might be a valid USV but still not be a defined character (terminology checked against [Unicode 14.0, table 2-3](https://www.unicode.org/versions/Unicode14.0.0/ch02.pdf#M9.61673.TableTitle.Table.22.Types.of.Code.Points)).
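
As a sketch of that last point: Unicode designates U+FFFE and U+FFFF (among others) as noncharacters, yet both are scalar values and therefore valid `char`s, as is `char::MAX` itself. The helper names below are hypothetical, and whether the follow-up PR uses this particular example is not stated here.

```rust
/// Hypothetical helper: true if the code point converts to a `char`.
fn converts_to_char(code_point: u32) -> bool {
    char::from_u32(code_point).is_some()
}

/// Shows that being a valid `char` does not imply being an assigned character.
fn check_noncharacters() {
    // U+FFFE and U+FFFF are Unicode noncharacters, but still scalar values.
    assert!(converts_to_char(0xFFFE));
    assert!(converts_to_char(0xFFFF));
    // char::MAX (U+10FFFF) is itself a noncharacter.
    assert_eq!(char::from_u32(0x10FFFF), Some(char::MAX));
    // A surrogate, by contrast, is a code point but not a scalar value.
    assert!(!converts_to_char(0xDC00));
}
```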
@GKFX GKFX deleted the char-docs branch January 2, 2024 22:53