Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: add core::ascii::Char::from_digit #118963

Closed
wants to merge 2 commits into from
Closed

Conversation

mina86
Copy link
Contributor

@mina86 mina86 commented Dec 15, 2023

Since core::ascii::Char::digit returns Some for decimal digits only,
introduce core::ascii::Char::from_digit method which handles values up
to 35. It mimics char::from_digit though it forgoes the radix
parameter.

Issue: rust-lang/libs-team#179
Issue: #110998

@rustbot
Copy link
Collaborator

rustbot commented Dec 15, 2023

r? @cuviper

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Dec 15, 2023
@mina86
Copy link
Contributor Author

mina86 commented Dec 15, 2023

cc @scottmcm

@rust-log-analyzer

This comment has been minimized.

Since core::ascii::Char::digit returns Some for decimal digits only,
introduce core::ascii::Char::from_digit method which handles values
up to 35.  It mimics char::from_digit though it forgoes the radix
parameter.

Issue: rust-lang/libs-team#179
Issue: rust-lang#110998
@@ -474,7 +474,8 @@ impl AsciiChar {
/// When passed the *number* `0`, `1`, …, `9`, returns the *character*
/// `'0'`, `'1'`, …, `'9'` respectively.
///
/// If `d >= 10`, returns `None`.
/// If `d >= 10`, returns `None`. To get a digit up to `d == 35`, use
/// [`from_digit`](Self::from_digit) instead.
#[unstable(feature = "ascii_char", issue = "110998")]
#[inline]
pub const fn digit(d: u8) -> Option<Self> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, that only reason I added this method was to use it for things like '0' + d in

radix! { Binary, 2, "0b", x @ 0 ..= 1 => b'0' + x }
radix! { Octal, 8, "0o", x @ 0 ..= 7 => b'0' + x }

But then it looks like it's not actually used. So it's possible that maybe it shouldn't have existed at all; I don't know.

(The idea was to give something to point to when people say "but '0' + d was so nice". But for hex there was never that shorthand, so having people use "…".as_ascii().unwrap()[d] might be fine for those non-decimal uses.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem at the moment is unwrap not being const, so one has to
write some annoying code:

    const DIGITS: &[AsciiChar; 16] = match "...".as_ascii() {
        Some(digits) => digits,
        None => panic!()
    };

It’s not the end of the world of course, but it’s certainly less
convenient that a AsciiChar::from_digit call.

Though I can see reasons to wait for const unwrap or a"..." syntax
(proposed in the tracking issue) before considering adding this
function.


By the way, maybe for b'0' + d the solution is to implement Add?
In debug builds it would panic on overflow while on release it would
wrap (i.e. mask out most significant bit).

#[inline]
pub const fn from_digit(d: u32) -> Option<Self> {
const DIGITS: [AsciiChar; 36] =
*b"0123456789abcdefghijklmnopqrstuvwxyz".as_ascii().unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I don't know how to solve here is that people might also want upper-case. Or base-64 style where a is zero. Why is lower-case special?

Oh, I guess char::from_digit does it this way. Fair enough, then, I guess.

/// ```
#[unstable(feature = "ascii_char", issue = "110998")]
#[inline]
pub const fn from_digit(d: u32) -> Option<Self> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is u32 the best option here? char takes u32, but char is a 4-byte type. Should this take u8 since it's a 1-byte type? Should digit also take u32, if u32 is somehow better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was motivated by char::from_digit. I really don’t know what makes more sense here.

pub const fn from_digit(d: u32) -> Option<Self> {
const DIGITS: [AsciiChar; 36] =
*b"0123456789abcdefghijklmnopqrstuvwxyz".as_ascii().unwrap();
if d < 36 { Some(DIGITS[d as usize]) } else { None }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this more efficient than simply DIGITS.get(d as usize)? Because to me it seems like there is a nother implicit range check in DIGITS[] that could be avoided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get isn’t const so it cannot be used here. The compiler should be smart enough to remove the second range check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

@cuviper
Copy link
Member

cuviper commented Jan 10, 2024

It mimics char::from_digit though it forgoes the radix parameter.

Why do you have this difference?

Anyway, between that and the choice of u8 or u32 input, I think this needs API eyes:
r? libs-api

@rustbot rustbot added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jan 10, 2024
@rustbot rustbot assigned dtolnay and unassigned cuviper Jan 10, 2024
@dtolnay dtolnay removed the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Jan 16, 2024
@dtolnay
Copy link
Member

dtolnay commented Jan 16, 2024

I have the same question as Josh. How about using the same signature as char::from_digit?

@dtolnay dtolnay added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 16, 2024
@bors
Copy link
Contributor

bors commented Jan 20, 2024

☔ The latest upstream changes (presumably #120157) made this pull request unmergeable. Please resolve the merge conflicts.

@rustbot rustbot added the has-merge-commits PR has merge commits, merge with caution. label Jan 20, 2024
@rustbot
Copy link
Collaborator

rustbot commented Jan 20, 2024

There are merge commits (commits with multiple parents) in your changes. We have a no merge policy so these commits will need to be removed for this pull request to be merged.

You can start a rebase with the following commands:

$ # rebase
$ git rebase -i master
$ # delete any merge commits in the editor that appears
$ git push --force-with-lease

The following commits are merge commits:

@mina86
Copy link
Contributor Author

mina86 commented Jan 20, 2024

It’s mostly because I think char::from_digit interface is over-engineered. To be honest I’m now of the opinion that as commented above it’s going to be better to implement Add.

@dtolnay
Copy link
Member

dtolnay commented Jan 21, 2024

Sounds good, I'll close this in favor of a different PR to pursue an Add approach.

@dtolnay dtolnay closed this Jan 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants