Tracking Issue for const_unicode_case_lookup #101400

Open
1 of 3 tasks
mx00s opened this issue Sep 3, 2022 · 9 comments
Labels
A-unicode Area: Unicode C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@mx00s
Contributor

mx00s commented Sep 3, 2022

Feature gate: #![feature(const_unicode_case_lookup)]

This is a tracking issue for making char::is_lowercase and char::is_uppercase const.

Public API

Example:

const CAPITAL_DELTA_IS_UPPERCASE: bool = 'Δ'.is_uppercase();
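For illustration, here is a slightly fuller sketch of how the const-callable methods might be used in constant contexts. It assumes a toolchain on which `char::is_uppercase` and `char::is_lowercase` are already callable in const contexts (at the time of this issue that meant nightly with the feature gate above). `SMALL_DELTA_IS_LOWERCASE` and `is_cased` are hypothetical names introduced only for this example.

```rust
// Assumes char::is_uppercase / char::is_lowercase are const-callable on the
// toolchain in use. All names below are illustrative only.
const CAPITAL_DELTA_IS_UPPERCASE: bool = 'Δ'.is_uppercase();
const SMALL_DELTA_IS_LOWERCASE: bool = 'δ'.is_lowercase();

// A const fn can build on the const-callable methods.
const fn is_cased(c: char) -> bool {
    c.is_uppercase() || c.is_lowercase()
}

// Compile-time check: the build fails if the lookup gives an unexpected answer.
const _: () = assert!(CAPITAL_DELTA_IS_UPPERCASE && !is_cased('1'));

fn main() {
    println!("{CAPITAL_DELTA_IS_UPPERCASE} {SMALL_DELTA_IS_LOWERCASE}");
}
```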

Steps / History

Unresolved Questions

  • None yet.

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@mx00s mx00s added C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Sep 3, 2022
mx00s added a commit to mx00s/rust that referenced this issue Sep 4, 2022
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Sep 4, 2022
Make `char::is_lowercase` and `char::is_uppercase` const

Implements rust-lang#101400.
@workingjubilee workingjubilee added the A-unicode Area: Unicode label Jul 22, 2023
@DaniPopes
Contributor

DaniPopes commented Sep 28, 2023

This has been unstable for more than a year, and since there are no reported issues or blockers I think this can be stabilized.

The full API change consists of making char::{is_lowercase,is_uppercase} const-stable.

@dtolnay
Member

dtolnay commented Oct 2, 2023

From skimming the implementation PR, I am apprehensive about seeing it convert many Unicode tables from static to const. Consts are defined to behave as if their definition is copy-pasted at every use. Statics are not; they appear at most once, at a single address. See also #82676.

I would not want to regret being forced to define Unicode tables as const when we later find that static's semantics are more appropriate for them.

It would be good to get reassurance from a compiler perspective about whether committing to representing Unicode tables as const is going to be problematic.
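The difference between the two item kinds described above can be sketched as follows. The tables are hypothetical stand-ins, not the standard library's Unicode data, and the function names are made up for this example.

```rust
// Hypothetical tables for illustration; not the standard library's data.
static STATIC_TABLE: [u64; 4] = [1, 2, 3, 4];
const CONST_TABLE: [u64; 4] = [1, 2, 3, 4];

// Every mention of a static refers to the same memory location.
fn static_addr() -> *const [u64; 4] {
    &STATIC_TABLE
}

// Each use of a const behaves as if its initializer were written in place,
// so the language makes no promise about where (or how often) the data lives.
fn const_sum() -> u64 {
    let table = CONST_TABLE; // conceptually a fresh copy of the value
    table.iter().sum()
}

fn main() {
    // Guaranteed: a static has exactly one address.
    assert!(std::ptr::eq(static_addr(), &STATIC_TABLE));
    println!("sum = {}", const_sum());
}
```

No comparable address assertion is made for `CONST_TABLE`, since whether two uses of a const share storage is left to the compiler.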

@dtolnay dtolnay added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Oct 2, 2023
@RalfJung
Member

> From skimming the implementation PR, I am apprehensive about seeing it convert many Unicode tables from static to const. Consts are defined to behave as if their definition is copy-pasted at every use. Statics are not; they appear at most once, at a single address.

#131641 resolves that. So I think this is ready for const-stabilization?

@RalfJung RalfJung added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Oct 13, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Oct 13, 2024
…dtolnay

switch unicode-data bitsets back to 'static'

Back in rust-lang#101401, these were changed to `const` to make some functions `const fn`. However, `@dtolnay` was [not happy](rust-lang#101400 (comment)) about this. Meanwhile, `const fn` can access immutable statics like these, so we can change this back.

Part of rust-lang#101400.
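As a rough sketch of the point made in this commit message, a `const fn` can read from an immutable `static` table on a toolchain that permits references to statics in const contexts. The bitset and function below are hypothetical and much simpler than the real unicode-data tables.

```rust
// Hypothetical bitset for illustration; not the standard library's table.
// Bit n (counting across both words) is set when ASCII code point n is an
// uppercase letter, so bits 65..=90 ('A'..='Z') are set.
static ASCII_UPPER_BITSET: [u64; 2] = [0, 0x07FF_FFFE];

// A const fn reading an immutable static. This assumes a toolchain that
// allows const code to refer to statics.
const fn is_ascii_upper_via_table(c: char) -> bool {
    let cp = c as u32;
    if cp >= 128 {
        return false;
    }
    let word = ASCII_UPPER_BITSET[(cp / 64) as usize];
    (word >> (cp % 64)) & 1 == 1
}

// The lookup can be evaluated at compile time while the table stays a static.
const A_IS_UPPER: bool = is_ascii_upper_via_table('A');

fn main() {
    println!("{A_IS_UPPER}");
}
```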
@dtolnay dtolnay removed the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Oct 13, 2024
@rfcbot

rfcbot commented Oct 13, 2024

Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Oct 13, 2024
@dtolnay dtolnay removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Oct 13, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Oct 13, 2024
Rollup merge of rust-lang#131641 - RalfJung:unicode-bitset-static, r=dtolnay

@joshtriplett
Member

This looks great. Could we do the same for the other char methods, like is_digit and is_alphabetic?

@RalfJung
Member

RalfJung commented Nov 2, 2024

@Amanieu @BurntSushi @m-ou-se friendly FCP checkbox reminder. :) These functions have long been stable, so making them const-stable should be uncontroversial.

@RalfJung
Member

RalfJung commented Nov 2, 2024

> This looks great. Could we do the same for the other char methods, like is_digit and is_alphabetic?

is_digit is tracked at #132241. Looks like nobody tried making is_alphabetic const-callable yet.

EDIT: is_alphabetic is tricky since it needs quite a complicated function that currently uses a whole bunch of non-const machinery, such as binary search and iterators. So that one is out of reach currently.

EDIT2: The only other one we can currently make const is is_whitespace: #132500
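To illustrate the kind of machinery involved, here is a minimal sketch of a binary search written with a plain `while` loop so that it is usable in a `const fn`, instead of relying on slice search methods or iterators. The range table and the names `LETTER_RANGES` and `in_ranges` are hypothetical. This is not how the standard library implements `is_alphabetic`, and the thread does not settle whether such a rewrite would be acceptable there.

```rust
// Hypothetical sorted table of inclusive (start, end) code point ranges.
// Illustrative only; not the standard library's data or algorithm.
const LETTER_RANGES: [(u32, u32); 3] = [(0x41, 0x5A), (0x61, 0x7A), (0x391, 0x3A9)];

// Hand-written binary search using only constructs allowed in const fn.
const fn in_ranges(ranges: &[(u32, u32)], cp: u32) -> bool {
    let mut lo = 0;
    let mut hi = ranges.len();
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        let (start, end) = ranges[mid];
        if cp < start {
            hi = mid;
        } else if cp > end {
            lo = mid + 1;
        } else {
            return true;
        }
    }
    false
}

// Evaluated at compile time.
const DELTA_IN_TABLE: bool = in_ranges(&LETTER_RANGES, 'Δ' as u32);

fn main() {
    println!("{DELTA_IN_TABLE}");
}
```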

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Nov 2, 2024
@rfcbot

rfcbot commented Nov 2, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Nov 2, 2024
7 participants