P2314 Character sets and encodings #998
This paper will be discussed in SG16's telecon scheduled for 2021-02-24.
P2314R1 Character sets and encodings (Jens Maurer)
SG16 discussed this paper and a competing paper, P2297, during our February 24th, 2021, March 10th, 2021, and March 24th, 2021 telecons. The following polls were conducted:
March 10th, 2021:
Poll: Introduce the concept of a 'translation character set' which synthesizes characters for unassigned UCS scalar values.
March 24th, 2021:
Poll: Introduce the concept of a 'translation character set' which synthesizes characters for unassigned UCS scalar values.
Poll: Forward D2314R2 as presented on 2021-03-24 to EWG for inclusion in C++23.
Eight of the nine attendees were common across all of the polls. Note that the first poll conducted during each of the March 10th and March 24th telecons asks the same question. The later poll suggests that consensus on the question shifted, but I don't believe that is really the case. Rather, I believe that consensus for what the paper achieves is unanimous in SG16, but disagreement over wording details remains, and the shift suggested by the polls is due more to a desire not to hold up progress of the paper over those details than to any change of opinion. Since the disagreements concern wording details that do not impact intended behavior (more on that below) and since wording details are more of a CWG concern than an SG16 concern, I'm comfortable forwarding this paper.

The wording disagreement concerns the introduction of the "translation character set" abstraction and perceived conflicts with Unicode terminology in its definition (the materialization of a "character" for unassigned UCS scalar values). Some SG16 participants would prefer that translation be specified directly in terms of ISO 10646 UCS scalar values, both because that would avoid introducing the "translation character set" abstraction and because it more closely matches how real-world implementations operate; compilers recognize characters by associated integer values. Proponents of the "translation character set" abstraction preferred that translation be specified in terms of "characters" rather than the integer values denoted by UCS scalar values, since the former more closely matches how people think about lexing behavior (we tend to think about lexing the character "A" as opposed to the integer value encoding of the character "A"). Proponents on both sides of the disagreement agreed that the desired behavior can be expressed either way; the concern is strictly about presentation within the standard.

Removing the SG16 label; this paper is ready for EWG review.
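To illustrate why both sides agreed the behavior can be expressed either way, here is a minimal sketch, not taken from the paper's wording. It assumes a "translation character" can be modeled as a thin wrapper around a UCS scalar value (a code point in the range 0 to 0x10FFFF, excluding the surrogates 0xD800 to 0xDFFF). The names `is_ucs_scalar_value`, `translation_character`, `is_letter_a_by_value`, and `is_letter_a_by_character` are hypothetical and exist only for this example.

```cpp
#include <cstdint>

// Hypothetical helper (not from P2314): true if v is a UCS scalar value,
// i.e. a code point in [0, 0x10FFFF] that is not a surrogate.
constexpr bool is_ucs_scalar_value(std::uint32_t v) {
    return v <= 0x10FFFF && !(v >= 0xD800 && v <= 0xDFFF);
}

// Hypothetical model of the "character" view: one distinct element per
// scalar value, whether or not a named character is assigned to it.
struct translation_character {
    std::uint32_t scalar_value;
};

constexpr bool operator==(translation_character a, translation_character b) {
    return a.scalar_value == b.scalar_value;
}

// "Integer" view: the lexer compares scalar values directly.
constexpr bool is_letter_a_by_value(std::uint32_t v) {
    return v == 0x41;  // U+0041 LATIN CAPITAL LETTER A
}

// "Character" view: the lexer compares abstract characters.
constexpr bool is_letter_a_by_character(translation_character c) {
    return c == translation_character{0x41};
}

// Under this model the two views always give the same answer.
static_assert(is_letter_a_by_value(0x41) ==
                  is_letter_a_by_character(translation_character{0x41}),
              "both views agree on 'A'");
static_assert(is_letter_a_by_value(0x42) ==
                  is_letter_a_by_character(translation_character{0x42}),
              "both views agree on a non-'A' value");
```

Because the wrapper is a one-to-one mapping onto scalar values in this sketch, choosing between the two views changes only how the rules read, not what a lexer accepts.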
(SG22 administrivia note: this paper needs to be added to the next WG14 omnibus paper in May)
Seen at the 2021-05-06 EWG telecon.
POLL: Send P2314 to Electronic Polling, with the intent of going to Core for C++23.
P2314R2 Character sets and encodings (Jens Maurer)
EWG poll results from https://wg21.link/P1018R13
Poll votes:
This paper was discussed at the Sep 2021 SG22 meeting and does not need to be seen by SG22 again.
CWG 2021-09-14: Review started; will continue next time.
The library changes (completely minor) were reviewed and approved at the 2021-09-17 telecon. Removed the LWG label.
https://wiki.edg.com/bin/view/Wg21telecons2021/P2314-20210917
Poll: Library approves of the changes to the Library clauses (16, 27, and 28)
P2314R3 Character sets and encodings (Jens Maurer)
P2314R4 Character sets and encodings (Jens Maurer)
Adopted 2021-10.
P2314R0 Character sets and encodings (Jens Maurer)