P2314 Character sets and encodings #998

wg21bot · 2021-02-22T10:44:10Z

P2314R0 Character sets and encodings (Jens Maurer)

tahonermann · 2021-02-22T15:34:03Z

This paper will be discussed in SG16's telecon scheduled for 2021-02-24.

wg21bot · 2021-03-25T06:37:01Z

P2314R1 Character sets and encodings (Jens Maurer)

tahonermann · 2021-03-26T05:12:47Z

SG16 discussed this paper and a competing paper, P2297, during our February 24th, March 10th, and March 24th, 2021 telecons. The following polls were conducted:

March 10th, 2021:

Poll: Introduce the concept of a 'translation character set' which synthesizes characters for unassigned UCS scalar values.

SF	F	N	A	SA
1	4	0	2	2

Attendance: 9
No consensus

March 24th, 2021:

Poll: Introduce the concept of a 'translation character set' which synthesizes characters for unassigned UCS scalar values.

SF	F	N	A	SA
2	4	1	0	1

Attendance: 9
Consensus in favor

Poll: Forward D2314R2 as presented on 2021-03-24 to EWG for inclusion in C++23.

SF	F	N	A	SA
3	5	0	0	0

Attendance: 9
Consensus in favor

Eight of the nine attendees were common across all of the polls.

Note that the first poll conducted at each of the March 10th and March 24th telecons asked the same question. The later result suggests that consensus on the question shifted, but I don't believe that is really the case. Rather, I believe that SG16 unanimously supports what the paper achieves, but disagreement over wording details remains, and the shift suggested by the polls reflects a desire not to hold up progress of the paper over those details more than a change of opinion. Since the disagreements concern wording details that do not impact intended behavior (more on that below), and since wording details are more of a CWG concern than an SG16 concern, I'm comfortable forwarding this paper.

The wording disagreement concerns the introduction of the "translation character set" abstraction and perceived conflicts with Unicode terminology in its definition (the materialization of a "character" for unassigned UCS scalar values). Some SG16 participants would prefer that translation be specified directly in terms of ISO 10646 UCS scalar values, both because that would avoid introducing the "translation character set" abstraction and because it more closely matches how real-world implementations operate: compilers recognize characters by their associated integer values. Proponents of the "translation character set" abstraction preferred that translation be specified in terms of "characters" rather than the integer values denoted by UCS scalar values, since the former more closely matches how people think about lexing behavior (we tend to think about lexing the character "A" as opposed to the integer value that encodes the character "A"). Participants on both sides of the disagreement agreed that the desired behavior can be expressed either way; the concern is strictly about presentation within the standard.
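To make the two views concrete, here is a minimal C++ sketch. It is illustrative only: the helper names are hypothetical and do not appear in P2314 or in the standard. It assumes only the ISO 10646 definition of a UCS scalar value (any code point up to U+10FFFF outside the surrogate range U+D800 to U+DFFF), and it deliberately does not consult whether a scalar value has an assigned character.

```cpp
// Hypothetical helpers for illustration; not taken from P2314 or the standard.

// A UCS scalar value is any code point in [0, 0x10FFFF] that is not a
// surrogate code point in [0xD800, 0xDFFF]. Whether a character is assigned
// to the value is not checked here.
constexpr bool is_ucs_scalar_value(char32_t c) {
    return c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF);
}

// "Integer value" view: recognize LATIN CAPITAL LETTER A by its scalar value.
constexpr bool is_capital_a_by_value(char32_t c) {
    return c == 0x41;
}

// "Character" view: recognize the same character by naming it.
constexpr bool is_capital_a_by_character(char32_t c) {
    return c == U'A';
}

// Both views accept and reject the same inputs.
static_assert(is_capital_a_by_value(U'A') == is_capital_a_by_character(U'A'),
              "the two views agree on 'A'");
static_assert(is_capital_a_by_value(U'B') == is_capital_a_by_character(U'B'),
              "the two views agree on 'B'");
```

The two predicates are interchangeable, which mirrors the point that the choice affects only how the rules are presented, not what they do.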

Removing the SG16 label; this paper is ready for EWG review.

AaronBallman · 2021-05-06T18:14:13Z

(SG22 administrivia note: this paper needs to be added to the next WG14 omnibus paper in May)

jfbastien · 2021-05-11T16:37:33Z

Seen at the 2021-05-06 EWG telecon.

POLL:

Send P2314 to Electronic Polling, with the intent of going to Core for C++23.

SF	F	N	A	SA
5	6	0	0	0

wg21bot · 2021-05-21T14:48:54Z

P2314R2 Character sets and encodings (Jens Maurer)

jfbastien · 2021-09-08T15:48:21Z

EWG poll results from https://wg21.link/P1018R13

🗳 Poll: Forward P2314r2 "Character sets and encodings" to Core for C++23.

Poll votes:

SF	F	N	A	SA
14	12	2	0	0

Poll outcome: ✅ consensus.

AaronBallman · 2021-09-14T12:01:40Z

This paper was discussed at the Sep 2021 SG22 meeting and does not need to be seen by SG22 again.

jensmaurer · 2021-09-14T19:54:26Z

CWG 2021-09-14: Review started; will continue next time.

JeffGarland · 2021-09-18T21:37:17Z

The library changes (completely minor) were reviewed and approved at the 2021-09-17 telecon. Removed the LWG label.

https://wiki.edg.com/bin/view/Wg21telecons2021/P2314-20210917

poll: Library approves of the changes to the Library clauses (16, 27, and 28)

F	A	N
10	0	0

wg21bot · 2021-09-20T06:50:26Z

P2314R3 Character sets and encodings (Jens Maurer)

wg21bot · 2021-10-26T07:13:56Z

P2314R4 Character sets and encodings (Jens Maurer)

wg21bot · 2021-10-26T07:45:23Z

Adopted 2021-10.

wg21bot added the EWG (Evolution) and SG16 (Text processing) labels Feb 22, 2021

wg21bot added this to the 2021-telecon milestone Feb 22, 2021

tahonermann mentioned this issue Feb 22, 2021

P2194 The character set of the internal representation should be Unicode #916

Closed

tahonermann removed the SG16 (Text processing) label Mar 26, 2021

jensmaurer mentioned this issue Mar 26, 2021

[lex.ccon] What is the single code unit for an ordinary character literal or wide character literal? CWG2779 cplusplus/draft#4517

Open

AaronBallman added the SG22 (C / C++ liaison) label May 6, 2021

jfbastien added the EWG-vote-on-me (EWG can vote on this) label May 11, 2021

jensmaurer added the LWG (Library) and lwg-pending (LWG Chair needs to disposition) labels May 11, 2021

jfbastien added the CWG (Core) label and removed the EWG (Evolution) and EWG-vote-on-me (EWG can vote on this) labels Sep 8, 2021

AaronBallman removed the SG22 (C / C++ liaison) label Sep 14, 2021

JeffGarland added the C++23 (Targeted at C++23) label Sep 18, 2021

JeffGarland removed the LWG (Library) and lwg-pending (LWG Chair needs to disposition) labels Sep 18, 2021

jensmaurer added the straw-poll (Straw poll) label Oct 6, 2021

jensmaurer mentioned this issue Oct 6, 2021

P2314R4 Character sets and encodings cplusplus/draft#5004

Merged

tkoeppe closed this as completed in cplusplus/draft#5004 Oct 18, 2021

jensmaurer added the plenary-approved (Papers approved for inclusion in their target vehicle by plenary vote) label and removed the straw-poll (Straw poll) label Oct 26, 2021

tahonermann mentioned this issue Nov 8, 2021

String literal concatenation in translation phase 5 and 6 contradicts [lex.string] sg16-unicode/sg16#47

Closed

jensmaurer added this to CWG Jul 15, 2024

jensmaurer moved this to Approved for plenary vote in CWG Jul 15, 2024
