P2314R4 Character sets and encodings #5004

Merged: 4 commits, Oct 18, 2021
17 changes: 8 additions & 9 deletions source/basic.tex
@@ -2978,9 +2978,10 @@
\indextext{memory model|(}%
The fundamental storage unit in the \Cpp{} memory model is the
\defn{byte}.
-A byte is at least large enough to contain any member of the basic
-\indextext{character set!basic execution}%
-execution character set\iref{lex.charset}
+A byte is at least large enough to contain
+the ordinary literal encoding of any element of the basic
+\indextext{character set!basic literal}%
+literal character set\iref{lex.charset}
and the eight-bit code units of the Unicode
\begin{footnote}
Unicode\textregistered\ is a registered trademark of Unicode, Inc.
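
As an illustration only (not part of the change): the revised wording ties the minimum size of a byte to the ordinary literal encoding of the basic literal character set and to eight-bit UTF-8 code units. A minimal compile-time sketch of what that implies follows. The names `e_acute` and `utf8_unit` are hypothetical and exist only for this example.

```cpp
#include <climits>

// A byte has CHAR_BIT bits. Since a byte must hold any eight-bit
// UTF-8 code unit, CHAR_BIT cannot be smaller than 8.
static_assert(CHAR_BIT >= 8, "a byte must hold an eight-bit UTF-8 code unit");

// sizeof(char) is 1 by definition: a char object occupies exactly one byte.
static_assert(sizeof(char) == 1, "char is one byte");

// U+00E9 is encoded in UTF-8 as the two code units 0xC3 0xA9.
// `auto&` keeps this valid whether u8 literals have element type
// char (C++17) or char8_t (C++20 and later).
constexpr auto& e_acute = u8"\u00E9";

// Hypothetical helper: code unit i of e_acute, read as an unsigned byte value.
constexpr unsigned utf8_unit(int i) {
    return static_cast<unsigned char>(e_acute[i]);
}
```

Each code unit of the UTF-8 literal fits in one byte, so the array is three bytes long (two code units plus the terminating null).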
Expand Down Expand Up @@ -4880,8 +4881,6 @@
Type \keyword{char} is a distinct type
that has an \impldef{underlying type of \tcode{char}} choice of
``\tcode{\keyword{signed} \keyword{char}}'' or ``\tcode{\keyword{unsigned} \keyword{char}}'' as its underlying type.
-The values of type \keyword{char} can represent distinct codes
-for all members of the implementation's basic character set.
The three types \keyword{char}, \tcode{\keyword{signed} \keyword{char}}, and \tcode{\keyword{unsigned} \keyword{char}}
are collectively called
\defnadjx{ordinary character}{types}{type}.
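
As an illustration only (not part of the change): the retained wording says that `char` is a distinct type whose underlying type is an implementation-defined choice between `signed char` and `unsigned char`. A minimal sketch of what "distinct" means in practice follows. The overload set `which` and the constant `char_is_signed` are hypothetical names for this example.

```cpp
#include <climits>
#include <type_traits>

// char is a distinct type, even though its underlying type is one of
// the other two ordinary character types.
static_assert(!std::is_same<char, signed char>::value, "char is not signed char");
static_assert(!std::is_same<char, unsigned char>::value, "char is not unsigned char");

// Because the three types are distinct, they can be overloaded on.
// `which` is a hypothetical helper used only for this illustration.
constexpr int which(char)          { return 0; }
constexpr int which(signed char)   { return 1; }
constexpr int which(unsigned char) { return 2; }

// Whether char is signed is implementation-defined;
// CHAR_MIN is negative exactly when it is.
constexpr bool char_is_signed = std::is_signed<char>::value;
static_assert(char_is_signed == (CHAR_MIN < 0), "signedness matches CHAR_MIN");
```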
@@ -4942,10 +4941,10 @@

\pnum
\indextext{type!integral}%
-Types
-\keyword{bool},
-\keyword{char}, \keyword{wchar_t},
-\keyword{char8_t}, \keyword{char16_t}, \keyword{char32_t}, and
+The types \keyword{char}, \keyword{wchar_t},
+\keyword{char8_t}, \keyword{char16_t}, and \keyword{char32_t}
+are collectively called \defnadjx{character}{types}{type}.
+The character types, \keyword{bool},
the signed and unsigned integer types,
and cv-qualified versions\iref{basic.type.qualifier} thereof,
are collectively termed
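
As an illustration only (not part of the change): the new wording names five character types and then lists them, together with `bool` and the signed and unsigned integer types, as integral types. A minimal sketch of that classification follows. `is_character_type` is a hypothetical helper, not a standard trait, and the `char8_t` case is guarded because that type exists only from C++20 on.

```cpp
#include <type_traits>

// Hypothetical helper: true exactly for the types the new wording calls
// "character types". Note that signed char and unsigned char are not in
// this list; they are ordinary character types and integer types.
template <class T>
constexpr bool is_character_type() {
    return std::is_same<T, char>::value
        || std::is_same<T, wchar_t>::value
#ifdef __cpp_char8_t
        || std::is_same<T, char8_t>::value
#endif
        || std::is_same<T, char16_t>::value
        || std::is_same<T, char32_t>::value;
}

// Character types, bool, and cv-qualified versions thereof are integral types.
static_assert(std::is_integral<char>::value, "char is integral");
static_assert(std::is_integral<bool>::value, "bool is integral");
static_assert(std::is_integral<const volatile wchar_t>::value, "cv-qualified wchar_t is integral");
```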
4 changes: 2 additions & 2 deletions source/compatibility.tex
@@ -801,8 +801,8 @@
semantics in this revision of \Cpp{}. Implementations may choose to
translate trigraphs as specified in \CppXIV{} if they appear outside of a raw
string literal, as part of the \impldef{mapping from physical source file characters
-to basic source character set} mapping from physical source file characters to
-the basic source character set.
+to translation character set} mapping from physical source file characters to
+the translation character set.

\diffref{lex.ppnumber}
\change
4 changes: 2 additions & 2 deletions source/expressions.tex
@@ -1181,10 +1181,10 @@
\pnum
\indextext{literal}%
\indextext{constant}%
A \grammarterm{literal} is a primary expression.
The type of a \grammarterm{literal}
is determined based on its form as specified in \ref{lex.literal}.
-A \grammarterm{string-literal} is an lvalue,
+A \grammarterm{string-literal} is an lvalue
+designating a corresponding string literal object\iref{lex.string},
a \grammarterm{user-defined-literal}
has the same value category
as the corresponding operator call expression described in \ref{lex.ext},
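
As an illustration only (not part of the change): the paragraph assigns value categories to literals. A string-literal is an lvalue, and a user-defined-literal takes the value category of the corresponding operator call. A minimal sketch using `decltype((e))`, which yields an lvalue reference type exactly when `e` is an lvalue, follows. The literal operator `_ref` and the variable names are hypothetical.

```cpp
#include <type_traits>

// A string-literal is an lvalue: decltype(("abc")) is const char (&)[4].
constexpr bool string_literal_is_lvalue =
    std::is_lvalue_reference<decltype(("abc"))>::value;
static_assert(std::is_same<decltype(("abc")), const char (&)[4]>::value,
              "lvalue of type array of 4 const char");

// An integer literal is not an lvalue: decltype((42)) is plain int.
constexpr bool int_literal_is_lvalue =
    std::is_lvalue_reference<decltype((42))>::value;

// Hypothetical literal operator returning a reference. The call expression
// is an lvalue, so the user-defined-literal 1_ref is an lvalue as well.
int counter = 0;
int& operator""_ref(unsigned long long) { return counter; }
constexpr bool udl_ref_is_lvalue =
    std::is_lvalue_reference<decltype((1_ref))>::value;
```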
9 changes: 2 additions & 7 deletions source/intro.tex
@@ -420,13 +420,8 @@

\indexdefn{character!multibyte}%
\definition{multibyte character}{defns.multibyte}
-sequence of one or more bytes representing a member of the extended
-character set of either the source or the execution environment
-
-\begin{defnote}
-The extended character set is a superset of the basic character
-set\iref{lex.charset}.
-\end{defnote}
+sequence of one or more bytes representing
+the code unit sequence for an encoded character of the execution character set

\definition{NTCTS}{defns.ntcts}
\defncontext{library}
2 changes: 1 addition & 1 deletion source/iostreams.tex
@@ -13167,7 +13167,7 @@
for pathnames\iref{fs.class.path}.
The \defn{native encoding} for wide character strings is
the implementation-defined execution
-wide-character set encoding\iref{lex.charset}.
+wide-character set encoding\iref{character.seq}.

\pnum
For member function arguments that take character sequences representing