From 317860fd424571b335342ab41b1125f933ad1e1a Mon Sep 17 00:00:00 2001
From: bizzy
Date: Mon, 12 Dec 2022 13:26:08 +0000
Subject: [PATCH 1/6] improve str.encode and byte.decode

---
 Doc/library/stdtypes.rst | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index 73debe5ceeaf3a..e772e764335cc3 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -1625,13 +1625,15 @@ expression support in the :mod:`re` module).
 .. method:: str.encode(encoding="utf-8", errors="strict")
 
    Return an encoded version of the string as a bytes object. Default encoding
-   is ``'utf-8'``. *errors* may be given to set a different error handling scheme.
-   The default for *errors* is ``'strict'``, meaning that encoding errors raise
-   a :exc:`UnicodeError`. Other possible
-   values are ``'ignore'``, ``'replace'``, ``'xmlcharrefreplace'``,
-   ``'backslashreplace'`` and any other name registered via
-   :func:`codecs.register_error`, see section :ref:`error-handlers`. For a
-   list of possible encodings, see section :ref:`standard-encodings`.
+   is ``'utf-8'``. And for a list of possible encodings, see section
+   :ref:`standard-encodings`.
+   
+   *errors* may be given to set a different error handling scheme. The
+   default for *errors* is ``'strict'``, meaning that encoding errors raise
+   a :exc:`UnicodeError`. Other possible values are ``'ignore'``,
+   ``'replace'``, ``'xmlcharrefreplace'``, ``'backslashreplace'`` and any
+   other name registered via :func:`codecs.register_error`, see section
+   :ref:`error-handlers`.
 
    By default, the *errors* argument is not checked for best performances, but
    only used at the first encoding error. Enable the :ref:`Python Development
@@ -2760,12 +2762,14 @@ arbitrary binary data.
             bytearray.decode(encoding="utf-8", errors="strict")
 
    Return a string decoded from the given bytes. Default encoding is
-   ``'utf-8'``.
*errors* may be given to set a different
-   error handling scheme. The default for *errors* is ``'strict'``, meaning
-   that encoding errors raise a :exc:`UnicodeError`. Other possible values are
-   ``'ignore'``, ``'replace'`` and any other name registered via
-   :func:`codecs.register_error`, see section :ref:`error-handlers`. For a
-   list of possible encodings, see section :ref:`standard-encodings`.
+   ``'utf-8'``. And for a list of possible encodings, see section
+   :ref:`standard-encodings`.
+   
+   *errors* may be given to set a different error handling scheme. The
+   default for *errors* is ``'strict'``, meaning that encoding errors raise a
+   :exc:`UnicodeError`. Other possible values are ``'ignore'``, ``'replace'``
+   and any other name registered via :func:`codecs.register_error`, see
+   section :ref:`error-handlers`.
 
    By default, the *errors* argument is not checked for best performances, but
    only used at the first decoding error. Enable the :ref:`Python Development

From 86c44a4634059360fbd52a41ddb872cf874a9cd6 Mon Sep 17 00:00:00 2001
From: bizzy
Date: Mon, 12 Dec 2022 19:00:44 +0000
Subject: [PATCH 2/6] remove trailing whitespace (to pass workflow)

---
 Doc/library/stdtypes.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index e772e764335cc3..46c3cd2037f002 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -1627,7 +1627,7 @@ expression support in the :mod:`re` module).
    Return an encoded version of the string as a bytes object. Default encoding
    is ``'utf-8'``. And for a list of possible encodings, see section
    :ref:`standard-encodings`.
-   
+
    *errors* may be given to set a different error handling scheme. The
    default for *errors* is ``'strict'``, meaning that encoding errors raise
    a :exc:`UnicodeError`. Other possible values are ``'ignore'``,
@@ -2764,7 +2764,7 @@ arbitrary binary data.
    Return a string decoded from the given bytes. Default encoding is
    ``'utf-8'``.
And for a list of possible encodings, see section
    :ref:`standard-encodings`.
-   
+
    *errors* may be given to set a different error handling scheme. The
    default for *errors* is ``'strict'``, meaning that encoding errors raise a
    :exc:`UnicodeError`. Other possible values are ``'ignore'``, ``'replace'``

From 06aaf7b4ced80d10b8f3b82406c4b4d481bbc56f Mon Sep 17 00:00:00 2001
From: Bisola Olasehinde
Date: Sat, 17 Dec 2022 11:20:24 +0000
Subject: [PATCH 3/6] Apply suggestions from code review: make docs for str.encode and bytes.decode more clear

Co-authored-by: C.A.M. Gerlach
---
 Doc/library/stdtypes.rst | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index 46c3cd2037f002..2c8ce358ef4bbb 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -1624,16 +1624,16 @@ expression support in the :mod:`re` module).
 
 .. method:: str.encode(encoding="utf-8", errors="strict")
 
-   Return an encoded version of the string as a bytes object. Default encoding
-   is ``'utf-8'``. And for a list of possible encodings, see section
-   :ref:`standard-encodings`.
+   Return the string encoded to :class:`bytes`.
+   *encoding* defaults to ``'utf-8'``;
+   see :ref:`standard-encodings` for possible values.
 
-   *errors* may be given to set a different error handling scheme. The
-   default for *errors* is ``'strict'``, meaning that encoding errors raise
-   a :exc:`UnicodeError`. Other possible values are ``'ignore'``,
+   *errors* controls how encoding errors are handled.
+   If ``'strict'`` (the default), a :exc:`UnicodeError` exception is raised.
+   Other possible values are ``'ignore'``,
    ``'replace'``, ``'xmlcharrefreplace'``, ``'backslashreplace'`` and any
-   other name registered via :func:`codecs.register_error`, see section
-   :ref:`error-handlers`.
+   other name registered via :func:`codecs.register_error`.
+   See :ref:`error-handlers` for details.
 
    By default, the *errors* argument is not checked for best performances, but
    only used at the first encoding error. Enable the :ref:`Python Development
@@ -2761,15 +2761,15 @@ arbitrary binary data.
 .. method:: bytes.decode(encoding="utf-8", errors="strict")
             bytearray.decode(encoding="utf-8", errors="strict")
 
-   Return a string decoded from the given bytes. Default encoding is
-   ``'utf-8'``. And for a list of possible encodings, see section
-   :ref:`standard-encodings`.
+   Return the bytes decoded to a :class:`str`.
+   *encoding* defaults to ``'utf-8'``;
+   see :ref:`standard-encodings` for possible values.
 
-   *errors* may be given to set a different error handling scheme. The
-   default for *errors* is ``'strict'``, meaning that encoding errors raise a
-   :exc:`UnicodeError`. Other possible values are ``'ignore'``, ``'replace'``
-   and any other name registered via :func:`codecs.register_error`, see
-   section :ref:`error-handlers`.
+   *errors* controls how decoding errors are handled.
+   If ``'strict'`` (the default), a :exc:`UnicodeError` exception is raised.
+   Other possible values are ``'ignore'``, ``'replace'``,
+   and any other name registered via :func:`codecs.register_error`.
+   See :ref:`error-handlers` for details.
 
    By default, the *errors* argument is not checked for best performances, but
    only used at the first decoding error. Enable the :ref:`Python Development

From 4330d08074255a2e6a5e9d6dc83836447fd9c705 Mon Sep 17 00:00:00 2001
From: "C.A.M. Gerlach"
Date: Sat, 17 Dec 2022 05:52:24 -0600
Subject: [PATCH 4/6] Add back stripped newline seperating args in encode/decode

---
 Doc/library/stdtypes.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index 2c8ce358ef4bbb..b81269d3c161a9 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -1625,6 +1625,7 @@ expression support in the :mod:`re` module).
 ..
method:: str.encode(encoding="utf-8", errors="strict")
 
    Return the string encoded to :class:`bytes`.
+
    *encoding* defaults to ``'utf-8'``;
    see :ref:`standard-encodings` for possible values.
 
@@ -2762,6 +2763,7 @@ arbitrary binary data.
             bytearray.decode(encoding="utf-8", errors="strict")
 
    Return the bytes decoded to a :class:`str`.
+
    *encoding* defaults to ``'utf-8'``;
    see :ref:`standard-encodings` for possible values.
 

From dab7ee1df3f4b1365d410a815c349a448ce24897 Mon Sep 17 00:00:00 2001
From: "C.A.M. Gerlach"
Date: Sat, 17 Dec 2022 05:38:28 -0600
Subject: [PATCH 5/6] Clarify and refine wording of errors encode/decode arg desc

---
 Doc/library/stdtypes.rst | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index b81269d3c161a9..7e6cf41c5d9da4 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -1636,10 +1636,10 @@ expression support in the :mod:`re` module).
    other name registered via :func:`codecs.register_error`.
    See :ref:`error-handlers` for details.
 
-   By default, the *errors* argument is not checked for best performances, but
-   only used at the first encoding error. Enable the :ref:`Python Development
-   Mode `, or use a :ref:`debug build ` to check
-   *errors*.
+   For performance reasons, the value of *errors* is not checked for validity
+   unless an encoding error actually occurs,
+   :ref:`devmode` is enabled
+   or a :ref:`debug build ` is used.
 
    .. versionchanged:: 3.1
       Support for keyword arguments added.
@@ -2773,9 +2773,9 @@ arbitrary binary data.
    and any other name registered via :func:`codecs.register_error`.
    See :ref:`error-handlers` for details.
 
-   By default, the *errors* argument is not checked for best performances, but
-   only used at the first decoding error. Enable the :ref:`Python Development
-   Mode `, or use a :ref:`debug build ` to check *errors*.
+   For performance reasons, the value of *errors* is not checked for validity
+   unless a decoding error actually occurs,
+   :ref:`devmode` is enabled or a :ref:`debug build ` is used.
 
    .. note::
 

From 17e9503e0819e183069fd5b0af871e3530d9a97d Mon Sep 17 00:00:00 2001
From: "C.A.M. Gerlach"
Date: Sat, 17 Dec 2022 05:51:33 -0600
Subject: [PATCH 6/6] Refine text of notes and versionchanged for encode & decode

---
 Doc/library/stdtypes.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index 7e6cf41c5d9da4..d9f3f206777be2 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -1642,10 +1642,10 @@ expression support in the :mod:`re` module).
    or a :ref:`debug build ` is used.
 
    .. versionchanged:: 3.1
-      Support for keyword arguments added.
+      Added support for keyword arguments.
 
    .. versionchanged:: 3.9
-      The *errors* is now checked in development mode and
+      The value of the *errors* argument is now checked in :ref:`devmode` and
       in :ref:`debug mode `.
 
 
@@ -2781,13 +2781,13 @@ arbitrary binary data.
 
       Passing the *encoding* argument to :class:`str` allows decoding any
       :term:`bytes-like object` directly, without needing to make a temporary
-      bytes or bytearray object.
+      :class:`!bytes` or :class:`!bytearray` object.
 
    .. versionchanged:: 3.1
       Added support for keyword arguments.
 
    .. versionchanged:: 3.9
-      The *errors* is now checked in development mode and
+      The value of the *errors* argument is now checked in :ref:`devmode` and
       in :ref:`debug mode `.
 
 
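
Reviewer note (not part of the patch): the reworded text lists the built-in values for *errors* but gives no example. The sketch below may help check the wording against actual behavior. The sample string, the sample bytes, and the helper names ``try_encode`` and ``try_decode`` are illustrative assumptions, not taken from the patch.

```python
# Minimal sketch of how *errors* affects str.encode() and bytes.decode().
# The sample data and helper names are hypothetical, chosen for illustration.

text = "caf\u00e9"   # U+00E9 cannot be represented in ASCII
data = b"caf\xe9"    # Latin-1 bytes; the trailing 0xE9 is not valid UTF-8


def try_encode(s, errors):
    """Encode to ASCII, returning the exception name if encoding fails."""
    try:
        return s.encode("ascii", errors=errors)
    except UnicodeError as exc:
        return type(exc).__name__


def try_decode(b, errors):
    """Decode as UTF-8, returning the exception name if decoding fails."""
    try:
        return b.decode("utf-8", errors=errors)
    except UnicodeError as exc:
        return type(exc).__name__


encode_handlers = ("strict", "ignore", "replace",
                   "xmlcharrefreplace", "backslashreplace")
encoded = {name: try_encode(text, name) for name in encode_handlers}
decoded = {name: try_decode(data, name) for name in ("strict", "ignore", "replace")}

for name, result in encoded.items():
    print(f"encode errors={name!r}: {result!r}")
for name, result in decoded.items():
    print(f"decode errors={name!r}: {result!r}")

# The note in the second hunk: str() can decode a bytes-like object directly.
direct = str(memoryview(b"abc"), "utf-8")
print(direct)
```

With ``'strict'`` both helpers report a :exc:`UnicodeError` subclass, while the other handlers drop or substitute the offending character, which matches the behavior the new wording describes.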