From 362c4698ed3bab2ea11cc8da8ed849dee4395e3d Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Thu, 6 Feb 2025 23:44:04 +0800
Subject: [PATCH 01/11] Document the tail-calling interpreter

---
 Doc/using/configure.rst | 10 ++++++++++
 Doc/whatsnew/3.14.rst   | 21 +++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/Doc/using/configure.rst b/Doc/using/configure.rst
index 629859e36cb654..7d0d0236209310 100644
--- a/Doc/using/configure.rst
+++ b/Doc/using/configure.rst
@@ -618,6 +618,16 @@ also be used to improve performance.
    Enable computed gotos in evaluation loop (enabled by default on supported
    compilers).
 
+.. option:: --with-tail-calling-interp
+
+   Enable interpreters using tail calls in CPython. If enabled, enabling PGO
+   (:option:`--enable-optimizations`) is highly recommended. This option specifically
+   requires a C compiler with proper tail call support, and the
+   `preserve_none `_
+   calling convention.
+
+   .. versionadded:: 3.14
+
 .. option:: --without-mimalloc
 
    Disable the fast :ref:`mimalloc ` allocator
diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index 59c432d30a342b..195214c62bbda8 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -209,6 +209,27 @@ configuration mechanisms).
    :pep:`741`.
 
 
+A new tail-calling interpreter
+------------------------------
+
+A new type of interpreter based on tail calls has been added to CPython.
+For certain newer compilers. This interpreter provides
+significantly better performance. Preliminary numbers on our machines suggest
+anywhere from -3% to 40% faster Python code, and a geometric mean of 9-15%
+faster depending on platform and architecture.
+
+This interpreter currently only works with `clang-19` and newer
+on x86-64 and AArch64 architectures. However, we expect
+that a future release of GCC will support this as well.
+
+This feature is opt-in for now. Based on our own testing,
+this new interpreter only works well when profile-guided optimization is enabled.
+For further information on how to build Python, please see
+:option:`--with-tail-calling-interp`.
+
+(Contributed by Ken Jin in :gh:`128718`, with ideas on how to implement this
+in CPython by Mark Shannon, Garret Gu, Haoran Xu, and Josh Haberman.)
+
 Other language changes
 ======================
 

From 566ae0979d2773ab3563d5601ae1082e6ca93802 Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Thu, 6 Feb 2025 23:45:00 +0800
Subject: [PATCH 02/11] keep within 79

---
 Doc/whatsnew/3.14.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index 195214c62bbda8..fa154dfb56fe5b 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -223,8 +223,8 @@ on x86-64 and AArch64 architectures. However, we expect
 that a future release of GCC will support this as well.
 
 This feature is opt-in for now. Based on our own testing,
-this new interpreter only works well when profile-guided optimization is enabled.
-For further information on how to build Python, please see
+this new interpreter only works well when profile-guided optimization is
+enabled. For further information on how to build Python, please see
 :option:`--with-tail-calling-interp`.
 
 (Contributed by Ken Jin in :gh:`128718`, with ideas on how to implement this

From b463a49de8b973e90daccc7f6dcab7d83f6b943c Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Thu, 6 Feb 2025 23:45:52 +0800
Subject: [PATCH 03/11] Specify pyperformance

---
 Doc/whatsnew/3.14.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index fa154dfb56fe5b..18b6e580ed0a99 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -216,7 +216,7 @@ A new type of interpreter based on tail calls has been added to CPython.
 For certain newer compilers. This interpreter provides
 significantly better performance. Preliminary numbers on our machines suggest
 anywhere from -3% to 40% faster Python code, and a geometric mean of 9-15%
-faster depending on platform and architecture.
+faster on ``pyperformance`` depending on platform and architecture.
 
 This interpreter currently only works with `clang-19` and newer
 on x86-64 and AArch64 architectures. However, we expect

From 680d849986511f39c65f508d3ccfd1e91c8f189e Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Thu, 6 Feb 2025 23:47:06 +0800
Subject: [PATCH 04/11] grammar mistake

---
 Doc/whatsnew/3.14.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index 18b6e580ed0a99..a54e31f4494d8b 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -213,7 +213,7 @@ A new tail-calling interpreter
 ------------------------------
 
 A new type of interpreter based on tail calls has been added to CPython.
-For certain newer compilers. This interpreter provides
+For certain newer compilers, this interpreter provides
 significantly better performance. Preliminary numbers on our machines suggest
 anywhere from -3% to 40% faster Python code, and a geometric mean of 9-15%
 faster on ``pyperformance`` depending on platform and architecture.

From 913f8b450ecc3770bc652a1813abef587f60aea7 Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Thu, 6 Feb 2025 23:47:36 +0800
Subject: [PATCH 05/11] lint

---
 Doc/whatsnew/3.14.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index a54e31f4494d8b..542230c7271ac0 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -218,7 +218,7 @@ significantly better performance. Preliminary numbers on our machines suggest
 anywhere from -3% to 40% faster Python code, and a geometric mean of 9-15%
 faster on ``pyperformance`` depending on platform and architecture.
 
-This interpreter currently only works with `clang-19` and newer
+This interpreter currently only works with ``clang-19`` and newer
 on x86-64 and AArch64 architectures. However, we expect
 that a future release of GCC will support this as well.
 

From 1994d22d5b9fb5a284ed244201f3001dbb589c28 Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Fri, 7 Feb 2025 08:52:49 +0800
Subject: [PATCH 06/11] Apply review suggestions by Hugo

---
 Doc/using/configure.rst | 4 ++--
 Doc/whatsnew/3.14.rst   | 7 ++++---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/Doc/using/configure.rst b/Doc/using/configure.rst
index 7d0d0236209310..bf89013c28a6bb 100644
--- a/Doc/using/configure.rst
+++ b/Doc/using/configure.rst
@@ -624,9 +624,9 @@ also be used to improve performance.
    (:option:`--enable-optimizations`) is highly recommended. This option specifically
    requires a C compiler with proper tail call support, and the
    `preserve_none `_
-   calling convention.
+   calling convention. For example, Clang 19 and newer supports this feature.
 
-   .. versionadded:: 3.14
+   .. versionadded:: next
 
 .. option:: --without-mimalloc
 
diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index 542230c7271ac0..34dc5eb5aff74b 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -68,7 +68,7 @@ Summary -- release highlights
 * :ref:`PEP 649: deferred evaluation of annotations `
 * :ref:`PEP 741: Python Configuration C API `
 * :ref:`PEP 761: Discontinuation of PGP signatures `
-
+* :ref:`A new tail-calling interpreter `
 
 New features
 ============
@@ -208,6 +208,7 @@ configuration mechanisms).
 .. seealso::
    :pep:`741`.
 
+.. _whatsnew314-tail-call:
 
 A new tail-calling interpreter
 ------------------------------
@@ -215,10 +216,10 @@ A new tail-calling interpreter
 A new type of interpreter based on tail calls has been added to CPython.
 For certain newer compilers, this interpreter provides
 significantly better performance. Preliminary numbers on our machines suggest
-anywhere from -3% to 40% faster Python code, and a geometric mean of 9-15%
+anywhere from -3% to 30% faster Python code, and a geometric mean of 9-15%
 faster on ``pyperformance`` depending on platform and architecture.
 
-This interpreter currently only works with ``clang-19`` and newer
+This interpreter currently only works with Clang 19 and newer
 on x86-64 and AArch64 architectures. However, we expect
 that a future release of GCC will support this as well.
 

From 28e8590dc396d5298426a529c31785b3da03ccf6 Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Fri, 7 Feb 2025 08:56:33 +0800
Subject: [PATCH 07/11] Fix whitespace

---
 Doc/whatsnew/3.14.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index 34dc5eb5aff74b..a12ce633c92139 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -70,6 +70,7 @@ Summary -- release highlights
 * :ref:`PEP 761: Discontinuation of PGP signatures `
 * :ref:`A new tail-calling interpreter `
 
+
 New features
 ============
 
@@ -231,6 +232,7 @@ enabled. For further information on how to build Python, please see
 (Contributed by Ken Jin in :gh:`128718`, with ideas on how to implement this
 in CPython by Mark Shannon, Garret Gu, Haoran Xu, and Josh Haberman.)
 
+
 Other language changes
 ======================
 

From 1091fd750ad1341c28463e85a80fa123d8143b12 Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Fri, 7 Feb 2025 16:14:29 +0800
Subject: [PATCH 08/11] Improve wording by Shantanu

---
 Doc/whatsnew/3.14.rst | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index a12ce633c92139..e1c115fc19cc27 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -224,9 +224,10 @@ This interpreter currently only works with Clang 19 and newer
 on x86-64 and AArch64 architectures. However, we expect
 that a future release of GCC will support this as well.
 
-This feature is opt-in for now. Based on our own testing,
-this new interpreter only works well when profile-guided optimization is
-enabled. For further information on how to build Python, please see
+This feature is opt-in for now. We highly recommend enabling profile-guided
+optimization with the new interpreter as it is the only configuration we have
+fully tested and can validate its improved performance.
+For further information on how to build Python, please see
 :option:`--with-tail-calling-interp`.
 
 (Contributed by Ken Jin in :gh:`128718`, with ideas on how to implement this

From 7ee75173c7495bbed6908e624f3fa5b91f6db740 Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Fri, 7 Feb 2025 19:23:40 +0800
Subject: [PATCH 09/11] Address review

---
 Doc/using/configure.rst | 2 +-
 Doc/whatsnew/3.14.rst   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Doc/using/configure.rst b/Doc/using/configure.rst
index bf89013c28a6bb..101c43576b0314 100644
--- a/Doc/using/configure.rst
+++ b/Doc/using/configure.rst
@@ -618,7 +618,7 @@ also be used to improve performance.
    Enable computed gotos in evaluation loop (enabled by default on supported
    compilers).
 
-.. option:: --with-tail-calling-interp
+.. option:: --with-tail-call-interp
 
    Enable interpreters using tail calls in CPython. If enabled, enabling PGO
    (:option:`--enable-optimizations`) is highly recommended. This option specifically
diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index e1c115fc19cc27..db7ffbe4c9f68a 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -228,7 +228,7 @@ This feature is opt-in for now. We highly recommend enabling profile-guided
 optimization with the new interpreter as it is the only configuration we have
 fully tested and can validate its improved performance.
 For further information on how to build Python, please see
-:option:`--with-tail-calling-interp`.
+:option:`--with-tail-call-interp`.
 
 (Contributed by Ken Jin in :gh:`128718`, with ideas on how to implement this
 in CPython by Mark Shannon, Garret Gu, Haoran Xu, and Josh Haberman.)

From 1d5b7f45b17cb32b5a5378e1990943a1baec5954 Mon Sep 17 00:00:00 2001
From: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Date: Fri, 7 Feb 2025 19:32:13 +0800
Subject: [PATCH 10/11] remove fully

---
 Doc/whatsnew/3.14.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index db7ffbe4c9f68a..41eec20721b2d4 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -226,7 +226,7 @@ that a future release of GCC will support this as well.
 
 This feature is opt-in for now. We highly recommend enabling profile-guided
 optimization with the new interpreter as it is the only configuration we have
-fully tested and can validate its improved performance.
+tested and can validate its improved performance.
 For further information on how to build Python, please see
 :option:`--with-tail-call-interp`.
 

From d2637d0002e72c3cba3ee3ef690620a9ec0129ca Mon Sep 17 00:00:00 2001
From: Ken Jin
Date: Fri, 7 Feb 2025 19:49:20 +0800
Subject: [PATCH 11/11] Update Doc/whatsnew/3.14.rst

Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
---
 Doc/whatsnew/3.14.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index 41eec20721b2d4..b4aa691c968f7c 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -227,7 +227,7 @@ that a future release of GCC will support this as well.
 This feature is opt-in for now. We highly recommend enabling profile-guided
 optimization with the new interpreter as it is the only configuration we have
 tested and can validate its improved performance.
-For further information on how to build Python, please see
+For further information on how to build Python, see
 :option:`--with-tail-call-interp`.
 
 (Contributed by Ken Jin in :gh:`128718`, with ideas on how to implement this