
Conversation

@vstinner (Member) commented Oct 10, 2025

  • Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() helper functions.
  • Modify PySequence_Tuple() to use PyTupleWriter API.
  • Soft deprecate _PyTuple_Resize().

📚 Documentation preview 📚: https://cpython-previews--139891.org.readthedocs.build/

@markshannon (Member)

Please don't add any APIs for tracking. Tracking, or untracking, is the job of the VM. We might not even have tracking in the future. FT already tracks objects differently.

Deprecate _PyTuple_Resize() as hard as you like 🙂; it is nonsense and should be removed as soon as possible.
Please deprecate PyTuple_New as well.

I think the most useful new API we could add is PyTuple_MakePair(). Making a tuple from two objects is very common.
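
For comparison, a minimal sketch of how a pair is built with the existing C API today; first and second stand for any two existing PyObject pointers, and PyTuple_MakePair() is only the name proposed in this comment, not an existing function:

    /* Existing API: PyTuple_Pack() creates new references to its arguments
       and returns a new 2-tuple, or NULL on failure. */
    PyObject *pair = PyTuple_Pack(2, first, second);
    if (pair == NULL) {
        return NULL;
    }

    /* Proposed (hypothetical, name only from this comment):
       PyObject *pair = PyTuple_MakePair(first, second); */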

@vstinner (Member, Author)

> Please don't add any APIs for tracking. Tracking, or untracking, is the job of the VM. We might not even have tracking in the future. FT already tracks objects differently.

Are you talking about _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack()? These functions are not usable outside tupleobject.c; they are declared as static.

@vstinner (Member, Author)

> Deprecate _PyTuple_Resize() as hard as you like 🙂; it is nonsense and should be removed as soon as possible.

For now, I prefer to only soft deprecate it. It's documented and used by too many C extensions.

> Please deprecate PyTuple_New as well.

Well, I'm open to soft deprecating it. But a hard deprecation would affect too many C extensions IMO.

> I think the most useful new API we could add is PyTuple_MakePair(). Making a tuple from two objects is very common.

We might add a PyTuple_Pack2() function. But that should be a separate issue.

PyTupleWriter is mostly useful when you don't know the tuple size in advance. For example, when you consume an iterator.

@markshannon (Member)

Where is the API specified? It seems rather inefficient, needing to heap allocate the writer.
It should be as efficient as possible, or we won't be able to persuade people to switch away from using PyTuple_New.

There should be no need for a method to create a tuple writer; it can be a small object that can be stack allocated and zero initialized:

    PyTupleWriter writer = { 0 };

It also needs a function to consume the reference of the item, like PyTuple_SET_ITEM but safer:

    PyTupleWriter_AddConsumeRef(&writer, item);

Maybe add bulk adds as well?

    PyTupleWriter_AddArray(PyTupleWriter *writer, PyObject **array, intptr_t count);

@markshannon (Member)

> Well, I'm open to soft deprecating it. But a hard deprecation would affect too many C extensions IMO.

It is unfortunate that so many extensions use it, but it is still broken. The sooner we deprecate it, the better, as we can give people more warning. We do need a good story for how to replace it.

@markshannon (Member) commented Oct 10, 2025

> PyTupleWriter is mostly useful when you don't know the tuple size in advance. For example, when you consume an iterator.

If you are consuming an iterator, PySequence_Tuple is much simpler than PyTupleWriter.
TBH, if you're interacting with Python objects at that level, your best option is probably Python, not C.

@markshannon (Member)

I see the value in this as a nice, safe replacement for the PyTuple_New() + PyTuple_SET_ITEM() combo.
So the API needs to be efficient and easy to port to.
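
For reference, a minimal sketch of the combo in question, using only existing CPython API; make_squares is a hypothetical helper. The hazard is that PyTuple_New() returns a tuple with NULL slots and PyTuple_SET_ITEM() does no error checking, so every slot must be filled before the tuple is shared:

    #include <Python.h>

    static PyObject *
    make_squares(Py_ssize_t n)
    {
        PyObject *tuple = PyTuple_New(n);   /* slots start out as NULL */
        if (tuple == NULL) {
            return NULL;
        }
        for (Py_ssize_t i = 0; i < n; i++) {
            PyObject *item = PyLong_FromSsize_t(i * i);
            if (item == NULL) {
                Py_DECREF(tuple);           /* safe: remaining slots are NULL */
                return NULL;
            }
            PyTuple_SET_ITEM(tuple, i, item);  /* steals the reference, no checks */
        }
        return tuple;
    }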

@vstinner (Member, Author) commented Oct 10, 2025

> Where is the API specified? It seems rather inefficient, needing to heap allocate the writer.

The API is:

    PyTupleWriter* PyTupleWriter_Create(Py_ssize_t size);
    int PyTupleWriter_Add(PyTupleWriter *writer, PyObject *item);
    PyObject* PyTupleWriter_Finish(PyTupleWriter *writer);
    void PyTupleWriter_Discard(PyTupleWriter *writer);

PyTupleWriter_Add() creates a new reference; it doesn't take ownership of item.
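
A hedged usage sketch of the four functions above, building a tuple of unknown length from an iterator (the use case mentioned earlier). tuple_from_iterable is a hypothetical helper; it assumes that a size hint of 0 is accepted and that PyTupleWriter_Finish() releases the writer whether it succeeds or fails:

    #include <Python.h>

    static PyObject *
    tuple_from_iterable(PyObject *iterable)
    {
        PyObject *it = PyObject_GetIter(iterable);
        if (it == NULL) {
            return NULL;
        }
        PyTupleWriter *writer = PyTupleWriter_Create(0);  /* unknown final size */
        if (writer == NULL) {
            Py_DECREF(it);
            return NULL;
        }
        PyObject *item;
        while ((item = PyIter_Next(it)) != NULL) {
            /* PyTupleWriter_Add() creates a new reference, so we still own item. */
            int rc = PyTupleWriter_Add(writer, item);
            Py_DECREF(item);
            if (rc < 0) {
                goto error;
            }
        }
        if (PyErr_Occurred()) {   /* PyIter_Next() stopped because of an error */
            goto error;
        }
        Py_DECREF(it);
        return PyTupleWriter_Finish(writer);

    error:
        Py_DECREF(it);
        PyTupleWriter_Discard(writer);
        return NULL;
    }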

> It seems rather inefficient, needing to heap allocate the writer.

I designed the API to be compatible with the stable ABI later, so the writer is allocated on the heap to hide the structure members from the public C API.

The implementation uses a free list which makes the allocation basically free in terms of performance.

> It also needs a function to consume the reference of the item, like PyTuple_SET_ITEM but safer.

I can add an int PyTupleWriter_AddSteal(PyTupleWriter *writer, PyObject *item) variant which takes ownership of the item. The C API Working Group recently expressed its preference for the term Steal for such APIs.

> Maybe add bulk adds as well?

That sounds like a good idea; it would be similar to PyTuple_FromArray().
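
A hedged sketch of the bulk variant, using the signature proposed earlier in this thread (none of this exists in CPython yet) and assuming it creates new references to the items, like PyTupleWriter_Add(); build_constants is a hypothetical helper:

    static PyObject *
    build_constants(void)
    {
        PyObject *items[] = {Py_None, Py_True, Py_False};
        PyTupleWriter *writer = PyTupleWriter_Create(3);
        if (writer == NULL) {
            return NULL;
        }
        /* Hypothetical bulk add: assumed to create new references to the items. */
        if (PyTupleWriter_AddArray(writer, items, 3) < 0) {
            PyTupleWriter_Discard(writer);
            return NULL;
        }
        return PyTupleWriter_Finish(writer);  /* new 3-tuple, or NULL on error */
    }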

@markshannon (Member)

> So the writer is allocated on the heap to hide the structure members from the public C API.

As long as setting all the fields to zero initializes it, only the size needs to be fixed.

> The implementation uses a free list which makes the allocation basically free in terms of performance.

That's not true. Free lists can have poor locality of reference, and the code can be quite branchy. Plus there's the overhead of the function call.

@vstinner (Member, Author)

I updated the PR to add PyTupleWriter_AddSteal() and PyTupleWriter_AddArray() functions, and hard deprecate _PyTuple_Resize().

@vstinner (Member, Author) commented Oct 10, 2025

UPDATE: There was a bug in my benchmark. I fixed it and reran the benchmark. Now it's faster instead of slower for tuple-1000 😁

Benchmark comparing tuple to writer:

  • tuple: PyTuple_New() and PyTuple_SetItem()
  • writer: PyTupleWriter_Create(), PyTupleWriter_AddSteal() and PyTupleWriter_Finish().

| Benchmark | tuple | writer |
|---|---|---|
| tuple-1 | 37.4 ns | 41.3 ns: 1.10x slower |
| tuple-5 | 65.7 ns | 68.8 ns: 1.05x slower |
| tuple-10 | 99.9 ns | 102 ns: 1.02x slower |
| tuple-100 | 800 ns | 762 ns: 1.05x faster |
| tuple-1000 | 7.68 us | 7.28 us: 1.05x faster |
| Geometric mean | (ref) | 1.01x slower |

tuple-1 is the worst case scenario and measures the overhead of the abstraction: the writer is only 3.9 nanoseconds slower.


Benchmark:

Patch:

diff --git a/Modules/_testcapimodule.c b/Modules/_testcapimodule.c
index 4e73be20e1b..27c3c02c7fc 100644
--- a/Modules/_testcapimodule.c
+++ b/Modules/_testcapimodule.c
@@ -2562,6 +2562,76 @@ toggle_reftrace_printer(PyObject *ob, PyObject *arg)
     Py_RETURN_NONE;
 }
 
+static PyObject *
+bench_tuple(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t size, loops;
+    if (!PyArg_ParseTuple(args, "nn", &size, &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyObject *tuple = PyTuple_New(size);
+        if (tuple == NULL) {
+            return NULL;
+        }
+
+        for (int i=0; i < size; i++) {
+            PyObject *item = PyLong_FromLong(i);
+            if (item == NULL) {
+                return NULL;
+            }
+            if (PyTuple_SetItem(tuple, i, item) < 0) {
+                Py_DECREF(tuple);
+                return NULL;
+            }
+        }
+
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+static PyObject *
+bench_writer(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t size, loops;
+    if (!PyArg_ParseTuple(args, "nn", &size, &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t i=0; i < loops; i++) {
+        PyTupleWriter *writer = PyTupleWriter_Create(size);
+        if (writer == NULL) {
+            return NULL;
+        }
+
+        for (int i=0; i < size; i++) {
+            PyObject *item = PyLong_FromLong(i);
+            if (item == NULL) {
+                return NULL;
+            }
+            if (PyTupleWriter_AddSteal(writer, item) < 0) {
+                PyTupleWriter_Discard(writer);
+                return NULL;
+            }
+        }
+
+        PyObject *tuple = PyTupleWriter_Finish(writer);
+        if (tuple == NULL) {
+            return NULL;
+        }
+        Py_DECREF(tuple);
+    }
+    PyTime_PerfCounterRaw(&t2);
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
 static PyMethodDef TestMethods[] = {
     {"set_errno",               set_errno,                       METH_VARARGS},
     {"test_config",             test_config,                     METH_NOARGS},
@@ -2656,6 +2726,8 @@ static PyMethodDef TestMethods[] = {
     {"test_atexit", test_atexit, METH_NOARGS},
     {"code_offset_to_line", _PyCFunction_CAST(code_offset_to_line), METH_FASTCALL},
     {"toggle_reftrace_printer", toggle_reftrace_printer, METH_O},
+    {"bench_tuple", bench_tuple, METH_VARARGS},
+    {"bench_writer", bench_writer, METH_VARARGS},
     {NULL, NULL} /* sentinel */
 };
 

bench_tuple.py:

import pyperf
import _testcapi
import functools
runner = pyperf.Runner()
for size in (1, 5, 10, 100, 1000):
    func = functools.partial(_testcapi.bench_tuple, size)
    runner.bench_time_func(f'tuple-{size}', func)

bench_writer.py:

import pyperf
import _testcapi
import functools
runner = pyperf.Runner()
for size in (1, 5, 10, 100, 1000):
    func = functools.partial(_testcapi.bench_writer, size)
    runner.bench_time_func(f'tuple-{size}', func)

@zooba (Member) commented Oct 10, 2025

Opposition posted on the issue.

@sergey-miryanov (Contributor)

JFYI, tuple size distribution (from pyperformance):

[plot: tuple size distribution. X axis: size of the tuple; Y axis: percentage of all tuples with this size]

source_dataset.csv

plot_dataset.csv

@vstinner marked this pull request as a draft on October 11, 2025 at 21:07.
@vstinner (Member, Author)

The API is being actively discussed and I'm still making changes to it, so I prefer to mark this PR as a draft for now.

@vstinner (Member, Author)

To make the API nicer to use, I propose accepting NULL in PyTupleWriter_Add() and PyTupleWriter_AddSteal(), returning -1 (error) in that case.

It allows replacing code like:

    PyObject *item = PyLong_FromSsize_t(value);
    if (!item)
        goto error;
    PyTuple_SET_ITEM(tuple, 0, item);

with:

    PyObject *item = PyLong_FromSsize_t(value);
    if (PyTupleWriter_AddSteal(writer, item) < 0) {
        goto error;
    }

instead of having to check for error twice:

    PyObject *item = PyLong_FromSsize_t(value);
    if (!item) {
        goto error;
    }
    if (PyTupleWriter_AddSteal(tuple, item) < 0) {
        goto error;
    }

Checking a function return value is a common pattern when creating a tuple.

Add private _PyTupleWriter_GetItems() helper function.
@vstinner (Member, Author)

I changed the allocation strategy, which makes the benchmark faster for tuple-10 and reduces the overhead for tuple-1 and tuple-5:

| Benchmark | tuple | writer |
|---|---|---|
| tuple-1 | 37.3 ns | 40.0 ns: 1.07x slower |
| tuple-5 | 65.2 ns | 67.1 ns: 1.03x slower |
| tuple-10 | 99.7 ns | 98.8 ns: 1.01x faster |
| tuple-100 | 807 ns | 761 ns: 1.06x faster |
| tuple-1000 | 7.68 us | 7.29 us: 1.05x faster |
| Geometric mean | (ref) | 1.00x faster |

@vstinner (Member, Author)

Micro-benchmark on PySequence_Tuple():

import pyperf
runner = pyperf.Runner()
for size in (1, 5, 10, 50, 100, 1_000, 10_000):
    runner.timeit(f'tuple-{size:,}',
        setup=f'from _testlimitedcapi import sequence_tuple; seq = range({size})',
        stmt='sequence_tuple(seq)')

| Benchmark | ref | writer |
|---|---|---|
| tuple-1 | 129 ns | 101 ns: 1.27x faster |
| tuple-5 | 132 ns | 134 ns: 1.01x slower |
| tuple-10 | 218 ns | 179 ns: 1.22x faster |
| tuple-50 | 753 ns | 829 ns: 1.10x slower |
| tuple-1,000 | 11.3 us | 10.7 us: 1.05x faster |
| tuple-10,000 | 260 us | 256 us: 1.02x faster |
| Geometric mean | (ref) | 1.06x faster |

PyTupleWriter made the function slower.
@vstinner (Member, Author)

Mark asked to redo the benchmark to compare PyTuple_SET_ITEM() to PyTupleWriter_AddSteal(). Here it is:

| Benchmark | tuple | writer |
|---|---|---|
| tuple-1 | 33.0 ns | 42.4 ns: 1.28x slower |
| tuple-5 | 51.4 ns | 70.3 ns: 1.37x slower |
| tuple-10 | 74.1 ns | 105 ns: 1.42x slower |
| tuple-100 | 567 ns | 802 ns: 1.41x slower |
| tuple-1000 | 5.43 us | 7.71 us: 1.42x slower |
| Geometric mean | (ref) | 1.38x slower |

PyTupleWriter_AddSteal() is 1.28x to 1.42x slower than PyTuple_SET_ITEM().

Note: PyTuple_SET_ITEM() is not available in the limited C API.
