Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TypeVar with bad arguments segfault/misbehavior #118814

Closed
bast0006 opened this issue May 9, 2024 · 15 comments
Closed

TypeVar with bad arguments segfault/misbehavior #118814

bast0006 opened this issue May 9, 2024 · 15 comments
Assignees
Labels
topic-argument-clinic topic-typing type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@bast0006
Copy link

bast0006 commented May 9, 2024

Crash report

What happened?

import typing
typing.TypeVar(name="X", bound=type)

Triggers a segfault in python 3.12.3 on all platforms.

Does not reproduce on 3.11 or below.

This bug is similar to #110787, but that fix was never backported to 3.12, and additionally has a flaw that causes some arguments to be shifted in the previously-crashing cases.

When _PyArg_UnpackKeywordsWithVararg gets an input with insufficient positional parameters (which have been provided as keyword arguments) the varargs slot is positioned at the end of the mandatory positional parameters slots. But in the test case, the varargs slot is being overwritten by the bound slot because the keyword argument copy loop that begins here isn't aware of varargs.

If the minimal positionals are provided in the positionals tuple, https://github.com/sobolevn/cpython/blob/c4ca210f12a18edbe30b91aeb6e1915d74936caf/Python/getargs.c#L2525 line in 3.12 (missing the !=) is always true, and the keyword arguments are offset by 1, pushing them to the end of the array and leaving the varargs slot alone. But if there aren't and they need to be backfilled from the keyword arguments, nargs doesn't change in the loop, causing it to overwrite the varargs slot and additionally fail to completely fill the array (causing a segfault when the parent function tries to use that last garbage slot).

This can be fixed by changing the nargs to i so the line reads if (i < vararg) {, then keyword arguments that look up before the varargs entry are not offset, and those that look up after are offset, leaving that slot untouched and ensuring the array is properly filled.

Because i always begins at the end of where the provided positional arguments start, this will hopefully never accidentally overwrite positional arguments, and should solve the problem entirely.

The current fix with != is insufficient because if you provide a third parameter, the != becomes true again, and it reuses a slot. Thanks to a null check it doesn't segfault, but it does result in unexpected behavior:

import typing
T = typing.TypeVar(name="T", bound=type, covariant=True)
assert T.__covariant__

fails with an AssertionError in the 3.13 tag and main.

(TypeVar("T", bound=type, covariant=True).__covariant__ is true, however)

CPython versions tested on:

3.11, 3.12, CPython main branch

Operating systems tested on:

macOS, Windows

Output from running 'python -VV' on the command line:

Python 3.14.0a0 (heads/main-dirty:cb6f75a32ca, May 8 2024, 20:35:11) [Clang 15.0.0 (clang-1500.3.9.4)]

Linked PRs

@bast0006 bast0006 added the type-crash A hard crash of the interpreter, possibly with a core dump label May 9, 2024
@JelleZijlstra JelleZijlstra self-assigned this May 9, 2024
@bast0006 bast0006 changed the title TypeVar with bad arguments triggers segmentation fault TypeVar with bad arguments segfault/misbehavior May 9, 2024
@bast0006
Copy link
Author

bast0006 commented May 9, 2024

I now realize I'm linking to a 3.12 PR that was merged when I said "wasn't backported", sorry. Part of the fix was backported, but the getargs.c change wasn't. The behavior is additionally a bit messy. My 3.12 install from pyenv accepts the test case with a TypeError: constraints must be a tuple that implies it's not hitting the array-fill segfault (but is hitting bad slot-filling behavior).

From my pyenv non-debug install:

>>> typing.TypeVar(name="T", bound=type, covariant=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: constraints must be a tuple
>>> typing.TypeVar(name="T", bound=(type,), covariant=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: A single constraint is not allowed
>>> typing.TypeVar(name="T", bound=(type, int), covariant=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Constraints cannot be combined with bound=...
>>> typing.TypeVar(name="T", bound=(type, int))
T
>>> typing.TypeVar(name="T", bound=(type, int)).__bound__
>>> typing.TypeVar(name="T", bound=(type, int)).__constraints__
((...), <class 'int'>)
>>> typing.TypeVar(name="T", bound=tuple(), covariant=True)
zsh: segmentation fault  python

The debug compile segfaults when given the two-item testcase, but accepts a simple TypeVar(name="T") correctly.

And after restarting python it isn't consistent in crashing, but it is consistent in behavior, making me think I hit on the right mechanism of action:

typing.TypeVar(name="T", bound=tuple(), covariant="OhNo").__bound__
ForwardRef('OhNo')

@JelleZijlstra
Copy link
Member

The crash is in Argument Clinic code, cc @erlend-aasland.

>>> from typing import *
>>> TypeVar(name="x", bound=type)
Process 22706 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x5c)
    frame #0: 0x00000001000dbc6c python.exe`PyObject_IsTrue(v=0x000000016fdfe6b0) at object.c:1813:40 [opt]
   1810	    if (v == Py_None)
   1811	        return 0;
   1812	    else if (Py_TYPE(v)->tp_as_number != NULL &&
-> 1813	             Py_TYPE(v)->tp_as_number->nb_bool != NULL)
   1814	        res = (*Py_TYPE(v)->tp_as_number->nb_bool)(v);
   1815	    else if (Py_TYPE(v)->tp_as_mapping != NULL &&
   1816	             Py_TYPE(v)->tp_as_mapping->mp_length != NULL)
Target 0: (python.exe) stopped.
warning: python.exe was compiled with optimization - stepping may behave oddly; variables may not be available.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x5c)
  * frame #0: 0x00000001000dbc6c python.exe`PyObject_IsTrue(v=0x000000016fdfe6b0) at object.c:1813:40 [opt]
    frame #1: 0x0000000100117524 python.exe`typevar_new(type=<unavailable>, args=<unavailable>, kwargs=<unavailable>) at typevarobject.c.h:107:22 [opt]
    frame #2: 0x0000000100104dd0 python.exe`type_call(self=0x0000000101813c30, args=0x000000010048c7b0, kwds=0x00000001028d7d70) at typeobject.c:1897:11 [opt]

@bast0006
Copy link
Author

bast0006 commented May 9, 2024

@JelleZijlstra that crash is the result of passing 0x1 into PyObject_IsTrue, which is what happens when typevar_new gets the incorrectly parsed argument array back from _PyArg_UnpackKeywordsWithVararg.

 $ ./python.exe -VV
Python 3.12.3+ (heads/3.12:530c3bb271d, May  8 2024, 21:35:17) [Clang 15.0.0 (clang-1500.3.9.4)]
(gdb) up
#1  0x0000000100129b72 in typevar_new (type=<optimized out>, args=<optimized out>, kwargs=<optimized out>) at Objects/clinic/typevarobject.c.h:100
100	    infer_variance = PyObject_IsTrue(fastargs[5]);
(gdb) info locals
_kwtuple = {_this_is_not_used = {_gc_next = 0, _gc_prev = 0}, ob_base = {ob_base = {{ob_refcnt = 4294967295, ob_refcnt_split = {4294967295, 0}}, ob_type = 0x100520c60}, 
    ob_size = 5}, ob_item = {'name', 'bound', 'covariant', 'contravariant', 'infer_variance'}}
_keywords = {0x100376e78 "name", 0x1003a352c "bound", 0x1003a3532 "covariant", 0x1003a353c "contravariant", 0x1003a354a "infer_variance", 0x0}
_parser = {initialized = -1, format = 0x0, keywords = 0x10048fd80, fname = 0x1003a3559 "typevar", custom_msg = 0x0, pos = 0, min = 0, max = 0, 
  kwtuple = ('name', 'bound', 'covariant', 'contravariant', 'infer_variance'), next = 0x10051a1e8}
argsbuf = {'T', <type at remote 0x100521130>, 0x0, 0x0, 0x0, <unknown at remote 0x1>}
return_value = 0x0
nargs = <optimized out>
noptargs = <optimized out>
constraints = <type at remote 0x100521130>
covariant = 0
contravariant = 0
infer_variance = 0
fastargs = 0x7ff7bfefec30
bound = None
name = 'T'
(gdb) p fastargs[5]
$1 = <unknown at remote 0x1>

Note the contents of argsbuf, which is constructed by _PyArg_UnpackKeywordsWithVararg. That last argument should be either NULL or a python type, but it's neither. You can also see that name is assigned right, but bound is None and type has been assigned into constraints because of the bug.

@JelleZijlstra
Copy link
Member

Right, and that code is generated by Argument Clinic (note the path Objects/clinic/typevarobject.c.h).

Also, thanks for the detailed bug report!

@bast0006
Copy link
Author

bast0006 commented May 9, 2024

Right, and that code is generated by Argument Clinic (note the path Objects/clinic/typevarobject.c.h).

Also, thanks for the detailed bug report!

Thanks! Hopefully it's helpful.

That is true. I just mean that I traced the bad data/parsing collision out of argument clinic code and into handwritten code in getargs.c, which makes me think the bug actually isn't in argument clinic code, even if it is triggered by it.

@erlend-aasland
Copy link
Contributor

erlend-aasland commented May 20, 2024

I'm trying to reproduce this1 on macOS, but so far I've been unable to do so. A couple of observations:

  • AFAICS, we have no other Argument Clinic code with this signature, so this is the only case where this particular generation path happens
  • We have no Argument Clinic tests for this input / generated output

Footnotes

  1. the crash, that is

@erlend-aasland
Copy link
Contributor

_PyArg_UnpackKeywordsWithVararg is doing some weird stuff here. Either this is a bug in _PyArg_UnpackKeywordsWithVararg, or it is a bug in how we call it from the generated code.

@picnixz
Copy link
Contributor

picnixz commented Aug 1, 2024

I can reproduce the issue on Python 3.14.0a0 (heads/main:88030861e21, Aug 1 2024, 12:49:44) [GCC 7.5.0]. I'll try to see if I can do something for that.

@devdanzin
Copy link
Contributor

I can't reproduce it in 3.13, is the reason that #118009 seems to only have been backported to 3.12?

@picnixz
Copy link
Contributor

picnixz commented Aug 1, 2024

I couldn't reproduce it the first time because I used TypeVar('x', bound=type). Check that you used TypeVar(name='x', bound=type) for it to work (well in this case, to crash). But there is anyway an issue on 3.14.

@devdanzin
Copy link
Contributor

TypeVar(name='x', bound=type)

That does it, thanks! Should this be a release blocker for 3.13 then?

@picnixz
Copy link
Contributor

picnixz commented Aug 1, 2024

3.13rc1 has already been released so it's not a blocker. But I don't know whether we'll merge it before releasing 3.13.0.

@Yhg1s If my fix is correct, should it land or not? and if my fix is not correct, does this count as a blocker of 3.13.0?

@JelleZijlstra
Copy link
Member

We should definitely merge a fix for this issue into 3.13 even during the RC phase (segfaults are bugs). Not sure it should be marked as a release blocker though since 3.12 is also affected.

@picnixz
Copy link
Contributor

picnixz commented Aug 1, 2024

There's defnitely something wrong with AC here but I won't have time to fix it today.

I think it has to do with the number of defaults keywords. I see different results (sometimes test failures, sometimes crashes) depending on whether there are 2 or 3 keywords. We should test the following cases:

self.assertEqual(func(1, kw1=True), (1, (), True, False))
self.assertEqual(func(1, kw2=True), (1, (), False, True))
self.assertEqual(func(1, kw1=True, kw2=True), (1, (), True, True))
self.assertEqual(func(1, kw2=True, kw1=True), (1, (), True, True))

self.assertEqual(func(1, 2, 3, 4), (1, (2, 3, 4), False, False))
self.assertEqual(func(1, 2, 3, 4, kw1=True), (1, (2, 3, 4), True, False))
self.assertEqual(func(1, 2, 3, 4, kw1=True), (1, (2, 3, 4), True, False))
self.assertEqual(func(1, 2, 3, 4, kw1=True, kw2=True), (1, (2, 3, 4), True, True))

self.assertEqual(func(a=1), (1, (), False, False))
self.assertEqual(func(a=1, kw1=True), (1, (), True, False))
self.assertEqual(func(kw1=True, a=1), (1, (), True, False))

self.assertEqual(func(a=1, kw2=True), (1, (), False, True))
self.assertEqual(func(kw2=True, a=1), (1, (), False, True))

with

/*[clinic input]
vararg_with_multiple_defaults

    a: object
    *args: object
    kw1: bool = False
    kw2: bool = False

[clinic start generated code]*/

static PyObject *
vararg_with_multiple_defaults_impl(PyObject *module, PyObject *a,
                                   PyObject *args, int kw1, int kw2)
/*[clinic end generated code: output=ae7ee8d22dfc7fbf input=534d91e23e6d360b]*/
{
    PyObject *obj_kw1 = kw1 ? Py_True : Py_False;
    PyObject *obj_kw2 = kw2 ? Py_True : Py_False;
    return pack_arguments_newref(4, a, args, obj_kw1, obj_kw2);
}

If the tests fail, then something's wrong. I couldn't test every combination but the following always crashes, whether I applied my (wrong) fix or not:

self.assertEqual(func(a=1, kw1=True, kw2=True), (1, (), True, True))
self.assertEqual(func(a=1, kw2=True, kw1=True), (1, (), True, True))
self.assertEqual(func(kw1=True, a=1, kw2=True), (1, (), True, True))
self.assertEqual(func(kw2=True, a=1, kw1=True), (1, (), True, True))
self.assertEqual(func(kw1=True, kw2=True, a=1), (1, (), True, True))
self.assertEqual(func(kw2=True, kw1=True, a=1), (1, (), True, True))

I think the logic of _PyArg_UnpackKeywordsWithVararg should be entirely reworked.

@JelleZijlstra JelleZijlstra removed their assignment Aug 1, 2024
@picnixz picnixz self-assigned this Aug 2, 2024
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Aug 4, 2024
…eyword

Fix _PyArg_UnpackKeywordsWithVararg for the case when argument for
positional-or-keyword parameter is passed by keyword.
There was only one such case in the stdlib -- the TypeVar constructor.
@picnixz picnixz removed their assignment Aug 4, 2024
@serhiy-storchaka serhiy-storchaka self-assigned this Aug 4, 2024
@serhiy-storchaka
Copy link
Member

There are several issues here.

  • There is a bug in _PyArg_UnpackKeywordsWithVararg. It is only manifested when keyword arguments are passed for keyword-or-positional parameters. So print() does not have this bug, but the TypeVar constructor does. The fix is quite simple: gh-118814: Fix the TypeVar constructor when name is passed by keyword #122664.
  • There are several bugs in Argument Clinic (@picnixz discovered this when tried to write more extended Argument Clinic tests). This is not directly related to this issue, so I'll address this in a separate issue. I already have a solution.
  • There are questions about the design of _PyArg_UnpackKeywordsWithVararg. I'm going to re-design it in a separate issue, after fixing bugs in _PyArg_UnpackKeywordsWithVararg and Argument Clinic.

serhiy-storchaka added a commit that referenced this issue Aug 7, 2024
…GH-122664)

Fix _PyArg_UnpackKeywordsWithVararg for the case when argument for
positional-or-keyword parameter is passed by keyword.
There was only one such case in the stdlib -- the TypeVar constructor.
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Aug 7, 2024
…ed by keyword (pythonGH-122664)

Fix _PyArg_UnpackKeywordsWithVararg for the case when argument for
positional-or-keyword parameter is passed by keyword.
There was only one such case in the stdlib -- the TypeVar constructor.
(cherry picked from commit 540fcc6)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Aug 7, 2024
…ed by keyword (pythonGH-122664)

Fix _PyArg_UnpackKeywordsWithVararg for the case when argument for
positional-or-keyword parameter is passed by keyword.
There was only one such case in the stdlib -- the TypeVar constructor.
(cherry picked from commit 540fcc6)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit that referenced this issue Aug 8, 2024
…keyword (GH-122664) (GH-122806)

Fix _PyArg_UnpackKeywordsWithVararg for the case when argument for
positional-or-keyword parameter is passed by keyword.
There was only one such case in the stdlib -- the TypeVar constructor.
(cherry picked from commit 540fcc6)
serhiy-storchaka added a commit that referenced this issue Aug 8, 2024
…keyword (GH-122664) (GH-122807)

Fix _PyArg_UnpackKeywordsWithVararg for the case when argument for
positional-or-keyword parameter is passed by keyword.
There was only one such case in the stdlib -- the TypeVar constructor.
(cherry picked from commit 540fcc6)
blhsing pushed a commit to blhsing/cpython that referenced this issue Aug 22, 2024
…eyword (pythonGH-122664)

Fix _PyArg_UnpackKeywordsWithVararg for the case when argument for
positional-or-keyword parameter is passed by keyword.
There was only one such case in the stdlib -- the TypeVar constructor.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-argument-clinic topic-typing type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

7 participants