Stackless issue python#268: Add magic number to pickled code objects.
Pickled code objects now contain importlib.util.MAGIC_NUMBER.
Unpickling code objects pickled with an incompatible version of Python
now creates a RuntimeWarning. The resulting code object starts with an
invalid opcode and renders unpickled frames invalid. Unpickling a
function whose code is invalid causes a RuntimeWarning too.
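
As a rough illustration of the unpickling rule described above (not the commit's actual C implementation), the following pure-Python sketch models what `code_new` does with its argument tuple. The function name `normalize_code_args` and the module-level constants are made up for this example; they only mirror `codetuplefmt`, `code_co_code_index` and `CODE_INVALID_OPCODE` from `prickelpit.c`, and a 2-byte code unit is assumed.

```python
import warnings

# Hypothetical constants that mirror the C definitions in prickelpit.c.
INVALID_OPCODE = 0                          # CODE_INVALID_OPCODE
CO_CODE_INDEX = 5                           # code_co_code_index (after the magic is removed)
NEW_TUPLE_SIZE = len("liiiiiSOOOSSiSOO")    # tuple size including the magic number


def normalize_code_args(args: tuple, current_magic: int) -> tuple:
    """Return the argument tuple for the real code constructor.

    If the pickled magic number differs from ``current_magic``, emit a
    RuntimeWarning and prepend one invalid code unit to co_code.
    """
    if len(args) == NEW_TUPLE_SIZE:
        # Current format: the first item is the magic number.
        magic, rest = args[0], args[1:]
    elif len(args) == NEW_TUPLE_SIZE - 1:
        # Older format without a magic number; treated as magic 0.
        magic, rest = 0, args
    else:
        raise IndexError("Argument tuple has wrong size.")

    if magic != current_magic:
        warnings.warn(
            "Unpickling code object with invalid magic number %d" % magic,
            RuntimeWarning,
            stacklevel=2,
        )
        co_code = rest[CO_CODE_INDEX]
        if not isinstance(co_code, bytes):
            raise TypeError("Unpickling code object: code is not a bytes object")
        # One code unit: the invalid opcode followed by a zero argument byte.
        poisoned = bytes([INVALID_OPCODE, 0]) + co_code
        rest = rest[:CO_CODE_INDEX] + (poisoned,) + rest[CO_CODE_INDEX + 1:]
    return rest
```

A tuple in the older format (one item shorter) always takes the warning path, which matches the "invalid magic number 0" message checked by `test_new_without_magic`.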
akruis authored Jul 9, 2021
1 parent 3eec437 commit 1c04e79
Showing 4 changed files with 262 additions and 4 deletions.
32 changes: 32 additions & 0 deletions Doc/library/stackless/pickling.rst
@@ -147,6 +147,19 @@ types:
C-types PyAsyncGenASend and PyAsyncGenAThrow (see :pep:`525`) as well as
all kinds of :ref:`Dictionary view objects <dict-views>`.

Code
====

|SLP| can pickle :data:`~types.CodeType` objects.

.. versionchanged:: 3.8
   The pickled representation of a code object contains the bytecode version number (:data:`~importlib.util.MAGIC_NUMBER`).
   If a program tries to unpickle a code object with a wrong bytecode version number, then |SLP|

   * emits a ``RuntimeWarning('Unpickling code object with invalid magic number %ld')`` and
   * prepends an invalid |PY| bytecode instruction to the *co_code* attribute of the unpickled code object. This way any attempt
     to execute the code object raises :exc:`SystemError`.
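
As a small illustration (not part of the commit), the sketch below shows how the four bytes of `importlib.util.MAGIC_NUMBER` map to the integer that is stored in the pickle. The helper name `magic_as_long` is made up for this example; the `struct.unpack("<l", ...)` call is the same one the unit test in this commit uses.

```python
import importlib.util
import struct


def magic_as_long(magic_bytes: bytes = importlib.util.MAGIC_NUMBER) -> int:
    """Interpret a 4-byte magic number as a little-endian signed 32-bit integer.

    Hypothetical helper, for illustration only.
    """
    return struct.unpack("<l", magic_bytes)[0]


if __name__ == "__main__":
    # The last two bytes of MAGIC_NUMBER are b"\r\n"; the first two carry
    # the bytecode version of the running interpreter.
    print(hex(magic_as_long()))
```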

Frames
======

@@ -156,6 +169,25 @@ generator. |SLP| does not register a "reduction" function for
:data:`~types.FrameType`. This way |SLP| stays compatible with application
code that registers its own "reduction" function for :data:`~types.FrameType`.

It is not possible to execute an unpickled frame if the tasklet the original frame belonged to was
not :attr:`~tasklet.restorable`. In this case the frame is marked as invalid and any attempt
to execute it raises :exc:`RuntimeError`.

.. versionchanged:: 3.8
   If a program tries to unpickle a frame using a code object whose first bytecode instruction is invalid, then |SLP|
   marks the frame as invalid. Any attempt to execute the frame raises :exc:`RuntimeError`.
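
The validity test itself is simple. A minimal sketch in Python, assuming the invalid opcode is 0 as in `CODE_INVALID_OPCODE` (the function name `starts_with_invalid_opcode` is hypothetical and only mirrors the check that `frame_setstate` performs in C):

```python
INVALID_OPCODE = 0  # assumption: mirrors CODE_INVALID_OPCODE in prickelpit.c


def starts_with_invalid_opcode(co_code: bytes) -> bool:
    """Return True if the first byte of the bytecode is the invalid marker opcode."""
    return len(co_code) > 0 and co_code[0] == INVALID_OPCODE
```

For example, `starts_with_invalid_opcode(b"\x00\x00d\x00S\x00")` is true, while the same bytecode without the leading code unit is reported as valid.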


Functions
=========

|SLP| can pickle functions, including lambda objects, by value.

.. versionchanged:: 3.8
   If a program tries to unpickle a function using a code object whose first bytecode instruction is invalid, then |SLP|
   emits a ``RuntimeWarning('Unpickling function with invalid code object: %V')``. Any attempt
   to execute the function raises :exc:`SystemError`.
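
An application that prefers a hard failure over a warning can promote the warning to an exception with the standard `warnings` module, as the unit tests in this commit do. The sketch below uses a stand-in function, `fake_unpickle_code`, which is hypothetical and only imitates the warning; `strict_call` is likewise a made-up helper name.

```python
import warnings


def fake_unpickle_code(magic: int, expected: int) -> None:
    """Hypothetical stand-in that warns the way the unpickler is described to."""
    if magic != expected:
        warnings.warn(
            "Unpickling code object with invalid magic number %d" % magic,
            RuntimeWarning,
        )


def strict_call(func, *args):
    """Call func(*args) with every RuntimeWarning turned into an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", RuntimeWarning)
        return func(*args)
```

With this helper, `strict_call(fake_unpickle_code, 1, 2)` raises `RuntimeWarning`, while a call with matching numbers returns normally.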

.. _slp_pickling_asyncgen:

Asynchronous Generators
8 changes: 8 additions & 0 deletions Stackless/changelog.txt
@@ -9,6 +9,14 @@ What's New in Stackless 3.X.X?

*Release date: 20XX-XX-XX*

- https://github.com/stackless-dev/stackless/issues/268
  Pickled code objects now contain importlib.util.MAGIC_NUMBER. Unpickling
  code objects pickled with an incompatible version of Python now creates a
  RuntimeWarning("Unpickling code object with invalid magic number"). The
  resulting code object starts with an invalid opcode and renders unpickled
  frames invalid. Unpickling a function whose code is invalid causes a
  RuntimeWarning("Unpickling function with invalid code object").

- https://github.com/stackless-dev/stackless/issues/270
  Stackless now uses an unmodified PyFrameObject structure. Stackless now
  stores more state information in the field f->f_executing than C-Python.
120 changes: 116 additions & 4 deletions Stackless/pickling/prickelpit.c
@@ -381,8 +381,7 @@ slp_cannot_execute(PyCFrameObject *f, const char *exec_name, PyObject *retval)
    if (retval != NULL) {
        Py_DECREF(retval);
        PyErr_Format(PyExc_RuntimeError, "cannot execute invalid frame with "
-                    "'%.100s': frame had a C state that"
-                    " can't be restored.",
+                    "'%.100s': frame had a C state that can't be restored or an invalid code object.",
                     exec_name);
    }

@@ -610,16 +609,36 @@ slp_from_tuple_with_nulls(PyObject **start, PyObject *tup)
******************************************************/

-#define codetuplefmt "iiiiiSOOOSSiSOO"
+#define codetuplefmt "liiiiiSOOOSSiSOO"
/* Index of co_code in the tuple given to code_new */
static const Py_ssize_t code_co_code_index = 5;

/*
 * An unused (invalid) opcode. See opcode.h for a list of used opcodes.
 * If Stackless unpickles a code object with an invalid magic number, it prefixes
 * co_code with this opcode.
 *
 * frame_setstate tests if the first opcode of the code of the frame is CODE_INVALID_OPCODE
 * and, if so, marks the frame as invalid.
 */
static const char CODE_INVALID_OPCODE = 0;

static struct _typeobject wrap_PyCode_Type;
static long bytecode_magic = 0;

static PyObject *
code_reduce(PyCodeObject * co, PyObject *unused)
{
    if (0 >= bytecode_magic) {
        bytecode_magic = PyImport_GetMagicNumber();
        if (-1 == bytecode_magic)
            return NULL;
    }

    PyObject *tup = Py_BuildValue(
        "(O(" codetuplefmt ")())",
        &wrap_PyCode_Type,
        bytecode_magic,
        co->co_argcount,
        co->co_kwonlyargcount,
        co->co_nlocals,
@@ -640,7 +659,73 @@ code_reduce(PyCodeObject * co, PyObject *unused)
    return tup;
}

-MAKE_WRAPPERTYPE(PyCode_Type, code, "code", code_reduce, generic_new,
static PyObject *
code_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    long magic = 0;

    if (0 >= bytecode_magic) {
        bytecode_magic = PyImport_GetMagicNumber();
        if (-1 == bytecode_magic)
            return NULL;
    }

    assert(PyTuple_CheckExact(args));
    if (PyTuple_GET_SIZE(args) == sizeof(codetuplefmt) - 1) {
        /* Current format: the first item is the bytecode magic number */
        magic = PyLong_AsLong(PyTuple_GET_ITEM(args, 0));
        if (-1 == magic && PyErr_Occurred()) {
            return NULL;
        }
        args = PyTuple_GetSlice(args, 1, sizeof(codetuplefmt) - 1);
    } else if (PyTuple_GET_SIZE(args) == sizeof(codetuplefmt) - 2) {
        /* Format used by Stackless versions up to 3.7 */
        args = PyTuple_GetSlice(args, 0, sizeof(codetuplefmt) - 2);
    } else {
        PyErr_SetString(PyExc_IndexError, "Argument tuple has wrong size.");
        return NULL;
    }
    if (NULL == args)
        return NULL;

    if (bytecode_magic != magic) {
        if (PyErr_WarnFormat(PyExc_RuntimeWarning, 1, "Unpickling code object with invalid magic number %ld", magic)) {
            Py_DECREF(args);
            return NULL;
        }

        PyObject *code = PyTuple_GET_ITEM(args, code_co_code_index);
        if (NULL == code) {
            Py_DECREF(args);
            return NULL;
        }
        if (!PyBytes_Check(code)) {
            Py_DECREF(args);
            PyErr_SetString(PyExc_TypeError,
                            "Unpickling code object: code is not a bytes object");
            return NULL;
        }
        Py_ssize_t code_len = PyBytes_Size(code);
        assert(code_len <= INT_MAX);
        assert(code_len % sizeof(_Py_CODEUNIT) == 0);

        /* Now prepend an invalid opcode to the code.
         */
        PyObject *code2 = PyBytes_FromStringAndSize(NULL, code_len + sizeof(_Py_CODEUNIT));
        assert(_Py_IS_ALIGNED(PyBytes_AS_STRING(code), sizeof(_Py_CODEUNIT)));
        char *p = PyBytes_AS_STRING(code2);
        p[0] = Py_BUILD_ASSERT_EXPR(sizeof(_Py_CODEUNIT) == 2) + CODE_INVALID_OPCODE;
        p[1] = 0; /* Argument */
        memcpy(p + sizeof(_Py_CODEUNIT), PyBytes_AS_STRING(code), code_len);
        PyTuple_SET_ITEM(args, code_co_code_index, code2);
    }

    PyObject *retval = generic_new(type, args, kwds);
    Py_DECREF(args);
    return retval;
}

+MAKE_WRAPPERTYPE(PyCode_Type, code, "code", code_reduce, code_new,
                 generic_setstate)

static int init_codetype(PyObject * mod)
@@ -779,12 +864,33 @@ func_setstate(PyObject *self, PyObject *args)
{
    PyFunctionObject *fu;
    PyObject *args2;
    char *pcode;

    if (is_wrong_type(Py_TYPE(self))) return NULL;
    Py_TYPE(self) = Py_TYPE(self)->tp_base;

    /* Test for an invalid code object */
    args2 = PyTuple_GetItem(args, 0);
    if (NULL==args2)
        return NULL;
    if (! PyCode_Check(args2)) {
        PyErr_SetString(PyExc_TypeError, "func_setstate: value for func_code is not a code object");
        return NULL;
    }
    pcode = PyBytes_AsString(((PyCodeObject *) args2)->co_code);
    if (NULL == pcode)
        return NULL;
    if (*pcode == CODE_INVALID_OPCODE) {
        /* invalid code object, was pickled with a different version of python */
        if (PyErr_WarnFormat(PyExc_RuntimeWarning, 1, "Unpickling function with invalid code object: %V",
                             PyTuple_GetItem(args, 2), "~ name is missing ~"))
            return NULL;
    }

    args2 = PyTuple_GetSlice(args, 0, 5);
    if (args2 == NULL)
        return NULL;

    fu = (PyFunctionObject *)
        Py_TYPE(self)->tp_new(Py_TYPE(self), args2, NULL);
    Py_DECREF(args2);
@@ -958,6 +1064,7 @@ frame_setstate(PyFrameObject *f, PyObject *args)
    int valid, have_locals;
    char f_executing;
    Py_ssize_t tmp;
    char *pcode;

    if (is_wrong_type(Py_TYPE(f))) return NULL;

@@ -982,6 +1089,11 @@ frame_setstate(PyFrameObject *f, PyObject *args)
"invalid code object for frame_setstate");
return NULL;
}
    pcode = PyBytes_AsString(((PyCodeObject *) f_code)->co_code);
    if (NULL == pcode)
        return NULL;
    if (*pcode == CODE_INVALID_OPCODE)
        valid = 0;  /* invalid code object, was pickled with a different version of python */

if (have_locals) {
Py_INCREF(f_locals);
106 changes: 106 additions & 0 deletions Stackless/unittests/test_pickle.py
@@ -8,8 +8,13 @@
import threading
import contextvars
import ctypes
import importlib.util
import struct
import warnings
import subprocess
import stackless

from textwrap import dedent
from stackless import schedule, tasklet

from support import test_main # @UnusedImport
@@ -20,6 +25,7 @@
# we need to make it appear that pickling them is ok, otherwise we will fail when pickling
# closures that refer to test runner instances
import copyreg
from _warnings import warn


def reduce(obj):
@@ -1346,6 +1352,106 @@ def run():
self.assertEqual(stackless.pickle_flags(), current)


class TestCodePickling(unittest.TestCase):
    def test_reduce_magic(self):
        code = (lambda :None).__code__
        reduce = stackless._stackless._wrap.code.__reduce__
        reduced = reduce(code)
        self.assertIsInstance(reduced, tuple)
        self.assertEqual(len(reduced), 3)
        self.assertIsInstance(reduced[1], tuple)
        self.assertGreater(len(reduced[1]), 0)
        self.assertIsInstance(reduced[1][0], int)
        # see Python C-API documentation for PyImport_GetMagicNumber()
        self.assertEqual(reduced[1][0], struct.unpack("<l", importlib.util.MAGIC_NUMBER)[0])

    def test_new_with_wrong_magic_error(self):
        code = (lambda :None).__code__
        reduce = stackless._stackless._wrap.code.__reduce__
        reduced = reduce(code)
        args = (reduced[1][0] + 1,) + reduced[1][1:]
        self.assertIsInstance(reduced[0](*reduced[1]), type(code))
        with self.assertRaisesRegex(RuntimeWarning, "Unpickling code object with invalid magic number"):
            with stackless.atomic():
                with warnings.catch_warnings():
                    warnings.simplefilter("error", RuntimeWarning)
                    reduced[0](*args)

    def test_new_without_magic(self):
        code = (lambda :None).__code__
        reduce = stackless._stackless._wrap.code.__reduce__
        reduced = reduce(code)
        args = reduced[1][1:]
        with self.assertRaisesRegex(RuntimeWarning, "Unpickling code object with invalid magic number 0"):
            with stackless.atomic():
                with warnings.catch_warnings():
                    warnings.simplefilter("error", RuntimeWarning)
                    reduced[0](*args)

    def test_func_with_wrong_magic(self):
        l = (lambda :None)
        code = l.__code__
        reduce = stackless._stackless._wrap.code.__reduce__
        reduced = reduce(code)
        args = (reduced[1][0] + 1,) + reduced[1][1:]
        self.assertIsInstance(reduced[0](*reduced[1]), type(code))
        with stackless.atomic():
            with warnings.catch_warnings():
                warnings.simplefilter("ignore", RuntimeWarning)
                code2 = reduced[0](*args)
        code2.__setstate__(())
        self.assertIs(type(code2), type(code))

        reduce = stackless._stackless._wrap.function.__reduce__
        reduced_func = reduce(l)
        f = reduced_func[0](*reduced_func[1])
        args = (code2,) + reduced_func[2][1:]
        with self.assertRaisesRegex(RuntimeWarning, "Unpickling function with invalid code object:"):
            with stackless.atomic():
                with warnings.catch_warnings():
                    warnings.simplefilter("error", RuntimeWarning)
                    f = f.__setstate__(args)

    def test_run_with_wrong_magic(self):
        # run this test as a subprocess, because it triggers a fprintf(stderr, ...) in ceval.c
        # and I don't like this output in our test suite.
        args = []
        if not stackless.enable_softswitch(None):
            args.append("--hard")

        rc = subprocess.call([sys.executable, "-s", "-S", "-E", "-c", dedent("""
            import stackless
            import warnings
            import sys
            sys.stderr = sys.stdout
            if "--hard" in sys.argv:
                stackless.enable_softswitch(False)
            l = (lambda :None)
            code = l.__code__
            reduce = stackless._stackless._wrap.code.__reduce__
            reduced = reduce(code)
            args = (reduced[1][0] + 1,) + reduced[1][1:]
            with stackless.atomic():
                with warnings.catch_warnings():
                    warnings.simplefilter("ignore", RuntimeWarning)
                    code2 = reduced[0](*args)
            code2.__setstate__(())
            assert(type(code2) is type(code))
            # now execute code 2, first create a function from code2
            f = type(l)(code2, globals())
            # f should raise
            try:
                f()
            except SystemError as e:
                assert(str(e) == 'unknown opcode')
            else:
                # a parenthesized (0, "msg") tuple would always be true
                assert 0, "expected exception not raised"
            sys.exit(42)
            """)] + args, stderr=subprocess.DEVNULL)
        self.assertEqual(rc, 42)
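
The last test relies on a child interpreter signalling success through a distinctive exit code. A minimal, Stackless-independent sketch of that pattern follows; the helper name `run_isolated` is made up for this example, and the interpreter flags are the ones used in the test above.

```python
import subprocess
import sys
from textwrap import dedent


def run_isolated(snippet: str, *extra_args: str) -> int:
    """Run a Python snippet in a fresh interpreter and return its exit code.

    Hypothetical helper: -s, -S and -E keep user site directories, the site
    module and PYTHON* environment variables out of the child process.
    """
    cmd = [sys.executable, "-s", "-S", "-E", "-c", dedent(snippet), *extra_args]
    return subprocess.call(cmd, stderr=subprocess.DEVNULL)
```

Exiting with an unusual code such as 42 only at the very end of the snippet distinguishes "ran to completion" from a crash or an uncaught exception, which would produce a different exit code.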


if __name__ == '__main__':
    if not sys.argv[1:]:
        sys.argv.append('-v')