GH-98831: Simple input-output stack effects #99120

gvanrossum · 2022-11-05T09:40:02Z

No arrays; no conditionals; no types; no cache effects.

Issue: Generate the interpreter #98831

iritkatriel · 2022-11-05T19:27:38Z

Python/bytecodes.c

-        // stack effect: (__0 -- )
-        inst(BINARY_OP_MULTIPLY_INT) {
+        instr(BINARY_OP_MULTIPLY_INT, (left, right -- prod)) {
+            // TODO: Don't pop from the stack before DEOPT_IF() calls.


Could you generate peeks at the beginning of the instruction and pops at the end just before the pushes?

gvanrossum · 2022-11-05

Yeah, that's why this is still a draft PR. :-)

gvanrossum · 2022-11-05T22:49:33Z

@brandtbucher @markshannon: In bytecodes.c I now get red wiggles on every use of a variable defined through a stack effect (since the PyObject *value; (etc.) is not in the instruction body any more). It's particularly annoying when it occurs in a macro like Py_DECREF(value) -- the wiggle shows on Py_DECREF. Any suggestions?

gvanrossum · 2022-11-05T23:28:42Z

@markshannon I have some questions about how ERROR_IF() should work. Your spec says "If an ERROR_IF occurs, all values will be removed from the stack." It's easy enough to add a STACK_SHRINK() call (see latest code in generated_cases.c.h), but I'm not sure about whose responsibility it should be to DECREF() those variables. In practice, in the dozen or so instructions I've converted so far, when ERROR_IF() is called the code has already called DECREF(). Example definition:

        instr(UNARY_POSITIVE, (value -- res)) {
            res = PyNumber_Positive(value);
            Py_DECREF(value);
            ERROR_IF(res == NULL, error);
        }

This expands to the following:

        TARGET(UNARY_POSITIVE) {
            PyObject *value = PEEK(1);
            PyObject *res;
            res = PyNumber_Positive(value);
            Py_DECREF(value);
            if (res == NULL) { STACK_SHRINK(1); goto error; }
            POKE(1, res);
            DISPATCH();
        }

Shall we make this part of the spec for ERROR_IF(), that you must call it after "consuming" all the inputs?
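As an illustration of the expansion shown above, here is a minimal Python sketch of how a generator could turn a simple stack effect into PEEK() declarations and POKE() stores. The function expand_instr and its parameters are hypothetical and are not taken from generate_cases.py; only the macro names (TARGET, PEEK, POKE, STACK_GROW, STACK_SHRINK, DISPATCH) come from the code quoted in this thread.

```python
def expand_instr(name, inputs, outputs, body_lines):
    """Return C source lines for one instruction case (illustrative only)."""
    lines = [f"TARGET({name}) {{"]
    # Inputs are peeked, not popped: the last input sits at PEEK(1).
    n = len(inputs)
    for i, var in enumerate(inputs):
        lines.append(f"    PyObject *{var} = PEEK({n - i});")
    # Outputs are only declared here; the body is expected to assign them.
    for var in outputs:
        if var not in inputs:
            lines.append(f"    PyObject *{var};")
    lines.extend(f"    {line}" for line in body_lines)
    # Adjust the stack pointer by the net effect, then store the outputs.
    diff = len(outputs) - len(inputs)
    if diff > 0:
        lines.append(f"    STACK_GROW({diff});")
    elif diff < 0:
        lines.append(f"    STACK_SHRINK({-diff});")
    m = len(outputs)
    for i, var in enumerate(outputs):
        lines.append(f"    POKE({m - i}, {var});")
    lines.append("    DISPATCH();")
    lines.append("}")
    return lines


# Usage: reproduce the shape of the UNARY_POSITIVE expansion.
body = [
    "res = PyNumber_Positive(value);",
    "Py_DECREF(value);",
    "if (res == NULL) { STACK_SHRINK(1); goto error; }",
]
print("\n".join(expand_instr("UNARY_POSITIVE", ["value"], ["res"], body)))
```

Because inputs stay on the stack until the end of the body, an early error exit has to shrink the stack itself, which is why the question of who DECREFs the inputs matters.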

brandtbucher · 2022-11-05T23:29:14Z

Perhaps we could make inst a variadic macro that turns a list of stack items into a declaration list? Downside is that you need some marker to separate in/out parameters, and handling braces might get trickier.

On my phone now, but something like:

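/* Every stack item is a PyObject pointer; the typedef lets a single
   declaration list declare all of the names as pointers. */
typedef PyObject *_dummy_stack_item;

/* Open the case and declare each listed name as a local variable. */
#define inst_begin(NAME, ...)          \
    case (NAME): {                     \
        _dummy_stack_item __VA_ARGS__;

/* Close the block opened by inst_begin. */
#define inst_end }

/* "_" is the marker separating inputs (lhs, rhs) from outputs (res). */
inst_begin(BINARY_OP, lhs, rhs, _, res)
    // Implementation goes here...
inst_end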

iritkatriel · 2022-11-06T07:56:17Z

Python/bytecodes.c

-            if (TOP() == NULL) {
-                goto error;
-            }
+            ERROR_IF(TOP() == NULL, error);


TOP() --> res

gvanrossum · 2022-11-07T08:11:23Z

Some notes after converting a few basic instructions (and failing to convert a few outliers).

- Converting "raw" instruction definitions (with PUSH()/POP() or custom stack operations) to streamlined DSL with input and output stack effects is a slow manual process that requires careful review (e.g. the mistake that Irit found).
- So far, every few instructions I converted required changes to the code generator.
- The code generator needs to be refactored to make future changes easier.
- I haven't even started to think about how to implement array and conditional stack effects or cache streams.
- The families are currently not read by the code generator. We can address this once we need them.
- ~~For some reason I no longer see red wiggly underlines in bytecodes.c.~~ (EDIT: After closing and reopening the file they are back.)
- There are some instructions that don't seem to fit in the DSL.
  - PUSH_NULL must wait until I've implemented types, since it pushes a NULL.
  - LIST_APPEND and SET_ADD dig up a stack entry that occurs 'oparg' deep.
  - Opcodes like BINARY_SUBSCR_ADAPTIVE are problematic since they have special exits (DISPATCH_SAME_OPARG and GO_TO_INSTRUCTION). We may have to rethink such exits.

At this point I think the way forward is to merge this and then iterate, leaving the hardest cases for last.

markshannon · 2022-11-07

No need to justify not adding more features; smaller PRs are better. There is no need to convert all the instructions at once.
Let's add features to the code generator as we actually need them.

Have you benchmarked this?
Some of the instructions, particularly the BINARY_OP ones, have been quite sensitive to minor code re-orderings.

Wiggly lines can be fixed by adding dummy static definitions to the top of bytecodes.c.

markshannon · 2022-11-07T10:30:39Z

Python/bytecodes.c

            JUMPBY(INLINE_CACHE_ENTRIES_BINARY_OP);
        }

-        // stack effect: (__0 -- )
-        inst(BINARY_OP_INPLACE_ADD_UNICODE) {
+        // This is a weird one. It's a super-instruction for


Maybe drop the "This is a weird one."
It is unusual, but it's there for a good reason, which is to maintain the historical behavior that s += ... in a loop is not quadratic.
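For context, the loop pattern this comment refers to can be sketched in Python as below. This is only an illustration: the function name build is hypothetical, and growing the string in place (when the local variable holds the only reference) is a CPython implementation detail rather than a language guarantee.

```python
def build(n):
    """Build a string of n characters by repeated in-place concatenation."""
    s = ""
    for _ in range(n):
        # This is the "s += ..." pattern the super-instruction is meant to
        # keep fast: without in-place growth, each step would copy all of s.
        s += "x"
    return s


print(len(build(10_000)))
```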

markshannon · 2022-11-07T10:33:12Z

Tools/cases_generator/generate_cases.py

-    predictions = set()
-    for inst in instrs:
-        for target in re.findall(r"(?:PREDICT|GO_TO_INSTRUCTION)\((\w+)\)", inst.block.text):
+def write_instr(instr: InstDef, predictions: set[str], indent: str, f: TextIO, dedent: int = 0):


Note for future PRs: we need to factor out the three parts:

- analysis
- translation
- output
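A rough sketch of what such a three-part split could look like, in Python. All names here (Instr, analyze, translate, output) are hypothetical and do not describe the actual layout of generate_cases.py; only the PREDICT/GO_TO_INSTRUCTION regex is taken from the diff above.

```python
import io
import re
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    body: str  # C text of the instruction body


def analyze(instrs):
    """Analysis: collect every instruction named in PREDICT/GO_TO_INSTRUCTION."""
    predictions = set()
    for instr in instrs:
        predictions.update(
            re.findall(r"(?:PREDICT|GO_TO_INSTRUCTION)\((\w+)\)", instr.body)
        )
    return predictions


def translate(instr, predictions):
    """Translation: turn one definition into a list of C lines."""
    lines = [f"TARGET({instr.name}) {{"]
    if instr.name in predictions:
        # Only predicted instructions get a PREDICTED() marker.
        lines.append(f"    PREDICTED({instr.name});")
    for line in instr.body.splitlines():
        if line.strip():
            lines.append(f"    {line.strip()}")
    lines.append("}")
    return lines


def output(all_lines, f):
    """Output: write the translated lines to a text stream."""
    for lines in all_lines:
        for line in lines:
            f.write(line + "\n")
        f.write("\n")


# Usage with two made-up instructions A and B.
instrs = [Instr("A", "PREDICT(B);\nDISPATCH();"), Instr("B", "DISPATCH();")]
preds = analyze(instrs)
buf = io.StringIO()
output([translate(i, preds) for i in instrs], buf)
print(buf.getvalue())
```

Keeping analysis separate means translation never has to re-scan other instructions, and output never needs to know about the DSL.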

markshannon · 2022-11-07T10:34:59Z

Tools/cases_generator/generate_cases.py

+    # Write the body
+    ninputs = len(instr.inputs or ())
+    for line in blocklines:
+        if m := re.match(r"(\s*)ERROR_IF\(([^,]+), (\w+)\);\s*$", line):


See comment in generated_cases.c.h about introducing code into the if (cond) goto... code.
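For reference, the rewriting step under discussion can be sketched in Python as follows. The regex is the one from the diff above; the helper name expand_error_if and the zero-input fallback are assumptions made for this sketch, and the expansion shape follows the generated code quoted elsewhere in this thread.

```python
import re

# Regex quoted in the diff: indentation, condition (no commas), label.
ERROR_IF_RE = re.compile(r"(\s*)ERROR_IF\(([^,]+), (\w+)\);\s*$")


def expand_error_if(line, ninputs):
    """Rewrite one ERROR_IF() line; return other lines unchanged."""
    m = ERROR_IF_RE.match(line)
    if not m:
        return line
    space, cond, label = m.groups()
    if ninputs:
        # Inputs were only peeked, so the error path must drop them.
        return f"{space}if ({cond}) {{ STACK_SHRINK({ninputs}); goto {label}; }}"
    # Assumption: with no inputs there is nothing to shrink.
    return f"{space}if ({cond}) goto {label};"


print(expand_error_if("    ERROR_IF(res == NULL, error);", 1))
```

Note that a condition containing a comma, such as a call with two arguments, does not match this regex and is left untouched.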

markshannon · 2022-11-07T10:42:13Z

Python/generated_cases.c.h

            Py_DECREF(container);
+            if (res == NULL) { STACK_SHRINK(3); goto error; }


Anything more than if (cond) goto ... introduces extra jumps around the conditional block and may slow things down.
E.g.

if (res == NULL) { STACK_SHRINK(3); goto error; }

will be lowered to something like:

    if (res != NULL) goto next;
    STACK_SHRINK(3);
    goto error;
next:

The C compiler might move the STACK_SHRINK(3); goto error; out of line, but I think it is better to do this in the code generator. Something like:

    if (res == NULL) goto pop3_error;
    ...
pop3_error:
    STACK_SHRINK(1);
pop2_error:
    STACK_SHRINK(1);
pop_error:
    STACK_SHRINK(1);
error:
    ...
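On the generator side, this suggestion could be sketched in Python roughly as follows. The function names are hypothetical; the label names (pop_error, pop2_error, pop3_error) follow the example in this comment.

```python
def pop_label(npops, label="error"):
    """Name of the shared label that pops npops items before `label`."""
    if npops == 0:
        return label
    if npops == 1:
        return f"pop_{label}"
    return f"pop{npops}_{label}"


def error_if_goto(cond, ninputs, label="error"):
    """A bare `if (cond) goto ...;` with no extra code in the branch."""
    return f"if ({cond}) goto {pop_label(ninputs, label)};"


def error_label_chain(max_pops, label="error"):
    """Shared labels that fall through, each popping one stack item."""
    lines = []
    for n in range(max_pops, 0, -1):
        lines.append(f"{pop_label(n, label)}: STACK_SHRINK(1);")
    # The existing error-handling code would follow this final label.
    lines.append(f"{label}:")
    return lines


print(error_if_goto("res == NULL", 3))
print("\n".join(error_label_chain(3)))
```

The chain is emitted once per interpreter loop, so each instruction's error check stays a single conditional jump.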

gvanrossum · 2022-11-08T16:22:00Z

I have benchmark results (thanks @brandtbucher!). Bottom line, it's a wash.

We compared three commits:

- baseline (just before bytecodes.c &c created): 2cfcaf5 gh-98999: Raise ValueError in _pyio on closed buffers (#99009)
- first improvement in bytecodes.c by @markshannon: f4adb97 GH-96793: Implement PEP 479 in bytecode. (#99006)
- last commit in this PR: 48c2a20 (HEAD -> stack-effect, gvanrossum/stack-effect) Add dummy variables for stack effects

The second and third are both 1% faster than the baseline. This suggests that there is no measurable effect from just this PR, or from the creation of super-instructions (which was merged into this commit from main, but not included in Mark's PEP 479 changes).
output.txt

I am going ahead with merging this.

gvanrossum added the skip news label Nov 5, 2022

gvanrossum requested review from markshannon and brandtbucher November 5, 2022 09:40

bedevere-bot mentioned this pull request Nov 5, 2022

Generate the interpreter #98831

Closed

bedevere-bot added the awaiting core review label Nov 5, 2022

gvanrossum marked this pull request as draft November 5, 2022 09:40

gvanrossum force-pushed the stack-effect branch from 0359940 to cc86e73 Compare November 5, 2022 09:44

Simple input-output stack effects

8868b82

No arrays; no conditionals; no types; no cache effects.

gvanrossum force-pushed the stack-effect branch from cc86e73 to 8868b82 Compare November 5, 2022 10:08

iritkatriel reviewed Nov 5, 2022

gvanrossum added 3 commits November 5, 2022 15:00

Use PEEK instead of POP; STACK_GROW; etc.

069cfb4

Use POKE(i, v)

51db97b

Use ERROR_IF()

700dd3a

Expand ERROR_IF(), to include SHRINK_STACK()

4c990fd

gvanrossum added 4 commits November 5, 2022 20:03

Use a varargs inst() macro

2fa6062

Do a few more instructions

73fab62

BINARY_OP_INPLACE_ADD_UNICODE

f23b6fb

Use STACK_SHRINK as needed

fa507a2

gvanrossum force-pushed the stack-effect branch from 0712642 to fa507a2 Compare November 6, 2022 06:48

gvanrossum added 4 commits November 6, 2022 00:13

BINARY_OP_ADD_FLOAT, BINARY_OP_ADD_INT

1198502

Move PREDICTED() up (bad bug)

ea277ce

BINARY_SUBSCR

fe2767c

BINARY_SLICE

e737bcb

iritkatriel reviewed Nov 6, 2022

gvanrossum added 3 commits November 6, 2022 02:12

Bugfix (thanks to Irit)

19b989c

Fix regex nit

b8b4f15

Merge main (TODO: PyObject *value etc.)

43e7796

gvanrossum added 4 commits November 6, 2022 11:30

Fix code generated for super instructions

62c67b0

STORE_SLICE

5fbe7b0

Refactor parser to be more type-correct

fba1de2

Also detect whether names that occur in both inputs and outputs are at the same position.

STORE_SUBSCR

80741b3

gvanrossum marked this pull request as ready for review November 7, 2022 08:11

markshannon reviewed Nov 7, 2022

gvanrossum added 2 commits November 7, 2022 15:47

Fewer jumps in ERROR_IF expansion

415757f

Add dummy variables for stack effects

48c2a20

gvanrossum merged commit f1a6546 into python:main Nov 8, 2022

bedevere-bot removed the awaiting core review label Nov 8, 2022

gvanrossum deleted the stack-effect branch November 8, 2022 16:23
