Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-46409: Make generators in bytecode #30633

Merged
merged 20 commits into from
Jan 20, 2022

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Jan 17, 2022

Adds a RETURN_GENERATOR bytecode which makes a generator (or coroutine) from the current frame.
This has a number of advantages:

  1. Generator frames cannot be incomplete, as the frame is fully initialized before creating the generator
  2. It simplifies the code for calling a Python function, as there is no need for a special case for generator functions.
  3. It removes the need to specialize calls to generator functions, they can be specialized as any other call to a Python function.

This PR also adds a JUMP_NO_INTERRUPT bytecode. This is the same as JUMP_ABSOLUTE without any checks for interrupts. Checking for interrupts in yield from would destroy the flaky pretense that Python coroutines are stackful. (They are not, but PEPs 380 and 489 like to pretend that they are), so we need the extra, seemingly redundant instruction.

We use RETURN_GENERATOR rather than MAKE_GENERATOR; RETURN_VALUE as the RETURN_VALUE would make the compiler treat all the following code as unreachable. We could fix that using artificial instructions, but I think that is best left to another PR.

https://bugs.python.org/issue46409

@markshannon
Copy link
Member Author

Fixes https://bugs.python.org/issue46389 and https://bugs.python.org/issue46374, but does not include tests for those issues.
I'll add the tests in separate PRs to close those issues.

@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jan 17, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @markshannon for commit c33abbf 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jan 17, 2022
@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jan 18, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @markshannon for commit 243e441 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jan 18, 2022
Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still trying to follow the rest of the C code here, but in case I fail, here's the rest of my review. :-)

Python/specialize.c Show resolved Hide resolved
Python/specialize.c Show resolved Hide resolved
@@ -0,0 +1,2 @@
Add new bytecode to make generators. Simplifies calling Python functions in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd name the new bytecode(s) explicitly here.

Objects/frameobject.c Outdated Show resolved Hide resolved
@@ -13,7 +13,6 @@ extern "C" {
and coroutine objects. */
#define _PyGenObject_HEAD(prefix) \
PyObject_HEAD \
/* Note: gi_frame can be NULL if the generator is "finished" */ \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question about gi_iframe. Below (L31) it is defined as an array of length 1 of object pointers. But everywhere it's used, it is cast to InterpreterFrame *. That's not an object or object pointer AFAICT. So what's going on here? Is the declaration wrong? An intentional lie?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a lie.

We don't want to expose InterpreterFrame in public headers, so we can't do the sensible thing and declare gi_frame as:

    InterpreterFrame gi_frame;

as C won't allow incomplete types in structs, even as the last member.

We could improve this by breaking up the header, so the public API sees a different definition. Still a lie, but a more elegant one.
Public header:

typedef struct {
    _PyGenObject_HEAD(gi)
} PyGenObject;

Private header:

typedef struct {
    _PyGenObject_HEAD(gi)
    InterpreterFrame gi_frame;
} PyGenObject;

Probably best done in a different PR, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Any type other than PyObject* would be better though :-). And if you replace the casts with a macro it’s easier to fix later. I have an idea for the macro but it’s too painful to type on a phone.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first field of InterpreterFrame is PyFunctionObject * and the first few fields are all PyObject *, so declaring gi_frame as PyObject * gi_frame[1] seemed safe. Do let me know what your macro is, once you at a PC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do let me know what your macro is, once you at a PC.

See https://bugs.python.org/issue40120#msg365465

Python/ceval.c Show resolved Hide resolved
Doc/library/dis.rst Show resolved Hide resolved
Doc/library/dis.rst Show resolved Hide resolved
Lib/test/test_generators.py Outdated Show resolved Hide resolved
@gvanrossum
Copy link
Member

PS. What do you mean by "coroutines are stackful"? (I was hoping that reading the code would clarify this, but either I didn't get to that point in the code yet or it's hidden by details.)

Objects/genobject.c Outdated Show resolved Hide resolved
Objects/genobject.c Outdated Show resolved Hide resolved
@markshannon
Copy link
Member Author

PS. What do you mean by "coroutines are stackful"? (I was hoping that reading the code would clarify this, but either I didn't get to that point in the code yet or it's hidden by details.)

Stackful coroutines have their own stack, like Lua coroutines or greenlets.
async def coroutines are "stackless", since they do not have their own stack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants