Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keeping 'stack_pointer` in sync with frame->stacktop is possibly too tricky #94666

Open
ericsnowcurrently opened this issue Jul 7, 2022 · 7 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@ericsnowcurrently
Copy link
Member

While considering the situation of gh-94215, I realized there is a broader concern worth discussing, of which that issue is a specific instance. FWIW, this isn't some massive, pervasive, nor high-priority problem that needs immediate attention. It is worth consideration at some point, though, as it's bitten us already and I expect it will bite us again.

Context:

In the eval loop (in ceval.c), we copy some interpreter/thread/frame state into local variables, to reduce overhead a little. Of these variables, several get modified during execution. One of them, stack_pointer, must be kept in sync with the frame->stacktop, which can be modified via _PyFrame_SetStackPointer(), _PyFrame_StackPop(), and _PyFrame_StackPush(). We pass stack_pointer in various calls (e.g. vectorcall) throughout the eval loop, so it is important that it stay in sync, at least at the points where we use it. (I covered this a bit more in the other issue.)

The Concern:

From where I'm at, it's hard to have a lot of confidence that we are preserving the invaraiant (that stack_pointer is properly in sync with the frame state) in all cases. There doesn't seem to be any structural mechanism by which we get guarantees (relative to use of _PyFrame_SetStackPointer(), etc.). Instead it seems like knowing when to synchronize stack_pointer, and when we can avoid unnecessary synchronization, currently requires deep knowledge of a lot of pieces.

Consequently, it seems like it is relatively easy to break the invariant accidentally. The fact that it happened to someone as knowledgeable as @markshannon really says a lot.

Am I out of touch here? If not, is it worth doing something about all that?

CC @iritkatriel @tiran @markshannon @pablogsal

@ericsnowcurrently ericsnowcurrently added the type-bug An unexpected behavior, bug, or error label Jul 7, 2022
@ericsnowcurrently
Copy link
Member Author

Some possible "solutions":

  • leave it alone (the performance benefit is enough to offset the very infrequent mistakes?)
  • add tests that catch the mistakes somehow
  • add asserts to every use of stack_pointer to check synchronization
  • add equivalent checks under Py_DEBUG
  • drop the local variable and use _PyFrame_GetStackPointer() directly instead
  • add stack_pointer = _PyFrame_SetStackPointer(frame); right before every call/use of stack_pointer (which makes the local irrelevant)
  • add some mechanism that ties _PyFrame_SetStackPointer(), etc. to the local variable somehow (unrealistic)

@encukou
Copy link
Member

encukou commented Jul 8, 2022

For starters, it would be nice to properly define the invariant. There are obviously cases when stack_pointer goes out of sync with frame->stacktop, and currently it's not clear if they are intentional – and if so, when the eventual sync is expected.
Same goes for next_instr/frame->prev_instr (and anything else that'll be made local in the future).

@markshannon
Copy link
Member

markshannon commented Jul 25, 2022

The invariant is well defined.
For years, the invariant has been that if the stack-pointer stored in the frame was invalid (used to be NULL, now its -1) then the stack pointer was being handled in the interpreter, _PyEval_EvalFrameDefault.

https://github.com/python/cpython/blob/main/Python/ceval.c#L1706 explains why the in-memory stack offset is made invalid when the stack pointer is held in a register.

There are obviously cases when stack_pointer goes out of sync with frame->stacktop

When are they out of sync?

@markshannon
Copy link
Member

I should say that the stack pointer is (along with the instruction pointer) the hottest variable int the entire VM and needs to be in a register. No harm in adding more asserts, though.

@ericsnowcurrently
Copy link
Member Author

When are they out of sync?

See #94215 (and my comment at #94215 (comment)).

@markshannon
Copy link
Member

Where in the code?

They can only be out of sync when both are valid and reachable.
In other words, it can only happen in _PyEval_EvalFrameDefault when frame->stacktop >= 0.
That should only happen in a few places dealing with tracing and the like.

@brandtbucher
Copy link
Member

brandtbucher commented Oct 13, 2022

Looking at the code, it appears that we don't set frame->stacktop = -1; after TRACE_FUNCTION_ENTRY();. I don't know if this is intentional or not, but it seems incorrect to me.

@iritkatriel iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants