Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-104635: Add NO_EXCEPTION flag to opcode metadata #106394

Closed
wants to merge 3 commits into from

Conversation

corona10
Copy link
Member

@corona10 corona10 commented Jul 4, 2023

This PR is motivated from comments from @iritkatriel and @carljm.

There are several code patterns that can cause multiple stores to the same locals index in near succession,
...
Also need to make sure there isn't something between them that could raise an exception. We don't currently have this, > we probably need to add this in the opcode_metadata.

But I'm not sure if this is the concept Irit originally intended, so I write the PR as the PoC with heuristic codes that will not raise exceptions.
If there is something I missed, please let me know.

We can use this metadata for aggressive optimization between two opcodes that will not raise exceptions.

@corona10 corona10 requested review from carljm and iritkatriel July 4, 2023 02:25
@corona10 corona10 changed the title gh-104635: add NO_SIDE_EFFECT flag to opcode metadata gh-104635: Add NO_EXCEPTION flag to opcode metadata Jul 4, 2023
@corona10 corona10 requested a review from markshannon as a code owner July 4, 2023 02:48
@corona10 corona10 force-pushed the gh-104635-no-side-effect branch from 4d2ac67 to 4025d49 Compare July 4, 2023 07:07
@corona10 corona10 force-pushed the gh-104635-no-side-effect branch from 4025d49 to 47b2995 Compare July 4, 2023 07:08
@iritkatriel
Copy link
Member

What is the heuristic to determine whether an opcode raises?

@corona10
Copy link
Member Author

corona10 commented Jul 5, 2023

What is the heuristic to determine whether an opcode raises?

I thought that we should use a blocklist approach or allowlist approach for classifying opcodes.
In the current implementation, I used a blocklist approach.

  1. if the opcode calls any APIs that begin with _Py or Py (including Py_Incref), it has the possibility of causing an exception.
  2. Another case is ERROR_IF, it has the possibility of an error exception before the macro is called.

The reason for skipping JUMPXXX opcode as an exceptional case is that the current implementation does not pass the assertion of the analyzer. I should analyze why this happening but before that, I wanted to sync the idea of this metadata before digging into it.

Copy link
Member

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me that "exception-raising" status of an opcode changes enough that we need to automatically determine this metadata via heuristic on the opcode implementation, rather than just manually maintain it.

I'm not necessarily opposed, if we can find a sufficiently conservative heuristic that does not fail in the unsafe direction. But I'm not sure I see any such possible heuristic, given my inline comment.

@@ -252,6 +252,7 @@ class InstructionFlags:
HAS_CONST_FLAG: bool
HAS_NAME_FLAG: bool
HAS_JUMP_FLAG: bool
NO_EXCEPTION_FLAG: bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Negative flags tend to lead to confusing double-negative code constructs. So I would prefer for the flag to be CAN_RAISE_FLAG than NO_EXCEPTION_FLAG, and similar throughout the PR.

if token.kind == "IDENTIFIER":
if token_text == "error_if":
return False
if token_text.startswith("py") or token_text.startswith("_py"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are static helper functions in ceval.c that are used by opcodes and not Py prefixed, and some of these can raise. E.g. match_class. We can special case the existing ones, but someone could add a new one at any time. So I don't think this heuristic is safe, and I'm not sure I see any possible heuristic that would be safe in the face of possible future changes.

Copy link
Member

@carljm carljm Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the match_class case should be covered by ERROR_IF used afterward to check if _PyErr_Occurred. But what if someone calls a non Py prefixed helper function and then uses a manual goto error; rather than ERROR_IF? There are some opcodes that use goto error; directly...

Maybe checking for both ERROR_IF and goto error would be adequate, and we wouldn't even need to check for use of Py prefixed API? For an opcode implementation to handle an error correctly, it seems it needs to goto error or ERROR_IF. Unless someone introduces another new macro that includes goto error, that would break the heuristic.

Existing macros that implicitly goto error include CHECK_EVAL_BREAKER, DECREF_INPUTS_AND_REUSE_FLOAT, and INSTRUMENTED_JUMP.

@corona10 corona10 closed this Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants