GH-96793: Specialize FOR_ITER for generators. #98772
There is a bug in this. It doesn't set the
https://github.com/python/cpython/compare/main...faster-cpython:cpython:specialize-for-iter-gen-handle-exc-stack?expand=1
@@ -110,6 +111,7 @@ _PyFrame_InitializeSpecials(
     frame->frame_obj = NULL;
     frame->prev_instr = _PyCode_CODE(code) - 1;
     frame->is_entry = false;
+    frame->yield_offset = 0;
Why is this called "yield_offset"? Is it not the relative offset to jump by when the generator is exhausted (the same as the arg to FOR_ITER)?
Because it's the offset for YIELD_VALUE to return to, relative to RETURN_VALUE:
https://github.com/python/cpython/pull/98772/files#diff-c22186367cbe20233e843261998dc027ae5f1f8c0d2e778abfa454ae74cc59deR2087
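For readers who want to see where the instructions named in this exchange appear, here is a minimal sketch using the standard `dis` module. It does not show `yield_offset` itself, which is an internal C frame field; it only lists the bytecode of a generator and of a loop that consumes it. The names `gen`, `consume`, and `opnames` are hypothetical, and exact opcode names vary between CPython versions (for example, the instruction that ends a generator may not be `RETURN_VALUE` on every version).

```python
import dis


def gen():
    # A generator: each `yield` compiles to a YIELD_VALUE instruction.
    yield 1
    yield 2


def consume():
    # The `for` loop over gen() compiles to a FOR_ITER instruction,
    # which is the instruction this PR specializes for generators.
    total = 0
    for item in gen():
        total += item
    return total


def opnames(func):
    # Collect the (non-specialized) opcode names used by a function.
    return {ins.opname for ins in dis.get_instructions(func)}


print("consumer uses FOR_ITER:", "FOR_ITER" in opnames(consume))
print("generator uses YIELD_VALUE:", "YIELD_VALUE" in opnames(gen))
print("generator return-like opcodes:",
      sorted(name for name in opnames(gen) if name.startswith("RETURN")))
```

Running `dis.dis(consume)` prints the full listing, including the jump argument of `FOR_ITER` that the question above refers to.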
You can run https://github.com/python/pyperformance/blob/main/pyperformance/data-files/benchmarks/bm_generators/run_benchmark.py. It is a microbenchmark designed specifically to benchmark generators.
I'll run that benchmark if pyperformance ever does another release.
I'm seeing a 35% speedup on this benchmark:

Tree iterator

import timeit

class Tree:
    def __init__(self, left, value, right):
        self.left = left
        self.value = value
        self.right = right

    def __iter__(self):
        if self.left:
            for item in self.left:
                yield item
        yield self.value
        if self.right:
            for item in self.right:
                yield item

def tree(input: range) -> Tree | None:
    n = len(input)
    if n == 0:
        return None
    i = n // 2
    return Tree(tree(input[:i]), input[i], tree(input[i + 1:]))

def setup():
    global iterable
    assert list(tree(range(10))) == list(range(10))
    iterable = tree(range(100000))

print(timeit.timeit("for _ in iterable: pass", "setup()", globals=globals(), number=10))

Note that this uses

And a 36% speedup on this one:

Flat iterator

import timeit

class RangeWrapper:
    def __init__(self, n):
        self.r = range(n)

    def __iter__(self):
        for item in self.r:
            yield item

def setup():
    global iterable
    iterable = RangeWrapper(1000000)

print(timeit.timeit("for _ in iterable: pass", "setup()", globals=globals(), number=10))

Comparing the fastest of 6 runs for main and this PR.
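A "fastest of 6 runs" comparison can be sketched with `timeit.repeat`, as below. This is only an illustration under stated assumptions: the helper names `fastest_of` and `speedup_percent` are hypothetical, the workload is shrunk so it finishes quickly, and the speedup formula is one common definition, since the thread does not say how the percentages were computed.

```python
import timeit


class RangeWrapper:
    # Same shape as the flat-iterator benchmark, with a smaller range.
    def __init__(self, n):
        self.r = range(n)

    def __iter__(self):
        for item in self.r:
            yield item


def fastest_of(stmt, *, repeat=6, number=10, namespace=None):
    # Run the statement `repeat` times (each run executes it `number`
    # times) and keep the minimum, which is least affected by noise.
    times = timeit.repeat(stmt, repeat=repeat, number=number, globals=namespace)
    return min(times)


def speedup_percent(baseline_seconds, candidate_seconds):
    # Assumed definition: how much faster the candidate is than the
    # baseline, as a percentage of the candidate's time.
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0


iterable = RangeWrapper(10_000)
best = fastest_of("for _ in iterable: pass", namespace={"iterable": iterable})
print(f"fastest of 6 runs: {best:.6f} s")
```

To compare two builds, the same script would be run once under each interpreter and the two minima passed to `speedup_percent`.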
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
Performance results seem to be just noise. I suspect there aren't enough uses of generators in the benchmark suite for this to make a difference.
Linked issue #96793: … for gen():, and awaiting coroutines, await coro()