Description
This is the top level issue for the tier 2 optimizer for CPython 3.13 and beyond.
Try to keep discussion here at a high level and discuss the details on the sub-issues.
The tier 2 optimizer has always promised to optimize larger regions than the tier 1 (PEP 659) optimizer. But we have been a bit vague as to what those regions would be.
In an earlier discussion, I referred to them as "projected short traces".
The term "trace" is a bit misleading, as it suggest some sort of recording of program execution.
The optimization I propose is more akin to basic block versioning, than the trace recording of PyPy.
However, instead of basic blocks, we would be optimizing dynamic superblocks.
The extent of the superblocks would be determined at runtime from profiling data gathered by the tier 1 interpreter.
The term "superblocks" might also be a bit misleading as they might include inlined calls, but I it's the best name I could come up with for now. We could call the tier 2 optimization, "superblock versioning", as we intend to handle polymorphism in much the same way as BBV.
For this to work, we need to be able to do the following:
- Modify the tier 1 interpreter to detect "hotspots", which determine where the dynamic superblocks start (see the hotspot sketch after this list).
- Create the superblocks (see the projection sketch after this list).
- Optimize the superblocks (it might make sense to merge some or all of this with the creation of the superblocks)
- Deoptimize the superblocks (to enable more speculation in the optimizer)
- Manage the superblocks, discarding cold ones and keeping total memory use bounded (see the eviction sketch after this list).
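For the hotspot detection step, one plausible scheme (a sketch under assumptions, not the actual tier 1 implementation; `project_superblock` and `enter_superblock` are hypothetical, and the threshold value is a guess) is to count backward jumps, since hot code almost always involves a loop:

```c
#include <stdint.h>

typedef struct Superblock Superblock;   /* from the sketch above */

#define SUPERBLOCK_THRESHOLD 4096       /* tuning knob; value is a guess */

/* Hypothetical hooks into the superblock creator and executor. */
Superblock *project_superblock(const uint8_t *loop_head);
void enter_superblock(Superblock *sb);

/* Called by the tier 1 interpreter on each backward jump.  The
   counter would live in the jump instruction's inline cache, so
   each loop is profiled independently. */
static void on_backward_jump(uint16_t *counter, const uint8_t *loop_head)
{
    if (++*counter < SUPERBLOCK_THRESHOLD) {
        return;                         /* not hot yet; stay in tier 1 */
    }
    *counter = 0;
    Superblock *sb = project_superblock(loop_head);
    if (sb != NULL) {
        enter_superblock(sb);           /* run the optimized code */
    }
}
```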
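Creation could then be a greedy projection pass: starting from the hot spot, walk the tier 1 bytecode, follow the direction that profiling data says each branch usually takes, emit a guard for every such assumption, and stop when the superblock gets too long or reaches an instruction tier 2 cannot handle. Again a sketch only; all the helpers here are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct Superblock Superblock;

/* Hypothetical queries against tier 1 profiling data. */
bool is_branch(uint8_t opcode);
bool branch_usually_taken(const uint8_t *instr);
bool supported_in_tier2(uint8_t opcode);

/* Hypothetical emitters and bytecode navigation helpers. */
void emit_guard(Superblock *sb, const uint8_t *instr, bool taken);
void emit_uops_for(Superblock *sb, const uint8_t *instr);
void emit_exit(Superblock *sb, const uint8_t *instr);
const uint8_t *branch_target(const uint8_t *instr);
const uint8_t *next_instr(const uint8_t *instr);

#define MAX_SUPERBLOCK_LENGTH 256

/* Project straight-line code starting at `instr`, appending
   micro-ops (and a guard for each speculated branch) to `sb`. */
static void project(const uint8_t *instr, Superblock *sb)
{
    for (int i = 0; i < MAX_SUPERBLOCK_LENGTH; i++) {
        uint8_t op = *instr;
        if (!supported_in_tier2(op)) {
            emit_exit(sb, instr);       /* bail back to tier 1 here */
            return;
        }
        if (is_branch(op)) {
            /* Speculate on the common direction; the guard's side
               exit covers the uncommon one. */
            bool taken = branch_usually_taken(instr);
            emit_guard(sb, instr, taken);
            instr = taken ? branch_target(instr) : next_instr(instr);
        }
        else {
            emit_uops_for(sb, instr);   /* translate one instruction */
            instr = next_instr(instr);
        }
    }
    emit_exit(sb, instr);               /* length limit reached */
}
```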
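For management, the simplest policy that satisfies the last bullet is a fixed budget with eviction of the least-executed superblock. A sketch, where the policy and names are illustrative (a decaying counter would probably beat a raw count):

```c
#include <stddef.h>

/* Minimal view of the Superblock from the first sketch. */
typedef struct Superblock {
    int execution_count;
    /* ... micro-ops ... */
} Superblock;

void free_superblock(Superblock *sb);   /* hypothetical */

#define MAX_SUPERBLOCKS 1024            /* memory budget, illustrative */

static Superblock *table[MAX_SUPERBLOCKS];
static size_t n_superblocks;

/* Called before adding a new superblock: if the budget is full,
   evict the one that has executed least (a crude "coldest"
   heuristic). */
static void make_room(void)
{
    if (n_superblocks < MAX_SUPERBLOCKS) {
        return;
    }
    size_t coldest = 0;
    for (size_t i = 1; i < n_superblocks; i++) {
        if (table[i]->execution_count < table[coldest]->execution_count) {
            coldest = i;
        }
    }
    free_superblock(table[coldest]);
    table[coldest] = table[--n_superblocks];    /* swap-remove */
}
```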