
Adds stats for the tier 2 optimizer #109329


Closed
markshannon opened this issue Sep 12, 2023 · 8 comments
Comments

@markshannon
Member

markshannon commented Sep 12, 2023

Currently we have no stats for anything regarding the tier 2 optimizer.
Without them we are making too many guesses about what we should be doing.

The performance numbers tell us that things aren't working as well as they should, although they aren't working too badly either.
However, performance numbers tell us nothing about why that is, or what is happening.

For example, #109038 should have increased the important ratio of (number of uops executed)/(traces started) but we have no idea if it actually did.

We need the following stats soon (a rough counter sketch follows the lists below):

  • Total micro-ops executed
  • Total number of traces started
  • Total number of traces created
  • Optimization attempts

The following stats would also be nice, but are less urgent:

  • Per-uop execution counts, like we have for tier 1 instructions.
  • Exit reason counts: polymorphism vs branch mis-prediction
  • A histogram of uops executed per trace
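
A minimal sketch of the kind of counters described above; this is not the actual pystats layout, and all names here (`Tier2Stats`, `OPT_STAT_INC`, the field names) are hypothetical:

```c
/* Hypothetical sketch only -- not the real pystats structures. */
#include <stdint.h>

typedef struct {
    uint64_t uops_executed;          /* total micro-ops executed in tier 2 */
    uint64_t traces_started;         /* entries into tier 2 execution */
    uint64_t traces_created;         /* traces successfully projected */
    uint64_t optimization_attempts;  /* times the optimizer was invoked */
} Tier2Stats;

static Tier2Stats _tier2_stats;

/* Incremented from the tier 2 interpreter, analogous to STAT_INC. */
#define OPT_STAT_INC(field) (_tier2_stats.field++)
```

With counters like these, the ratio mentioned above is just uops_executed / traces_started, computed when the stats are dumped and summarized.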


@brandtbucher
Member

We've been collecting and dumping stats for all of the counters in your first list for three months now... but they seem to be ignored in the summarize_stats.py script:

  • Optimization uops executed
  • Optimization traces executed
  • Optimization traces created
  • Optimization attempts

@mdboom, maybe you know why these aren't being included in the markdown summaries? From a quick skim, it looks like it might be because we ignore any counters that don't start with "Calls to", "Frame", "GC", or "Object". Maybe we should rework this to not ignore new counters?

@brandtbucher
Member

I'd also like to see the reasons why trace projection stopped (see the counter sketch after this list):

  • Trace too long (anecdotally, I think this is the leading reason... for example, all five of nbody's traces should be 68-286 uops, but we cap them at 64, which means no loops are closed)
  • Unsupported opcode (maybe even with counters for each offender)
  • Inner loop found
  • Too many frame pushes (currently capped at a depth of 5)
  • Too many frame pops (if we return from the original frame)
  • Recursive function call (I don't think we bail on mutual recursion, but detecting that shouldn't be too hard since we have all of the code objects handy)
  • Call to unknown code object
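
A minimal sketch of what per-reason counters could look like; the enum values and names below are hypothetical, not the actual implementation:

```c
/* Hypothetical sketch: one counter per reason trace projection stopped. */
#include <stdint.h>

typedef enum {
    STOP_TRACE_TOO_LONG,
    STOP_UNSUPPORTED_OPCODE,
    STOP_INNER_LOOP,
    STOP_TOO_MANY_FRAME_PUSHES,
    STOP_TOO_MANY_FRAME_POPS,
    STOP_RECURSIVE_CALL,
    STOP_UNKNOWN_CALLEE,
    STOP_REASON_COUNT
} TraceStopReason;

static uint64_t _stop_reason_counts[STOP_REASON_COUNT];

#define STOP_STAT_INC(reason) (_stop_reason_counts[(reason)]++)
```

Per-offender counts for unsupported opcodes could use a per-opcode array instead of the single STOP_UNSUPPORTED_OPCODE bucket.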

@gvanrossum, any other stats we might like to see?

@gvanrossum
Member

Those lists look pretty exhaustive. I’d also count anything that I deemed worthy of a DPRINTF call.

@mdboom
Contributor

mdboom commented Sep 25, 2023

Maybe we should rework this to not ignore new counters?

Seems plausible. Let's coordinate on how to get this done -- I'm happy to take this on as I have a bit more time these days.

@mdboom
Contributor

mdboom commented Sep 26, 2023

I have a PR for the basics up, and then I will tackle some of the other suggestions in smaller chunks.

Some questions:

  • Currently the Tier 2 interpreter affects the results for the Tier 1 interpreter by virtue of calling STAT_INC. Is that just by accident of how we got here? I would assume we want to have completely separate opcode execution counts for the Tier 1 and Tier 2 interpreters, but thought I should confirm.

  • "A histogram of uops executed per trace" -- I assume this is a histogram of the number of uops executed per trace, not a count-by-type-of-uop-per-trace?

@gvanrossum
Member

  • Currently the Tier 2 interpreter affects the results for the Tier 1 interpreter by virtue of calling STAT_INC. Is that just by accident of how we got here?

I think it would be more useful to have separate counters per tier. But it would be somewhat complex to do that -- the header files that define them would have to check for TIER_ONE and TIER_TWO (only one of these should be defined).
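
A rough sketch of the kind of per-tier split being described, assuming hypothetical counter arrays and a hypothetical macro name (the real STAT_INC macro takes different arguments; this only illustrates the #if-based dispatch):

```c
/* Hypothetical sketch: route a counter increment to a per-tier array
 * depending on which of TIER_ONE / TIER_TWO is defined when the
 * interpreter body is compiled. All names here are made up. */
#include <stdint.h>

extern uint64_t _tier1_counts[256];
extern uint64_t _tier2_counts[512];

#if defined(TIER_ONE) && defined(TIER_TWO)
#  error "define only one of TIER_ONE and TIER_TWO"
#elif defined(TIER_ONE)
#  define TIER_STAT_INC(op) (_tier1_counts[(op)]++)
#elif defined(TIER_TWO)
#  define TIER_STAT_INC(op) (_tier2_counts[(op)]++)
#endif
```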

"A histogram of uops executed per trace" [...]

Sounds like that question is for @brandtbucher.

@brandtbucher
Member

brandtbucher commented Sep 27, 2023

I assume this is a histogram of the number of uops executed per trace, not a count-by-type-of-uop-per-trace?

Yeah, we want to see the distribution in number of uops executed per entry into tier two.

That's helpful because we can execute many uops in traces that are statically short (if they close a hot loop) and few uops in traces that are statically long (if we deopt quickly). We want to optimize for "uops executed in tier two before deopting", not necessarily "trace projection length".

To ease implementation, it's fine to bucket these. The distribution will probably be heavily skewed towards small numbers (since we tend to enter and exit "bad" traces more often), so maybe powers of ten or two would work best?
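
A minimal sketch of the power-of-two bucketing suggested above (hypothetical names, not the actual implementation):

```c
/* Hypothetical sketch: histogram of uops executed per entry into tier 2,
 * bucketed by powers of two. Bucket i roughly covers 2**i .. 2**(i+1)-1
 * uops (bucket 0 also catches zero). */
#include <stdint.h>

#define UOP_HIST_BUCKETS 32

static uint64_t _uops_per_trace_hist[UOP_HIST_BUCKETS];

static void
record_uops_executed(uint64_t n)
{
    int bucket = 0;
    while (n >>= 1) {
        bucket++;
    }
    if (bucket >= UOP_HIST_BUCKETS) {
        bucket = UOP_HIST_BUCKETS - 1;
    }
    _uops_per_trace_hist[bucket]++;
}
```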

mdboom added a commit to mdboom/cpython that referenced this issue Oct 5, 2023
gvanrossum pushed a commit that referenced this issue Oct 31, 2023
This keeps a separate 'miss' counter for each micro-opcode, incremented whenever a guard uop takes a deoptimization side exit.
@markshannon
Member Author

This is all working nicely now.
