Add GC performance counters #1099

Closed
benmwatson opened this issue Dec 20, 2019 · 5 comments

@benmwatson
Member

It would be great to be able to monitor more detailed GC performance stats out of the box, not just raw count or the vague % Time in GC, but specifics such as:

  • average gen0/1/2 time (from thread pause to thread resume)
  • average time between GCs, both between GCs of the same generation (e.g., gen0 to gen0 time) and between GCs of any generation.

min/max/P95 would be great as well, but might be pushing it.

Currently, you can calculate these yourself by listening to ETW events, but some simple counters could replace 95% of the need for that.
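As a rough illustration of the statistics requested above, here is a minimal Python sketch. It assumes GC events have already been collected (for instance by an ETW or EventPipe listener) into hypothetical `(generation, pause_start_ms, pause_end_ms)` tuples. The record shape and function names are assumptions for illustration, not an existing runtime API, and "time between GCs" is measured start to start here, which is only one possible definition.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical GC records: (generation, pause_start_ms, pause_end_ms),
# sorted by pause_start_ms. The shape is an assumption for illustration.
events = [
    (0, 100.0, 102.0),
    (0, 200.0, 203.0),
    (1, 350.0, 356.0),
    (0, 500.0, 501.0),
    (2, 900.0, 930.0),
]


def average_pause_by_gen(events):
    """Average pause duration (end - start) per generation."""
    durations = defaultdict(list)
    for gen, start, end in events:
        durations[gen].append(end - start)
    return {gen: mean(values) for gen, values in durations.items()}


def average_gap(starts):
    """Average time between consecutive start timestamps, or None if fewer than two."""
    gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
    return mean(gaps) if gaps else None


def average_gap_by_gen(events):
    """Average time between GCs of the same generation (e.g. gen0 to gen0)."""
    starts = defaultdict(list)
    for gen, start, _end in events:
        starts[gen].append(start)
    return {gen: average_gap(s) for gen, s in starts.items()}


pause_avg = average_pause_by_gen(events)
same_gen_gap = average_gap_by_gen(events)
any_gen_gap = average_gap([start for _gen, start, _end in events])
```

A generation with only one recorded GC has no gap to average, so the sketch reports `None` for it rather than guessing a value.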

@Dotnet-GitSync-Bot added the area-GC-coreclr and untriaged labels Dec 20, 2019
@sywhang self-assigned this Jan 3, 2020
@sywhang
Contributor

sywhang commented Jan 3, 2020

There are some additional GC counters that were requested from elsewhere:

  • SOH alloc rate
  • LOH alloc rate

Are any of these of interest to you @benmwatson?
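An allocation rate counter of this kind boils down to the change in a cumulative byte count divided by the elapsed time. A minimal sketch, assuming hypothetical samples of cumulative SOH and LOH allocated bytes taken at known timestamps (the sample format and names are illustrative, not a runtime API):

```python
from dataclasses import dataclass


@dataclass
class AllocSample:
    """Hypothetical snapshot of cumulative allocated bytes at a point in time."""
    time_s: float
    soh_bytes: int
    loh_bytes: int


def alloc_rates(prev: AllocSample, curr: AllocSample):
    """Return (SOH bytes/s, LOH bytes/s) between two samples."""
    elapsed = curr.time_s - prev.time_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    soh_rate = (curr.soh_bytes - prev.soh_bytes) / elapsed
    loh_rate = (curr.loh_bytes - prev.loh_bytes) / elapsed
    return soh_rate, loh_rate


first = AllocSample(time_s=10.0, soh_bytes=1_000_000, loh_bytes=200_000)
second = AllocSample(time_s=12.0, soh_bytes=5_000_000, loh_bytes=400_000)
soh_rate, loh_rate = alloc_rates(first, second)
```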

> average gen0/1/2 time (from thread pause to thread resume)

The interval from thread pause to thread resume is not the same as the interval from gen0 GC start to gen0 GC end. Is the timing from GC start to GC end not sufficient? (This is what's reported via ETW/EventPipe events.)

Also, I'm not sure I can add all of these GC counters, since the cost of computing them adds up. I'll have to do a perf test.

> min/max/P95 would be great as well, but might be pushing it.

This may or may not be useful: the interval at which these counters are reported may be shorter than the interval between GCs, in which case the percentiles would not tell you much.

Most of the GC counters report values between GCs. That is, the % time in GC isn't the time spent in GC since the last counter value was reported; it is the % time in GC as of the last GC that was reported. The main reason is that we simply don't have anything to report unless a GC has happened.
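To make the "values between GCs" behavior concrete, here is a minimal sketch. It assumes, purely for illustration and not as a description of the runtime's actual code, that % time in GC is recomputed when a GC ends, as the GC's duration divided by the time elapsed since the previous GC ended, and that a counter poll simply returns the most recently computed value. The class and method names are hypothetical.

```python
class TimeInGcCounter:
    """Illustrative model: the value only changes when a GC completes."""

    def __init__(self, start_ms=0.0):
        self._last_gc_end_ms = start_ms
        self._percent = 0.0  # nothing to report until a GC has happened

    def on_gc(self, gc_start_ms, gc_end_ms):
        """Recompute % time in GC over the window since the previous GC ended."""
        window = gc_end_ms - self._last_gc_end_ms
        if window > 0:
            self._percent = 100.0 * (gc_end_ms - gc_start_ms) / window
        self._last_gc_end_ms = gc_end_ms

    def poll(self):
        """A counter report: returns the last computed value, however old."""
        return self._percent


counter = TimeInGcCounter()
counter.on_gc(gc_start_ms=90.0, gc_end_ms=100.0)   # 10 ms of GC in a 100 ms window
readings = [counter.poll() for _ in range(3)]       # three polls, no GC in between
counter.on_gc(gc_start_ms=495.0, gc_end_ms=500.0)  # 5 ms of GC in a 400 ms window
readings.append(counter.poll())
```

In this model, consecutive polls with no GC in between return the same value, which is also why a percentile computed per reporting interval would often be based on zero or one sample.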

@benmwatson
Member Author

Yes, both SOH and LOH alloc rates would be great to have.

I'm thinking that thread pause/resume is the better boundary to report because that's measuring the actual impact on the program. However, most of our analysis is comparative, so as long as you pick a consistent metric, I'm fine with either.

@tommcdon removed the untriaged label Jan 31, 2020
@jeffschwMSFT added this to the 5.0 milestone Feb 26, 2020
@mikem8361 added the p1 label Mar 7, 2020
@tommcdon
Member

Dependent on #34648

@tommcdon
Member

Design doc: #39800

@sywhang
Contributor

sywhang commented Jul 31, 2020

Closing as discussed in #39800

@sywhang closed this as completed Jul 31, 2020
@ghost locked as resolved and limited conversation to collaborators Dec 11, 2020

7 participants