Nan collection in counter buffer #267
Comments
As far as I'm concerned, we can delete that code path. Neither Google nor a search in the Linux kernel code finds anything regarding such an issue ever existing and
Given that all information points to it being fixed for at least 7 years (the git-preserved history of lo2s), I tend to agree. But at the same time I have to admit that I am utterly curious what it is/was.
A code path could lead to a NaN if diff_running is 0, as multiplying 0 with +/- infinity results in a NaN. This commit solves this problem by just deleting that code path, as there is no evidence that the bug it addresses has existed in recent times, if ever. This fixes #267
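The arithmetic described in that commit message can be reproduced in isolation. The sketch below is not the lo2s code: `scale_unguarded` and `accumulate` are hypothetical helpers, and it assumes that a counter delta is extrapolated by the ratio `diff_enabled / diff_running` using IEEE 754 doubles.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not the lo2s implementation: extrapolates a counter
// delta by the ratio of enabled time to running time, without any guard.
double scale_unguarded(std::uint64_t diff_value, std::uint64_t diff_enabled,
                       std::uint64_t diff_running)
{
    // With diff_running == 0 and diff_enabled > 0 the ratio is +infinity
    // (assuming IEEE 754 doubles).
    double ratio = static_cast<double>(diff_enabled) / static_cast<double>(diff_running);
    // infinity * 0.0 is NaN; infinity * a nonzero value stays infinity.
    return ratio * static_cast<double>(diff_value);
}

// Adds one scaled delta to a running total. Once a NaN has been added,
// every later sum is NaN as well.
double accumulate(double total, double scaled_delta)
{
    return total + scaled_delta;
}
```

Calling `scale_unguarded(0, 1000, 0)` yields NaN under these assumptions, and a total that has absorbed that value stays NaN no matter how many valid deltas follow, which would match a metric that never reverts to a valid state.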
I encountered a trace in which a collected metric `uncore_clock/clockticks/` became NaN at some point and never reverted to a valid state. After a brief look at the code, it appears that `CounterBuffer`'s state could become NaN in case `diff_enabled > diff_running` and `diff_running == 0`. We need to avoid those cases. Note that this happened on a kernel, 5.19.1. Not sure what is special with `time_enabled`/`time_running` for PMU counters - and not sure if that "swap" bug is still in effect. Maybe we can also simplify the code now and only consider non-broken kernels? In any case there should be logging / handling of cases that result in NaN.
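One possible shape for the logging and handling requested here is to skip and report any interval whose scaled value would not be finite. This is only a sketch under assumptions: `GuardedAccumulator` and its `add` method are hypothetical names, not part of lo2s, and the scaling by `diff_enabled / diff_running` is assumed rather than taken from the actual `CounterBuffer` code.

```cpp
#include <cmath>
#include <cstdint>
#include <iostream>

// Hypothetical accumulator, not the lo2s CounterBuffer: keeps a running
// total of scaled counter deltas and refuses to absorb non-finite values.
struct GuardedAccumulator
{
    double total = 0.0;

    // Returns true if the delta was added, false if it was skipped.
    bool add(std::uint64_t diff_value, std::uint64_t diff_enabled,
             std::uint64_t diff_running)
    {
        if (diff_running == 0)
        {
            // The counter did not run in this interval, so there is
            // nothing to extrapolate from. Log and keep the old total.
            std::cerr << "skipping interval: diff_running == 0 (diff_enabled="
                      << diff_enabled << ")\n";
            return false;
        }

        double scaled = static_cast<double>(diff_enabled) /
                        static_cast<double>(diff_running) *
                        static_cast<double>(diff_value);

        if (!std::isfinite(scaled))
        {
            // Defensive check for any remaining NaN or infinity.
            std::cerr << "skipping interval: non-finite scaled value\n";
            return false;
        }

        total += scaled;
        return true;
    }
};
```

Skipping the interval keeps the accumulated value valid at the cost of under-counting. Whether that trade-off is acceptable, or whether such intervals should be surfaced differently, is a design decision for the maintainers.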