-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-4346: Add metrics for informer #129160
base: master
Are you sure you want to change the base?
Conversation
Please note that we're already in Test Freeze for the Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Wed Dec 11 12:08:11 UTC 2024. |
Hi @xigang. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
p.metrics.processDuration.Observe(time.Since(startTime).Seconds()) | ||
//TODO: This requires implementing Len() and Capacity() for ring growing | ||
// p.metrics.numberOfPendingNotifications.Set(float64(p.pendingNotifications.Len())) | ||
// p.metrics.sizeOfRingGrowing.Set(float64(p.pendingNotifications.Capacity())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to wait for the Len()
and Capacity()
methods in the ring growing package to be merged.
PR: kubernetes/utils#321
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this single-threaded? (is calling Len and Capacity independently and not under lock safe here, given the pendingNotifications
is not thread-safe?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, pendingNotifications
is not thread-safe. The pop()
and run()
goroutines will concurrently read and write. We can add a pendingNotificationsLock sync.RWMutex
here.
It can be fixed as follows:
p.pendingNotificationsLock.RLock()
length := float64(p.pendingNotifications.Len())
capacity := float64(p.pendingNotifications.Capacity())
p.pendingNotificationsLock.RUnlock()
p.metrics.numberOfPendingNotifications.Set(length)
p.metrics.sizeOfRingGrowing.Set(capacity)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. but it requires the ring buffer PR to be merged.
/sig api-machinery |
for sig-instrumentation review /assign |
/cc @richabanker |
@@ -268,6 +273,11 @@ func NewReflectorWithOptions(lw ListerWatcher, expectedType interface{}, store S | |||
return r | |||
} | |||
|
|||
func makeValidPromethusMetricName(in string) string { | |||
// this isn't perfect, but it removes our common characters | |||
return strings.NewReplacer("/", "_", ".", "_", "-", "_").Replace(in) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of dropping specific bad characters, shouldn't this be inverted and replace any character not in the set of allowed characters for valid prometheus names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, The current approach of replacing specific characters is less robust than enforcing the Prometheus metric naming rules directly. According to Prometheus documentation, metric names must match the regex [a-zA-Z_:][a-zA-Z0-9_:]*
I'll adjust this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liggitt done
name: options.Name, | ||
name: options.Name, | ||
// we need this to be unique per process (some names are still the same)but obvious who it belongs to | ||
metrics: newReflectorMetrics(makeValidPromethusMetricName(fmt.Sprintf("reflector_"+options.Name+"_expectedType_"+reflect.TypeOf(expectedType).String()+"_%07d", rand.Intn(1000000)))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
won't the randomized suffix explode metrics cardinality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The randomized suffix can cause metric cardinality explosion, worsening Prometheus' storage and query performance. The randomized suffix can be removed here.
When the Name is not specified, Reflector will automatically generate a name by calling the naming.GetNameFromCallsite()
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liggitt done
p.metrics.processDuration.Observe(time.Since(startTime).Seconds()) | ||
//TODO: This requires implementing Len() and Capacity() for ring growing | ||
// p.metrics.numberOfPendingNotifications.Set(float64(p.pendingNotifications.Len())) | ||
// p.metrics.sizeOfRingGrowing.Set(float64(p.pendingNotifications.Capacity())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this single-threaded? (is calling Len and Capacity independently and not under lock safe here, given the pendingNotifications
is not thread-safe?)
5c4ff0e
to
07012c4
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: xigang The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
f352336
to
daa45ba
Compare
@liggitt The issues have been fixed. thanks.😄 |
Signed-off-by: xigang <wangxigang2014@gmail.com>
daa45ba
to
a680145
Compare
/assign @deads2k |
/priority important-soon |
/cc @deads2k @dgrisonnet @wojtek-t @richabanker PTAL. thanks. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
KEP-4346
https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/4346-informer-metrics
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: