add pause metric in prometheus for scaledobject #4446
1213fad to ee44129
Can you open a PR for our docs on kedacore/keda-docs please?
@JorTurFer Hey, can you please review again?
dc1278d to 759a45a
LGTM
Your latest commit isn't signed, could you fix it? https://github.com/kedacore/keda/blob/main/CONTRIBUTING.md#i-didnt-sign-my-commit-now-what
And there is an error in the code: https://github.com/kedacore/keda/actions/runs/4790038051/jobs/8518637396?pr=4446#step:11:20
We still need the DCO fixed 🙏
Signed-off-by: Elad Motola <eladmotola95@gmail.com>
0b3de51 to a3ae5e2
@JorTurFer I fixed everything 😁
/run-e2e prometheus
/run-e2e prometheus
Signed-off-by: Elad Motola <eladmotola95@gmail.com> Signed-off-by: eladmotola <43670376+eladmotola@users.noreply.github.com>
/run-e2e prometheus
I can see some inconsistencies in the code:
    scaledObjectPausedTotal = prometheus.NewGaugeVec(
        prometheus.GaugeOpts{
            Namespace: DefaultPromMetricsNamespace,
            Subsystem: "scaled_object",
            Name:      "paused",
            Help:      "Total number of paused scaled_objects",
        },
        []string{"namespace", "scaledObject"},
    )
vs
    if _, ok := family["keda_scaleobject_pause"]; ok {
        t.Errorf("metric should not be available")
    }
etc.
So what is the intention behind this PR? Does it report the total number of ScaledObjects that are in a paused state?
I think it would be great to add a metric that actually shows whether a specific ScaledObject is paused or not. At least that was the idea behind #4430.
I think a metric similar to keda_scaler_activity, but at the ScaledObject level rather than the Scaler level, i.e. keda_scaled_object_paused, should tell us whether a specific SO is paused or not.
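The name mismatch pointed out above can be made concrete. This is a minimal sketch, assuming DefaultPromMetricsNamespace is "keda" (which the suggested name keda_scaled_object_paused implies) and that the exported metric name is the namespace, subsystem and name joined with underscores. fqName is a hypothetical stand-in for that joining, not code from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// fqName is a hypothetical helper that joins the non-empty parts of a metric
// name with underscores, mimicking how the exported name is derived from
// Namespace, Subsystem and Name.
func fqName(namespace, subsystem, name string) string {
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// "keda" is an assumed value for DefaultPromMetricsNamespace.
	registered := fqName("keda", "scaled_object", "paused")
	checked := "keda_scaleobject_pause" // the key the test looks up

	fmt.Println(registered)            // keda_scaled_object_paused
	fmt.Println(registered == checked) // false
}
```

Under these assumptions the test looks up a key that the gauge would never be exported under, so a "metric should not be available" assertion on that key would pass regardless of the pause state.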
Oh wow, I missed that...
Signed-off-by: Elad Motola <eladmotola95@gmail.com> Signed-off-by: eladmotola <43670376+eladmotola@users.noreply.github.com>
Why do you think that this PR doesn't do it? I mean, the metric is updated on each reconciliation based on whether it's paused or not.
I am not saying it doesn't do that. It was just super confusing given the metric name mismatch and also this: Help: "Total number of paused scaled_objects". I only briefly went through the code; I will do a proper review later :)
Hey, can you trigger the e2e tests?
/run-e2e prometheus
The branch was merged from main
/run-e2e prometheus
This PR should be rebased once #4470 is merged.
@@ -204,3 +215,9 @@ func DecrementCRDTotal(crdType, namespace string) {
    crdTotalsGaugeVec.WithLabelValues(crdType, namespace).Dec()
}

func RecordScalerPaused(namespace string, scaledObject string, value float64) {
Could you please rewrite this function to accept bool instead of float64, the same way RecordScalerActive() is written?
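A minimal sketch of what the bool-taking variant could look like. The gauge here is a hypothetical in-memory map standing in for the labelled Prometheus gauge, so the example stays self-contained; only the bool-to-0/1 conversion is the point.

```go
package main

import "fmt"

// gauge is a hypothetical in-memory stand-in for the labelled gauge,
// keyed by "namespace/scaledObject".
var gauge = map[string]float64{}

// RecordScalerPaused takes a bool and converts it to the 0/1 value stored
// in the gauge, so callers never pass raw floats.
func RecordScalerPaused(namespace, scaledObject string, paused bool) {
	value := 0.0
	if paused {
		value = 1
	}
	gauge[namespace+"/"+scaledObject] = value
}

func main() {
	RecordScalerPaused("default", "my-so", true)
	fmt.Println(gauge["default/my-so"]) // 1
	RecordScalerPaused("default", "my-so", false)
	fmt.Println(gauge["default/my-so"]) // 0
}
```

Keeping the conversion inside the function means the reconciler can pass its pause check straight through, and no caller can record a value other than 0 or 1.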
I don't mind doing it. I just can't figure out why the tests fail.
    Namespace: DefaultPromMetricsNamespace,
    Subsystem: "scaled_object",
    Name:      "paused",
    Help:      "Total number of paused scaled_objects",
The help message is wrong.
Any chance that one of you guys can see why what I did is wrong?
/run-e2e prometheus
KubectlApplyWithTemplate(t, data, "deploymentForPausedTemplate", deploymentForPausedTemplate)
KubectlApplyWithTemplate(t, data, "scaledObjectPausedTemplate", scaledObjectPausedTemplate)

family = fetchAndParsePrometheusMetrics(t, fmt.Sprintf("curl --insecure %s", kedaOperatorPrometheusURL))
I think that the metric is checked too fast and KEDA doesn't have enough time to process the ScaledObject. I guess that you can introduce a sleep here (5 seconds should be enough), or maybe deploy these resources at the beginning of the test file instead of inside the test case.
Any update? We aim to cut a release next week.
Hey,
Hi @zroubalik I want to finish up this PR.
@eladmotola we will continue in this PR - @geoffrey1330 volunteered to do so.
Done in #5045
Fixes #
Relates to #4430
Relates to kedacore/keda-docs#1111