Subsystem metrics for task manager #12235
Merged: fosterseth merged 1 commit into ansible:devel from fosterseth:subsystem_metrics_task_manager on Jun 14, 2022
kdelee reviewed on May 16, 2022
kdelee reviewed on May 16, 2022
kdelee approved these changes on May 16, 2022
fosterseth force-pushed the subsystem_metrics_task_manager branch from ff74e53 to ce0b28e on May 30, 2022 19:24
fosterseth changed the title from "[wip] Subsystem metrics for task manager" to "Subsystem metrics for task manager" on May 31, 2022
fosterseth force-pushed the subsystem_metrics_task_manager branch from 3e649e6 to 55d0c54 on May 31, 2022 15:10
fosterseth commented on May 31, 2022
fosterseth force-pushed the subsystem_metrics_task_manager branch from 09bc7d6 to 83c1d7b on June 1, 2022 21:49
kdelee reviewed on Jun 2, 2022
fosterseth force-pushed the subsystem_metrics_task_manager branch 2 times, most recently from e262f6a to 26d4e36 on June 9, 2022 18:23
@fosterseth @AlanCoding what's between this and merging?
fosterseth force-pushed the subsystem_metrics_task_manager branch from 26d4e36 to 3a2d96c on June 13, 2022 18:47
AlanCoding reviewed on Jun 14, 2022
    def record_aggregate_metrics_and_exit(self, *args):
        self.record_aggregate_metrics()
        sys.exit(1)
What does this sys.exit(1) do to the transaction? I think it rolls it back, right? I assume that would be the same as the current behavior. Right? Probably.
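On the question above: sys.exit(1) raises SystemExit, which unwinds through any enclosing with block like any other exception. The sketch below uses a hypothetical FakeTransaction context manager, not the real database transaction handling, to show how a manager that commits only on a clean exit would see it. Whether the task manager's actual transaction behaves this way is an assumption here and is not confirmed in this PR.

```python
import sys


class FakeTransaction:
    """Hypothetical stand-in for a transaction context manager (not a real DB API)."""

    def __init__(self):
        self.outcome = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Commit only on a clean exit; any exception, including SystemExit,
        # is treated as a reason to roll back.
        self.outcome = "commit" if exc_type is None else "rollback"
        return False  # do not swallow the exception


def run_and_exit(txn):
    with txn:
        sys.exit(1)  # raises SystemExit inside the block


txn = FakeTransaction()
try:
    run_and_exit(txn)
except SystemExit as e:
    exit_code = e.code  # caught here only so the demo keeps running
```

In this toy model the block exits with an exception, so the manager takes its rollback path and the exit code is preserved.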
AlanCoding approved these changes on Jun 14, 2022
fosterseth force-pushed the subsystem_metrics_task_manager branch from 0046c4d to 2f82b75 on June 14, 2022 15:00
SUMMARY
Adds a handful of subsystem metrics for the task manager. The metrics track data from the most recent task manager cycle only, and they are saved only at the very end of the task manager schedule() call.
A consequence is that useful task manager data may be quickly overwritten before Prometheus can scrape the endpoint. Imagine a timeline in which a task manager cycle that does a lot of processing is followed, before the next scrape, by a cycle that runs 0 jobs. Prometheus would have missed the first task manager cycle and would only "see" the data from the cycle that ran 0 jobs.
To combat this, the task manager tracks each time it records metrics. If the metrics were last written less than 15s ago (settings.SUBSYSTEM_METRICS_TASK_MANAGER_RECORD_INTERVAL), the task manager will not record metrics to redis. This gives Prometheus enough time to scrape the metrics endpoint and capture a snapshot of the task manager run.
ISSUE TYPE
COMPONENT NAME
AWX VERSION
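As a rough illustration of the record interval described in the SUMMARY, the sketch below skips a write when the previous one happened less than 15 seconds earlier. The ThrottledRecorder class, its store dict (standing in for redis), the injectable clock, and the tasks_started metric name are all hypothetical and are not taken from the AWX code in this PR.

```python
import time


class ThrottledRecorder:
    """Hypothetical sketch: skip writes that come too soon after the last one."""

    def __init__(self, record_interval=15.0, clock=time.monotonic):
        self.record_interval = record_interval
        self.clock = clock
        self.last_recorded = None
        self.store = {}  # stands in for redis

    def record(self, metrics):
        now = self.clock()
        too_soon = (
            self.last_recorded is not None
            and now - self.last_recorded < self.record_interval
        )
        if too_soon:
            return False  # keep the previous snapshot so a scraper can still see it
        self.store = dict(metrics)
        self.last_recorded = now
        return True


# Drive the recorder with a fake clock so the example is deterministic.
fake_now = [0.0]
recorder = ThrottledRecorder(record_interval=15.0, clock=lambda: fake_now[0])

first = recorder.record({"tasks_started": 40})   # busy cycle, recorded
fake_now[0] = 5.0
second = recorder.record({"tasks_started": 0})   # idle cycle 5s later, skipped
snapshot_after_second = dict(recorder.store)
fake_now[0] = 20.0
third = recorder.record({"tasks_started": 3})    # 20s after the first, recorded
```

The idle cycle at 5 seconds does not overwrite the busy cycle's numbers, which is the behavior the interval is meant to provide. The trade-off is that cycles falling inside the interval are never recorded.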