Worker level task metrics #12446
@@ -63,9 +63,15 @@
"task/pending/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
"task/waiting/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },

"worker/task/failed/count" : { "dimensions" : ["category", "vesion"], "type" : "count" },
vesion -> version
ahhh dang lol. Good catch
Fixed
Added a new monitor, WorkerTaskCountStatsMonitor, that allows each MiddleManager worker to report metrics for successful and failed tasks and for task slot usage. The existing TaskCountStatsMonitor also reports task metrics, but it runs on the Overlord, so task metric data for individual workers is lost; hence this monitor. This monitor is only supported on the MiddleManager NodeRole.

Also fixes an inconsistency in the name of the existing task metric for tracking task slot usage.
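
As a rough sketch of how a MiddleManager might opt in to the new monitor: this assumes it is registered through Druid's standard druid.monitoring.monitors property, and that the class lives in the org.apache.druid.server.metrics package alongside TaskCountStatsMonitor. The package name is an assumption and is not stated in this PR text.

```properties
# MiddleManager runtime.properties (sketch only).
# The fully qualified class name below is assumed, not taken from this PR.
druid.monitoring.monitors=["org.apache.druid.server.metrics.WorkerTaskCountStatsMonitor"]
```

Since the description says the monitor is only supported on the MiddleManager NodeRole, the entry would go in the MiddleManager configuration rather than the Overlord's.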
This PR has: