-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for tagged metrics #3147
Conversation
6df1fcc
to
f350355
Compare
client/config/config.go
Outdated
@@ -179,6 +179,14 @@ type Config struct { | |||
// NoHostUUID disables using the host's UUID and will force generation of a | |||
// random UUID. | |||
NoHostUUID bool | |||
|
|||
// DisableTaggedMetrics determines whether metrics will be displayed via a | |||
// key/value/tag format, or simply a key/value format |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the parsing of the config in the agent be a separate PR?
client/client.go
Outdated
metrics.SetGauge([]string{"client", "host", "memory", nodeID, "available"}, float32(hStats.Memory.Available)) | ||
metrics.SetGauge([]string{"client", "host", "memory", nodeID, "used"}, float32(hStats.Memory.Used)) | ||
metrics.SetGauge([]string{"client", "host", "memory", nodeID, "free"}, float32(hStats.Memory.Free)) | ||
hStats := c.hostStatsCollector.Stats() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call these in the emitHostStats and pass it in to all the methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was considering creating a separate ClientMetrics object and setting hStats, nodeID, etc as fields. What do you think about that, or should this logic be kept in Client?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe if we find that we add more than just the stats and nodeID but for just those two I don't think it is worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Base labels could also be a field- but agree, we can not do it for now and keep it as an option for later.
client/task_runner.go
Outdated
if ru.ResourceUsage.MemoryStats != nil && r.config.PublishAllocationMetrics { | ||
func (r *TaskRunner) setGaugeForMemory(ru *cstructs.TaskResourceUsage) { | ||
if !r.config.DisableTaggedMetrics { | ||
labels := []metrics.Label{{"job", r.alloc.Job.Name}, {"task_group", r.alloc.TaskGroup}, {"alloc_id", r.alloc.ID}, {"task", r.task.Name}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a field to the task runner struct that holds these baseLabels and create it once when creating the task runner. Avoid making a bunch of new arrays every time
client/client.go
Outdated
|
||
metrics.SetGauge([]string{"uptime"}, float32(hStats.Uptime)) | ||
if !c.config.DisableTaggedMetrics { | ||
labels := []metrics.Label{{"node_id", nodeID}, {"datacenter", c.Node().Datacenter}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't the others have datacenter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create a base labels on the client as well and where addition labels are needed they can just append and create a new slice?
having region and dc in all metrics would be the bomb! |
client/client.go
Outdated
metrics.SetGauge([]string{"uptime"}, float32(hStats.Uptime)) | ||
func (c *Client) setGaugeForUptime(hStats *stats.HostStats) { | ||
if !c.config.DisableTaggedMetrics { | ||
labels := []metrics.Label{{"datacenter", c.Node().Datacenter}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not baseLabels?
client/client.go
Outdated
@@ -187,6 +191,8 @@ func NewClient(cfg *config.Config, consulCatalog consul.CatalogAPI, consulServic | |||
serversDiscoveredCh: make(chan struct{}), | |||
} | |||
|
|||
c.baseLabels = []metrics.Label{{"node_id", c.Node().ID}, {"datacenter", c.Node().Datacenter}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be possible to include region too? would make it easier to work with the metrics inside datadog and influx when you have one common metrics storage cross env
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including the region will take a bit more work and is outside the scope for this PR, but we will definitely keep this in mind for the future. Thanks for the suggestion!
79bbe58
to
7bacaff
Compare
client/client.go
Outdated
@@ -284,6 +288,10 @@ func NewClient(cfg *config.Config, consulCatalog consul.CatalogAPI, consulServic | |||
// Start collecting stats | |||
go c.emitStats() | |||
|
|||
// Assign labels at the latest possible moment so the information expected | |||
// is ready | |||
c.baseLabels = []metrics.Label{{Name: "node_id", Value: c.Node().ID}, {Name: "datacenter", Value: c.Node().Datacenter}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want this before you start the emitStats go routine right above it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If anything you could put this in emitStats itself before it starts the loop.
client/client.go
Outdated
@@ -284,6 +288,10 @@ func NewClient(cfg *config.Config, consulCatalog consul.CatalogAPI, consulServic | |||
// Start collecting stats | |||
go c.emitStats() | |||
|
|||
// Assign labels at the latest possible moment so the information expected | |||
// is ready | |||
c.baseLabels = []metrics.Label{{Name: "node_id", Value: c.Node().ID}, {Name: "datacenter", Value: c.Node().Datacenter}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If anything you could put this in emitStats itself before it starts the loop.
client/client.go
Outdated
metrics.SetGauge([]string{"client", "host", "memory", nodeID, "available"}, float32(hStats.Memory.Available)) | ||
metrics.SetGauge([]string{"client", "host", "memory", nodeID, "used"}, float32(hStats.Memory.Used)) | ||
metrics.SetGauge([]string{"client", "host", "memory", nodeID, "free"}, float32(hStats.Memory.Free)) | ||
func (c *Client) setGaugeForMemoryStats(nodeID string, hStats *stats.HostStats) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add comments on all the new functions
client/client.go
Outdated
c.setGaugeForUptime(hStats) | ||
c.setGaugeForCPUStats(nodeID, hStats) | ||
c.setGaugeForDiskStats(nodeID, hStats) | ||
c.setGaugeForAllocationStats(nodeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably shouldn't be guarded by c.config.PublishNodeMetrics
. This should be in emitClientMetrics
client/client_test.go
Outdated
} | ||
|
||
nodeID := client.Node().ID | ||
for _, e := range client.baseLabels { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assert length of base labels != 0
client/task_runner.go
Outdated
// emitStats emits resource usage stats of tasks to remote metrics collector | ||
// sinks | ||
func (r *TaskRunner) emitStats(ru *cstructs.TaskResourceUsage) { | ||
if ru.ResourceUsage.MemoryStats != nil && r.config.PublishAllocationMetrics { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (r *TaskRunner) emitStats(ru *cstructs.TaskResourceUsage) {
if !r.config.PublishAllocationMetrics {
return
}
if ru.ResourceUsage.MemoryStats != nil {
...
}
....
command/agent/config.go
Outdated
GCInodeUsageThreshold: 70, | ||
GCMaxAllocs: 50, | ||
NoHostUUID: helper.BoolToPtr(true), | ||
DisableTaggedMetrics: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unneeded since the default value for booleans is false.
@@ -1032,6 +1042,14 @@ func (a *ClientConfig) Merge(b *ClientConfig) *ClientConfig { | |||
result.NoHostUUID = b.NoHostUUID | |||
} | |||
|
|||
if b.DisableTaggedMetrics { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
9c32be1
to
681a3f3
Compare
f337712
to
ef4aef0
Compare
8f53f39
to
4a03f8f
Compare
4a03f8f
to
68686cd
Compare
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
After the HTTP endpoint for tagged metrics is merged, extensive QA will be useful.