-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Add stats to tasks #323
Conversation
@yosiat This is great work thanks! A few administrative things: Can you sing the CLA? and add an entry to the CHANGELOG about this change? Thanks I like the structure that you have added but I think it would be easy and valuable to expose more stats. First what stats are you after? I see two paths we could go down:
We can even do both.. And I think we should. I'll comment directly in the code where I think we could update what you have already. |
// Input nodes and their emitted points | ||
// where input nodes - are batch/stream node | ||
// for example: stream1 -> 12 | ||
Inputs map[string]int64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should update the struct to have both summary stats for the task and a map of all stats for the all nodes in the task.
maybe a structure like
type ExecutionStats struct {
TaskStats map[string]interface{} // stats name -> value i.e throughput -> 4.3 points /sec
NodeStats map[string]map[string]interface{} // node name -> stat name -> value i.e influxdb_out -> point_written -> 1000
}
Then use the statsMap that each node has to populate the node stats maps.
Thoughts?
@nathanielc Thanks! I already signed the CLA (https://influxdb.com/community/cla.html) and I will add an entry to the change log once I will finish the output nodes as well. About the structure of stats I will change it but before, I want to clarify the reason for this API - I want to expose general stats - "stats for glance" on all tasks, I don't want to extract all the data from the dot (for example: how much each node processed and how much time it took). I think it should contain:
But in the overall sense I agree this structure is more acceptable: type ExecutionStats struct {
TaskStats map[string]interface{}
NodeStats struct {
Input map[string]map[string]interface{}
Output map[string]map[string]interface{}
}
} I changed NodeStats to be struct that will contain only the Input and Output to note that "for all nodes stats - there is a dot graph, for general data (like input & output), this is the place. About the values - why this is Aside the structure, I don't know how to implement the output nodes stats (as I wrote in the issue), currently I think to fetch all output nodes by predicate: node that have parents and doesn't have child nodes. but I don't know how to expose the output stats -
|
@nathanielc any comment? do you want me to clarify my answer better? currently I am waiting to your response before proceeding on. |
@yosiat Sorry for the delay. I like the plan, stats at a glance is a good approach. I am a little confused about the
Agreed, the interface is so you can use int64 or float64 but should always be one or the other. |
@nathanielc it's ok :) var a = stream.from('a')
var b= stream.from('b')
a.join(b)
// do some alerting logic And output might be creating an alert and writing to influx What do you think about the output nodes implementation? what do you prefer - specific function or exposing the statsMap? |
Changed the ExecutionStats structure, waiting to @nathanielc to answer on how to implement the output nodes. |
@yosiat I think simpler is better here, since we are doing stats at a glance. // Stats about the execution of a task
type ExecutionStats struct {
// Stats global to the task
// throughput etc...
TaskStats map[string]interface{}
// Specific node stats,
// Emitted, Collected counts plus anything in statsMap.
NodeStats map[string]interface{}
} NodeStats will be populated by whatever is in statsMap using statsMap.Do and the add in Populating TaskStats will be a bit more work since it doesn't already have a framework in place. We should probably add a statsMap to ExecutingTask too if it make sense. Otherwise we can populate TaskStats by aggregating other stats from nodes. This should get all the stats we were looking for. |
@nathanielc So you want ExecutionStats to contain all nodes? If so, what is the difference between this and a Dot? |
@yosiat Its exposed via HTTP in JSON so you do not have to parse DOT since its not a good method for conveying numerical data. |
@nathanielc ok, I am doing this right now. NodeStats map[string]interface{} instead of: NodeStats map[string]map[string]interface{} Was it by mistake? |
@yosiat Yes is should be |
@nathanielc I finished. Here is the final result: {
Name: "cpu_alert",
Type: 0,
DBRPs: [{
db: "kapacitor_example",
rp: "default"
}],
Enabled: true,
Executing: true,
ExecutionStats: {
TaskStats: {
throuput: 0
},
NodeStats: {
alert2: {
alerts_triggered: "0",
avg_exec_time_ns: "0",
collected: 0,
emitted: 0
},
srcstream0: {
avg_exec_time_ns: "0",
collected: 0,
emitted: 0
},
stream1: {
avg_exec_time_ns: "0",
collected: 0,
emitted: 0
}
}
}
} Do you know how change statsMap representation to be numbers and not strings? |
Yes , have a look at https://github.com/influxdata/kapacitor/blob/master/global_stats.go#L132 Also quick typo fix |
@nathanielc done. |
@yosiat yes, please. |
@nathanielc done 👍 |
@@ -246,6 +250,30 @@ func (n *node) collectedCount() (count int64) { | |||
return | |||
} | |||
|
|||
func (n *node) emittedCount() (count int64) { | |||
for _, out := range n.outs { | |||
count += out.emittedCount() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to read count += out.collectedCount()
see comment in node.collectedCount
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain? I don't see any comment on node.collectedCount
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the comment is gone, what it said was:
node collected count is the sum of emitted counts of parent edges
The opposite is true for node emitted count:
node emitted count is the sum of collected counts of children edges
Edges can buffer so we care about the actual points this node has collected or emitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed 👍
ee88d5b
to
19d0ae1
Compare
@@ -246,6 +250,30 @@ func (n *node) collectedCount() (count int64) { | |||
return | |||
} | |||
|
|||
func (n *node) emittedCount() (count int64) { | |||
for _, out := range n.outs { | |||
count += out.collectedCount() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add those comments in so its not confusing for the next guy? Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nathanielc Just to make sure:
node collected count is the sum of emitted counts of parent edges
above collectedCount
node emitted count is the sum of collected counts of children edges
above emittedCount
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep
On Mar 18, 2016 11:55 AM, "Yosi Attias" notifications@github.com wrote:
In node.go
#323 (comment):@@ -246,6 +250,30 @@ func (n *node) collectedCount() (count int64) {
return
}+func (n *node) emittedCount() (count int64) {
- for _, out := range n.outs {
count += out.collectedCount()
Just to make sure:
node collected count is the sum of emitted counts of parent edges above
collectedCount
node emitted count is the sum of collected counts of children edges above
emittedCount?
—
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
https://github.com/influxdata/kapacitor/pull/323/files/19d0ae1ab9e484ab4420b8f8901013506cb50800#r56696878
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done :)
@yosiat Thanks! merging... |
Add stats to executing tasks
@nathanielc I forgot to add a change log entry. |
I added no worries On Fri, Mar 18, 2016 at 1:58 PM, Yosi Attias notifications@github.com
Nathaniel Cook InfluxData.com http://influxdata.com/ |
See issue - #321
Currently I implemented the input nodes, I am waiting for feedback about this and a suggestion of how to implement the metrics fetching for the outputs.