Create system so that users can measure the performance of their task #215
Comments
+1 |
+1 |
@nhproject @qindj This is my current thinking: calculate the throughput per second for each node and add it both as an internal stat and in the output of the show command. Currently the output of the show command displays the node pipeline with counts for how many points have passed along each edge. Throughput would be added on each node and would look like this:
Which is valid ![Alt text](http://g.gravizo.com/g? Thoughts? Do you only care about the throughput of the root of the pipeline or each node? |
Another thought is to compute a single throughput for the entire task, and then compute average execution times for each node. That makes it apparent which node is the bottleneck in the DAG. Something like this:
![Alt text](http://g.gravizo.com/g? |
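To make the second proposal concrete, here is a minimal sketch of tracking one throughput figure for the whole task alongside an average execution time per node. It is an illustration only, not Kapacitor's implementation: `NodeStats`, `TaskStats`, and the node names in the usage note are hypothetical, and execution times are passed in explicitly rather than measured.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class NodeStats:
    """Running totals for one node in a task's DAG (hypothetical structure)."""
    points: int = 0
    total_exec_seconds: float = 0.0

    def record(self, exec_seconds: float) -> None:
        # Count one processed point and accumulate the time spent on it.
        self.points += 1
        self.total_exec_seconds += exec_seconds

    @property
    def avg_exec_seconds(self) -> float:
        # Average time per point; 0.0 if the node has seen no points yet.
        return self.total_exec_seconds / self.points if self.points else 0.0


@dataclass
class TaskStats:
    """Task-wide throughput plus per-node averages (hypothetical structure)."""
    nodes: Dict[str, NodeStats] = field(default_factory=dict)
    points_in: int = 0  # points that entered the root of the pipeline

    def record(self, node: str, exec_seconds: float) -> None:
        self.nodes.setdefault(node, NodeStats()).record(exec_seconds)

    def throughput(self, elapsed_seconds: float) -> float:
        # Points per second for the task as a whole.
        return self.points_in / elapsed_seconds if elapsed_seconds > 0 else 0.0

    def bottleneck(self) -> Optional[str]:
        # The node with the highest average execution time, if any.
        if not self.nodes:
            return None
        return max(self.nodes, key=lambda n: self.nodes[n].avg_exec_seconds)
```

For example, recording 100 points with 1 ms, 4 ms, and 2 ms spent in hypothetical `stream`, `window`, and `alert` nodes would make `bottleneck()` return `"window"`, while `throughput(10.0)` would report 10 points per second for the task.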
@nathanielc Just a question: will the average execution times differ if I have one tick file or if I have 100? |
@panda87 There should not be a difference whether you are running 1 task or 100s, but you might see one if you are hitting resource limits on your box. The important thing here is to expose the right actionable information, so that if having multiple tasks does slow things down, you will know it and be able to take appropriate action. |
But if I have 100 tick files, every data point received will go through each one, right? |
@nathanielc I think that getting the average execution time for each node (plus throughput for the entire task) is more informative. |
👍 @nhproject very important feedback |
@nathanielc It would be nice to have both "points processed" and "points processed per second", something like:
That way it will be parseable, and you can add more metrics later if you want |
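As a rough illustration of the "parseable" idea, the sketch below emits both a raw count and a rate as `key=value` pairs and reads them back. The layout, the key names `points_processed` and `points_per_second`, and both function names are assumptions made for this example; they are not the output format of Kapacitor's show command.

```python
from typing import Dict, Tuple


def format_node_stats(name: str, points: int, elapsed_seconds: float) -> str:
    """Render one node's stats as a space-separated key=value line (hypothetical format)."""
    rate = points / elapsed_seconds if elapsed_seconds > 0 else 0.0
    return f"{name} points_processed={points} points_per_second={rate:.2f}"


def parse_node_stats(line: str) -> Tuple[str, Dict[str, float]]:
    """Parse a line produced by format_node_stats back into a name and a metrics dict."""
    name, *pairs = line.split()
    metrics = {key: float(value) for key, value in (p.split("=", 1) for p in pairs)}
    return name, metrics
```

Because every metric is a self-describing pair, a new one could be appended to the line later without breaking a parser written this way.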
@nhproject @qindj @panda87 @yosiat I feel like #248 is ready to go. I plan to merge it tomorrow. Any last comments? |
Beyond benchmarks of static tasks, we need to make it easy for end users to directly measure the performance of their own tasks so they can provision resources appropriately.