Support connection buffer metrics between stmgr and instances #1695
The patch supports connection buffer metrics between stmgr and instances. It exports the connection buffer metrics in both # of packets and # of bytes.
Metric name format:
1. # of packets: __connection_buffer_by_intanceid/{instance-id}/packets
2. # of bytes: __connection_buffer_by_intanceid/{instance-id}/bytes
Examples:
__connection_buffer_by_intanceid/container_2_word_6/bytes
__connection_buffer_by_intanceid/container_2_word_6/packets
@maosongfu This looks good to me.
sp_int32 task_id = itr->second;
const sp_string& instance_id = instance_info_[task_id]->instance_->instance_id();
sp_int32 bytes = itr->first->getOutstandingBytes();
connection_buffer_metric_map_[instance_id]->scope("bytes")->record(bytes);
record() increases denominator_ by 1. Is the mean taken over the invocation count or over the 10-second period?
It is the sum of the recorded metric values divided by the invocation count.
After a long run, the mean becomes insensitive to recent changes.
For example, with numerator_=10*1000 and denominator_=1000, the mean is 10.
Then a pulse of 100 samples with value 20 raises the mean by only about 0.9:
numerator_=10*1000+20*100, denominator_=1000+100, so the mean is about 10.9.
Is that the intention?
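The arithmetic in this comment can be reproduced with a minimal sketch. `CumulativeMean` and `recordMany` are hypothetical names used only for illustration; the struct models a mean that is never reset, which is the behavior the comment is worried about, and it is not Heron's actual metric class.

```cpp
#include <cstdint>

// Hypothetical stand-in for a mean metric whose counters are never reset.
struct CumulativeMean {
  double numerator = 0.0;        // running sum of all recorded values
  std::int64_t denominator = 0;  // number of record() calls so far

  void record(double value) {
    numerator += value;
    denominator += 1;
  }

  // Mean over every sample recorded since construction; 0 if none.
  double mean() const {
    return denominator == 0 ? 0.0
                            : numerator / static_cast<double>(denominator);
  }
};

// Records `count` samples that all have the same `value`.
inline void recordMany(CumulativeMean& metric, double value, int count) {
  for (int i = 0; i < count; ++i) {
    metric.record(value);
  }
}
```

Recording 1000 samples of 10 and then 100 samples of 20 gives 12000 / 1100, roughly 10.9, so a doubling of the recent values moves the reported mean by less than 10 percent.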
Hey @huijunwu. What I'd like to have is the average queue length over the last N one-second samples. For example, if I want to check this every minute, then I need the average over 60 samples (one sample taken every second). Maybe I didn't understand the code correctly. Are you saying that this accumulates over time instead of doing the above?
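For illustration, the "average over the last N samples" requested here could look like the sketch below. `WindowedMean` is a hypothetical class, not part of the patch; it keeps the most recent `capacity` samples in a `std::deque` and drops the oldest one when the window is full.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical sliding-window mean over the most recent `capacity` samples.
class WindowedMean {
 public:
  explicit WindowedMean(std::size_t capacity) : capacity_(capacity) {}

  // Adds one sample and evicts the oldest if the window is over capacity.
  void addSample(double value) {
    if (capacity_ == 0) return;  // a zero-sized window holds nothing
    samples_.push_back(value);
    sum_ += value;
    if (samples_.size() > capacity_) {
      sum_ -= samples_.front();
      samples_.pop_front();
    }
  }

  // Mean of the samples currently in the window; 0 if the window is empty.
  double mean() const {
    return samples_.empty() ? 0.0
                            : sum_ / static_cast<double>(samples_.size());
  }

 private:
  std::size_t capacity_;
  std::deque<double> samples_;
  double sum_ = 0.0;
};
```

With a capacity of 60 and one sample per second, `mean()` would report the average queue length over the last minute, regardless of how long the process has been running.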
@huijunwu At the start of every metrics-collection cycle, the value is reset to 0. At the end of the cycle, the metric value is the sum of all values recorded during the cycle divided by the invocation count, i.e. the average of the values recorded during that cycle.
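The per-cycle behavior described in this reply can be sketched as follows. `ResettingMean` and `getAndReset` are hypothetical names chosen for this example; the sketch only mirrors the reset-then-average behavior described here and is not a copy of Heron's implementation.

```cpp
#include <cstdint>

// Hypothetical mean metric that is cleared at the end of every
// metrics-collection cycle.
struct ResettingMean {
  double numerator = 0.0;        // sum of values recorded in this cycle
  std::int64_t denominator = 0;  // number of record() calls in this cycle

  void record(double value) {
    numerator += value;
    denominator += 1;
  }

  // Returns the mean for the current cycle, then clears both counters so
  // the next cycle starts from zero.
  double getAndReset() {
    const double mean =
        denominator == 0 ? 0.0 : numerator / static_cast<double>(denominator);
    numerator = 0.0;
    denominator = 0;
    return mean;
  }
};
```

Because the counters are cleared on every export, a cycle of samples at 10 followed by a cycle of samples at 20 reports 10 and then 20, rather than a long-run blend of the two.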
I saw the reset to 0 in MeanMetric::GetAndReset() and the cycle length set in the config: 'heron.metrics.export.interval.sec: 60'.
+1
I see this is part of the 0.14.6 milestone. Any progress on this? And @maosongfu, should I update the viz code to reflect these metrics?
The patch supports connection buffer metrics between stmgr and instances.
It exports the connection buffer metrics in both # of packets and # of bytes.
It samples the metrics at an interval and exports the average value over each metrics-collection interval.
Metric name format:
__connection_buffer_by_intanceid/{instance-id}/packets
__connection_buffer_by_intanceid/{instance-id}/bytes
Examples:
__connection_buffer_by_intanceid/container_2_word_6/bytes
__connection_buffer_by_intanceid/container_2_word_6/packets
This is based on @congwang's work: #1686
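As a small illustration of the naming scheme above, the metric names can be assembled as shown below. The constant `kConnectionBufferPrefix` and the function `connectionBufferMetricName` are hypothetical helpers for this sketch; the prefix string is copied verbatim from the description, including its spelling.

```cpp
#include <string>

// Prefix copied verbatim from the PR description (spelling preserved).
const std::string kConnectionBufferPrefix = "__connection_buffer_by_intanceid/";

// Builds "<prefix>{instance-id}/{unit}", where unit is "packets" or "bytes".
// Hypothetical helper; the patch itself may build the name differently.
inline std::string connectionBufferMetricName(const std::string& instanceId,
                                              const std::string& unit) {
  return kConnectionBufferPrefix + instanceId + "/" + unit;
}
```

Calling `connectionBufferMetricName("container_2_word_6", "bytes")` yields the first example name listed in the description.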