Thanks @tripatti for creating this issue. I tried to reproduce it but without success. What environment are you running the router in? On Docker? I suspect the bug you are seeing might be caused by your OS's clock or something related.
We are running on Docker. One detail that stands out is that the subgraph-specific sum is much higher than the operation-specific sum, i.e. the total number of seconds recorded for the target subgraph is orders of magnitude higher than that recorded for the operation. In the example above the operation sum is 3.7 seconds; the subgraph output is truncated.
I just sampled production again and there the difference is 1000-fold. I do not think that would be caused by the OS clock?
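For reference, the comparison can be reproduced from a single /metrics scrape along these lines (a rough sketch; the endpoint URL and the `duration` substring filter are assumptions, not our exact tooling):

```python
# Sketch: scrape the router's Prometheus endpoint and print sum/count
# (and the implied average) for every duration histogram it exposes.
# The URL below is a placeholder; adjust it to your deployment.
import urllib.request

from prometheus_client.parser import text_string_to_metric_families

METRICS_URL = "http://localhost:8088/metrics"  # placeholder

text = urllib.request.urlopen(METRICS_URL).read().decode("utf-8")

for family in text_string_to_metric_families(text):
    if family.type != "histogram" or "duration" not in family.name:
        continue
    # Aggregate the _sum and _count samples across all label sets.
    total_sum = sum(s.value for s in family.samples if s.name.endswith("_sum"))
    total_count = sum(s.value for s in family.samples if s.name.endswith("_count"))
    avg = total_sum / total_count if total_count else 0.0
    print(f"{family.name}: sum={total_sum:.3f}s count={int(total_count)} avg={avg * 1e6:.1f}µs")
```

Comparing the operation-level histogram against the subgraph-level one this way is what shows the 1000-fold gap.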
Describe the bug
When looking at the Prometheus histogram exported by the router, the bucket distribution does not reflect actual request latencies: essentially every observation lands in the lowest bucket (le="0.001"), and the recorded sum is far too small for the number of requests counted.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The histogram should distribute observations across buckets based on the actual total operation latency, e.g.
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.001"} 10
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.005"} 100
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.015"} 200
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.05"} 300
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.1"} 400
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.2"} 400
.....
http_request_duration_seconds_sum{operation_name="someName",status="200"} ...
http_request_duration_seconds_count{operation_name="someName",status="200"} 1010
Output
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.001"} 421216
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.005"} 421217
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.015"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.05"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.1"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.2"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.3"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.4"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="0.5"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="1"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="5"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="10"} 421218
http_request_duration_seconds_bucket{operation_name="someName",status="200",le="+Inf"} 421218
http_request_duration_seconds_sum{operation_name="someName",status="200"} 3.7173161799999983
http_request_duration_seconds_count{operation_name="someName",status="200"} 421218
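To make the problem concrete, dividing the reported sum by the count gives an average of roughly 9 microseconds per request, which is why every observation falls into the le="0.001" bucket and is far below any realistic end-to-end request latency:

```python
# Quick check using the numbers from the output above.
duration_sum = 3.7173161799999983    # http_request_duration_seconds_sum
request_count = 421218               # http_request_duration_seconds_count
avg_seconds = duration_sum / request_count
print(f"{avg_seconds * 1e6:.1f} µs per request on average")  # ≈ 8.8 µs
```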
Our router (0.9.4) has this config: