promxy makes up datapoints by repeating the last dataset #606
Comments
I did some initial digging and am unable to reproduce this issue locally. From a quick look, the results are confusing to me for both promxy and prometheus. The query covers a range of ~45m with a step of 30s, which should yield about 90 datapoints. Promxy was returning 17 and prometheus 6 or 7. So it's a bit hard to say without the full data that promxy was seeing (either trace logs or a tcpdump). My guess is that the max isn't unique, so that it's putting multiple series together, but that's a really off-the-wall guess without more data.
Thanks for taking a first look! I've run the queries with a trace loglevel and got the following (now another pod, but still the same issue):
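As a quick check of the "about 90 datapoints" estimate, here is a minimal Go sketch that counts the evaluation timestamps for the `start`, `end` and `step` values from the curl command in the report. It assumes the range is evaluated at `start`, `start+step`, and so on, up to and including `end`; the names `countSteps` and `issueQuerySteps` are made up for illustration and are not part of promxy or Prometheus.

```go
package main

import (
	"fmt"
	"time"
)

// countSteps returns how many evaluation timestamps a range query covers
// when it is evaluated at start, start+step, ... up to and including end.
// (Hypothetical helper, not taken from promxy or Prometheus.)
func countSteps(start, end time.Time, step time.Duration) int {
	if step <= 0 || end.Before(start) {
		return 0
	}
	// Integer division drops the partial step at the end of the range;
	// the +1 accounts for the evaluation at start itself.
	return int(end.Sub(start)/step) + 1
}

// issueQuerySteps plugs in the start/end/step values from the curl command.
func issueQuerySteps() int {
	start, err := time.Parse(time.RFC3339Nano, "2023-07-14T06:39:53.152Z")
	if err != nil {
		panic(err)
	}
	end, err := time.Parse(time.RFC3339Nano, "2023-07-14T07:24:58.224Z")
	if err != nil {
		panic(err)
	}
	return countSteps(start, end, 30*time.Second)
}

func main() {
	fmt.Println(issueQuerySteps()) // prints 91
}
```

The range is 45m5.072s, so a series present for the whole window would have 91 points under this assumption. Both 17 and 6 or 7 are far below that, which suggests the series only exists for part of the range.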
And the result is
Looks to me like it's not receiving data points from any other cluster on that query... And BTW we're running
Sorry for the comment-update-mess. I decided it's better to run the curl command again and fetch the necessary information from the very same execution. The logs should definitely match the query.
Since we already did a query of a downstream (in some aggregate) we don't want to apply any LookbackDelta ranges to this (as it was already computed). This avoids the behavior where a response will "repeat" the last result seen for the duration of the LookbackDelta. Fixes #606
Thanks for the report here! With all the data, this is trivial to reproduce, which really speeds things up. So looking at this it seems that this is just another case of
Promxy does the same; the difference is that the dataset it applies this to has (usually) already had it applied downstream. In this case the default lookback delta is 5m, so PromQL in promxy decided that the last point it saw was okay to repeat, because it was still the most recent point within that window. For other selectors (e.g.
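The repeating effect described above can be sketched with a small, self-contained Go simulation. This is a simplified model, not promxy's or Prometheus's actual code: it assumes a selector returns the newest sample in the window `(ts - lookback, ts]`, it ignores staleness markers, and the names `sample`, `selectAt`, `countPoints` and `demo` are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// sample is a hypothetical (timestamp, value) pair; the timestamp is an
// offset from the start of the query to keep the example small.
type sample struct {
	t time.Duration
	v float64
}

// selectAt returns the newest sample with ts-lookback < sample.t <= ts,
// or ok=false if there is none. samples must be sorted by time.
func selectAt(samples []sample, ts, lookback time.Duration) (float64, bool) {
	for i := len(samples) - 1; i >= 0; i-- {
		s := samples[i]
		if s.t <= ts && s.t > ts-lookback {
			return s.v, true
		}
	}
	return 0, false
}

// countPoints evaluates the series at 0, step, 2*step, ... <= end and
// counts how many evaluation timestamps produce a value.
func countPoints(samples []sample, end, step, lookback time.Duration) int {
	n := 0
	for ts := time.Duration(0); ts <= end; ts += step {
		if _, ok := selectAt(samples, ts, lookback); ok {
			n++
		}
	}
	return n
}

// demo builds a series that only exists for the first 3 minutes of a 45m
// query (7 samples at 0s..180s) and counts points for two lookback windows.
func demo() (narrow, wide int) {
	step := 30 * time.Second
	var series []sample
	for ts := time.Duration(0); ts <= 3*time.Minute; ts += step {
		series = append(series, sample{t: ts, v: 1})
	}
	end := 45 * time.Minute
	narrow = countPoints(series, end, step, step)          // window of one step
	wide = countPoints(series, end, step, 5*time.Minute)   // 5m lookback delta
	return narrow, wide
}

func main() {
	narrow, wide := demo()
	fmt.Println(narrow, wide) // prints 7 16
}
```

With a window of one step only the 7 real samples come back. With a 5m window the last sample is returned for 9 further steps after the series ends. This is the kind of "made up" trailing datapoint the report describes when an already-evaluated downstream result is run through the lookback a second time.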
We noticed that promxy repeats the last datapoint when running simple queries like "max()" across a set of clusters.
Our setup looks like the following:
While promxy accesses each of the clusters, it only connects to a single prometheus instance per cluster, through an HTTP proxy configuration. Each k8s cluster contains only data from its own running pods.
The configuration looks like the following:
Now querying for data, either through promxy or from the prometheus instances directly through port-forwarding:
curl -v --get --data-urlencode 'query=max(container_memory_usage_bytes{id!="/",namespace=~"XXXX",container!="",container="XXXX",container!="POD",pod=~".*XXXXX"}) by (pod, container, cluster)' --data-urlencode "start=2023-07-14T06:39:53.152Z" --data-urlencode "end=2023-07-14T07:24:58.224Z" --data-urlencode "step=30s" 'http://localhost:9090/api/v1/query_range'
returns the following data through promxy:
But through prometheus1-1:
and through prometheus1-2:
As you can see, there are additional datapoints in the promxy output which shouldn't be there.
Am I holding it wrong somehow?
If I'm missing any data necessary to debug the issue, just let me know.
Thanks for your help in advance!