For our uses, we are trying to set up an asymmetric Prometheus/Thanos setup using Promxy as a datacentre aggregator.
Here is a simplified view of the setup we're aiming to achieve. The problem part is the blue line from thanos query to promxy. The remote_read API call works as of #351 but returns no data. Cutting out promxy and targeting one of the Prometheus instances (via its sidecar) directly works as expected and returns the relevant rows.
To replicate this, you'll need promxy and prometheus running, plus two thanos instances, one for the sidecar and one for the query webui.
This is definitely a reasonable-looking objective (Prometheus locally with recent data, remote Thanos with more data). Based on the diagram above I'd expect that to work, although it would have been broken until that PR yesterday, and as mentioned in #350 I'm not aware of anyone using remote_read into promxy.
One thing I'd suggest looking into as an efficiency improvement is getting promxy in front of the thanos querier. Promxy can fan a query out to many different nodes and requires significantly fewer resources to get the answer. I added an example explaining this a bit here, but TL;DR: remote_read is an inefficient interface for queries. If promxy could sit in front of the stack, some queries (those over "recent" data) could be served through promxy using the regular query interface, which is significantly cheaper. In particular, alerting would be dramatically cheaper, since it acts on recent data.
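To make "the regular query interface" concrete, here is a minimal sketch of an instant query against the standard Prometheus HTTP API endpoint `/api/v1/query`, which promxy also serves. The base URLs mentioned afterwards and the helper names (`build_query_url`, `series_count`, `instant_query`) are assumptions for illustration, not something taken from this setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql, time=None):
    """Build an instant-query URL for the Prometheus-compatible HTTP API."""
    params = {"query": promql}
    if time is not None:
        # Optional evaluation timestamp (unix seconds or RFC 3339).
        params["time"] = time
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


def series_count(payload):
    """Count the series in a decoded /api/v1/query response body."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    # Vector and matrix results are lists of series; scalar and string
    # results are a single [timestamp, value] pair.
    if data["resultType"] in ("vector", "matrix"):
        return len(data["result"])
    return 1


def instant_query(base_url, promql, timeout=10.0):
    """Run an instant query and return the decoded JSON body."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `series_count(instant_query(base, "up"))` once with promxy's address (for example `http://localhost:8082`, an assumed bind address) and once with a Prometheus instance (default `http://localhost:9090`) is a quick way to confirm that promxy itself sees the data before looking at the remote_read path.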
I did see #352, but that seems to be a go.mod issue; in reality promxy is currently based on a Prometheus 2.10 fork, so that should be a non-issue. It does mean we aren't new enough to have the STREAMED_XOR_CHUNKS option, and it's also possible that Prometheus 2.10 had a bug in the SAMPLES interface (it wouldn't surprise me; all the remote_read/write stuff is "unsupported" or "experimental", so bugs show up there with some regularity). So I'd suggest trying your setup with Prometheus 2.10: if you see the same problem there, it's likely an issue in the Prometheus dep (which means it'd be time to update again).
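For readers unfamiliar with the two response types, the following is a rough model of how a remote_read server might pick one. It is a sketch of the general idea only, not code from Prometheus, Thanos, or promxy: the enum mirrors the two names discussed in this thread (the sample-based type is called SAMPLES in the protobuf definition, as far as I know), and `negotiate` and `legacy_response_type` are hypothetical helpers. The assumption that a server predating streamed reads simply answers with samples is consistent with the comment above but has not been checked against the 2.10 code.

```python
from enum import Enum


class ResponseType(Enum):
    # Names follow the remote_read response types discussed in this thread.
    SAMPLES = 0
    STREAMED_XOR_CHUNKS = 1


def negotiate(accepted, supported):
    """Return the first client-accepted type that the server supports.

    An empty `accepted` list is treated as "samples only", which is the
    behaviour of clients that predate the streamed format.
    """
    if not accepted:
        accepted = [ResponseType.SAMPLES]
    for response_type in accepted:
        if response_type in supported:
            return response_type
    raise ValueError(f"none of {accepted} is supported; server supports {supported}")


def legacy_response_type(accepted):
    """Model of a server that predates streamed reads (assumption).

    It does not know about the accepted-types field at all, so it always
    answers with samples regardless of what the client asked for.
    """
    return ResponseType.SAMPLES
```

Under this model, a client that prefers STREAMED_XOR_CHUNKS but also accepts SAMPLES still gets a usable reply from an older server, so a format mismatch alone would not obviously explain an empty result.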
Related to #350 and #351
In trying to debug this, it seems that when pointing sidecar directly at Prometheus, it logs
but when pointing sidecar at promxy, it instead logs
After reading your comment yesterday, #352 (comment), combined with finding this change, prometheus/prometheus@48b2c9c, I'm wondering if Thanos is expecting the remote_read reply in STREAMED_XOR_CHUNKS format instead of SAMPLES. (edit: It appears Thanos accepts both, although the STREAMED codepath is certainly more tested now, as Prometheus uses it by default: https://github.com/thanos-io/thanos/blob/a7b2a449ce9aa77cc225a699c1f399a3528d97b3/pkg/store/prometheus.go#L206-L216). It's entirely possible it's not that, but it is one difference I observed. This bug may possibly also be fixed by #352.
Before I start digging deep into the Prometheus/Thanos/Promxy code again, is there anything that jumps to mind that could cause this behaviour?
Thanks