Fix pushdown with deduplication on #4987
Here are some test cases to show some bugs in pushdown, unfortunately :/ cc @fpetkovski
b1d830e to 89f78f4
Actually, I think what happens is that query_range aligns the timestamps for us, so the deduplication doesn't work properly: all the timestamps are the same, and the dedup depends on the timestamps :/
Hm, I wonder if we could inject a synthetic label when a query is evaluated in the sidecar.
I think we could leverage series response hints for that. I'd prefer not to add metadata to labels, to avoid allocations. In my opinion, we could solve the problem shown in these tests with a custom iterator that performs these functions (max/min) on top of all series from each replica to replicate the old behavior. What do you think, @fpetkovski?
I have pushed a new pushdown series iterator that performs the given function over all replicas before returning the result. What do you think about this approach, @bwplotka @fpetkovski? We just need to decide how we'd like to indicate in SeriesResponse that results have been precomputed: via labels, via hints, or some other mechanism. I'd like to know your thoughts 🤗
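For readers following along, the idea of "performing the function over all replicas" can be sketched roughly as below. This is only an illustration, not the iterator from this PR: `Sample`, `mergeReplicas`, `maxOf`, and `minOf` are hypothetical names, and the sketch assumes each replica's samples are sorted by strictly increasing timestamp.

```go
package main

// Sample is a hypothetical (timestamp, value) pair; it is not a Thanos type.
type Sample struct {
	T int64
	V float64
}

// maxOf and minOf are illustrative combine functions for max/min pushdown.
func maxOf(a, b float64) float64 {
	if a > b {
		return a
	}
	return b
}

func minOf(a, b float64) float64 {
	if a < b {
		return a
	}
	return b
}

// mergeReplicas walks all replicas in timestamp order and, for every
// timestamp, folds the values reported by the replicas that have a sample
// at that timestamp with combine. Replicas missing a timestamp are skipped.
func mergeReplicas(combine func(a, b float64) float64, replicas ...[]Sample) []Sample {
	pos := make([]int, len(replicas)) // read position per replica
	var out []Sample
	for {
		// Find the smallest pending timestamp across all replicas.
		var next int64
		found := false
		for i, r := range replicas {
			if pos[i] >= len(r) {
				continue
			}
			if t := r[pos[i]].T; !found || t < next {
				next, found = t, true
			}
		}
		if !found {
			return out // every replica is exhausted
		}
		// Combine the values of all replicas positioned at that timestamp.
		var acc float64
		first := true
		for i, r := range replicas {
			if pos[i] >= len(r) || r[pos[i]].T != next {
				continue
			}
			if first {
				acc, first = r[pos[i]].V, false
			} else {
				acc = combine(acc, r[pos[i]].V)
			}
			pos[i]++
		}
		out = append(out, Sample{T: next, V: acc})
	}
}
```

With replicas `a = {(10, 1), (20, 5)}` and `b = {(10, 3), (20, 2), (30, 7)}`, `mergeReplicas(maxOf, a, b)` yields `(10, 3), (20, 5), (30, 7)`. Because query_range aligns timestamps, both replicas report samples at the same instants, which is why the values are combined per timestamp instead of picking one replica.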
Can we add a test for the case where Querier is pushing a query to 2 different sets of replicated sidecars? For example, s1 and s2 being replicas of each other, and s3 and s4 also being replicas of each other. I am wondering what would happen if s1 and s3 have overlapping series: would the new iterator deduplicate them?
Add some cases for pushdown with deduplication to test how it works. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Add a pushdown series iterator that performs functions over replicas before returning control to the promql engine. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
ca3d787 to 0aeccbd
Could you please elaborate on this with some examples? Because the deduplication should work the same. I have pushed v2 of this change:
I think this approach is much cleaner and handles all cases. PTAL @bwplotka @fpetkovski 🙏
Do not reset the slice if that is not needed. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
@yeya24 maybe this would be interesting to you as well
Change the name to not have clashes. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Don't call Seek() directly, iterate gently. Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
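As a rough illustration of "iterating gently" instead of seeking, the sketch below advances an iterator one sample at a time until it reaches a target timestamp. It is not the code from this commit: `sampleIterator`, `sliceIterator`, and `advanceTo` are hypothetical names, and the interface is a simplified stand-in for a real series iterator.

```go
package main

// sampleIterator is a hypothetical, minimal iterator interface used only
// for this sketch; it is not the Prometheus or Thanos iterator type.
type sampleIterator interface {
	Next() bool           // moves to the next sample, false when exhausted
	At() (int64, float64) // returns the current timestamp and value
}

// sliceIterator is a toy in-memory implementation of sampleIterator.
type sliceIterator struct {
	ts []int64
	vs []float64
	i  int
}

func newSliceIterator(ts []int64, vs []float64) *sliceIterator {
	return &sliceIterator{ts: ts, vs: vs, i: -1} // positioned before the first sample
}

func (s *sliceIterator) Next() bool {
	if s.i+1 >= len(s.ts) {
		return false
	}
	s.i++
	return true
}

func (s *sliceIterator) At() (int64, float64) {
	return s.ts[s.i], s.vs[s.i]
}

// advanceTo steps forward with Next until the current sample's timestamp
// is >= t. It always moves at least one sample, so every call makes
// progress or reports exhaustion, and the loop is guaranteed to terminate.
func advanceTo(it sampleIterator, t int64) bool {
	for it.Next() {
		if ts, _ := it.At(); ts >= t {
			return true
		}
	}
	return false
}
```

The design point is that each call either consumes samples or returns false, so a caller looping on `advanceTo` cannot spin on the same position.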
Made a small picture showing how it works. @fpetkovski @bwplotka friendly ping 🤗
Given the upcoming release and no other comments, I'm likely to merge this to avoid shipping a broken feature. Plus, everything is behind |
* e2e: add dedup cases for pushdown
  Add some cases for pushdown with deduplication to test how it works.
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* query: add pushdown series iterator
  Add a pushdown series iterator that performs functions over replicas before returning control to the promql engine.
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* pushdown: add package
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* pushdown: fix bug in iterator
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* pushdown: fix grammar mistake
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* *: rework pushdown iterator
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* dedup: cleanups
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* dedup: add if
  Do not reset the slice if that is not needed.
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* dedup: use helpers
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* e2e: fix test
  Change the name to not have clashes.
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* query: ensure pushdown label is at the end
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* dedup: avoid infinite loop
  Don't call Seek() directly, iterate gently.
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* dedup: fix bugs with gaps, add test cases
  Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

Signed-off-by: Nicholaswang <wzhever@gmail.com>
Add some cases for pushdown with deduplication to test how it works.
There are some bugs here :'/
Fix them by creating a new series iterator that performs the given function over all replicas.