Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a startTime and endTime parameter to the trace by id query API to improve query performance #1373

Closed
sagarwala opened this issue Apr 11, 2022 · 2 comments · Fixed by #1388
Assignees
Milestone

Comments

@sagarwala
Copy link
Contributor

Is your feature request related to a problem? Please describe.
The query trace by id feature in tempo searches all blocks in blocklist to find relevant blocks and merge trace objects found in these blocks. However, in environments that have high loads, even with compactors, the blocklist can extend upto thousands like 18000 etc, blocks and parsing through each of them affects performance. The current problem that we see is, each block has minId and maxId spanning across the maximum ranges which makes all blocks in blocklist an eligible candidate for parsing for a given trace id.

Describe the solution you'd like
We have come up with a time based search approach, that will include blocks for search based on the start time and end time of the blocks. The start time for the query is retrieved based on tags search. Default behaviour remains same if the time window is not specified.
Currently, in order to use the trace by id query, we have an elastic search deployed that gives us the trace ids based on specific tags or attributes. These objects also return the time window for the traces. We use the same window to query trace object by id so that the blocks outside the window are excluded and hence, the latency decreases and performance is better.

Describe alternatives you've considered
We have tried experimenting with compactor to merge only contiguous blocks so that we can restrict the minId and maxId of compacted blocks, as opposed to the current implementation of sorting by block age but we have not been able to scale using this approach. So, we think the alternate way to improve the query performance is to restrict the search for given time windows.

Additional context
The API spec would now be /api/traces/{traceID}?timeEnd={end}&timeStart={start}
A snapshot of the code changes would be : https://github.com/grafana/tempo/compare/v1.3.2...sagarwala:v1.3.2-branch?expand=1

@joe-elliott
Copy link
Member

I really like this idea and have considered adding into Tempo before. I think the real challenge will be getting Grafana support. For some trace by id queries a time range could be retrieved from a logs instance, but for others we would still need to do the full block range search.

@bikashmishra100
Copy link
Contributor

bikashmishra100 commented Apr 18, 2022

It seems the time range is always passed from Grafana and it is part of the core model. Does this diff look fine to you ? grafana/grafana@main...bikashmishra100:tempo

If this looks fine, I will time this appropriately as per your advice for the other PR for Tempo that is in progress and other release timelines that may be adhered to.

cc: @joe-elliott

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
3 participants