Use of startTimeMillis for findTraces API when es-aliases are used
#2923
Comments
@sivatarunp that's an interesting find; did you get a chance to test this out to see how much of an improvement this change would make to the FindTraces query? If so, could you share those numbers? |
@albertteoh, we have a couple of things here.
|
@sivatarunp are you able to quantify the query time improvement by providing some numbers from your tests? |
@albertteoh Here is a panel I could build from the Grafana community dashboards. The current query took more than 4 minutes and even timed out, whereas with the above changes results came back in under 1 minute for the same data set. |
Thanks for that, @sivatarunp. Your proposal sounds reasonable to me. It's not entirely clear to me why the Are you able to provide a contribution from the change you have tested? |
Looks like this was fixed already. |
Requirement - what kind of business use case are you trying to solve?
Improve the query performance in Jaeger, for Elasticsearch storage, when --es.use-aliases is set to true
Problem - what in Jaeger blocks you from solving the requirement?
Currently, when we use aliases for Elasticsearch storage, the findTraces API queries all indices under the jaeger-span-read alias, irrespective of the time range given in the UI. As a result, when the data set is huge, a significant amount of time is spent querying shards that hold no data in the given time range.
Proposal - what do you suggest to solve the problem or improve the existing situation?
The findTraces API uses the `startTime` field for querying, which is a `long` field. Elasticsearch has a built-in feature to skip shards before querying when the query is a range query on a `date` field. https://discuss.elastic.co/t/timeline-query-on-timestamped-indices/129328/2 is the related discussion.
Hence, modifying the findTraces API to use the `startTimeMillis` field (which is already present in the data we store and is a `date` type field) can help skip unnecessary shards and so improve query performance.
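To make the proposal concrete, here is a minimal sketch of the two range clauses being compared. It is not Jaeger's actual query-building code. The helper names are hypothetical, and the sketch assumes `startTime` holds microseconds since the Unix epoch while `startTimeMillis` holds milliseconds since the epoch.

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(ts: datetime) -> int:
    """Whole milliseconds since the Unix epoch, using integer arithmetic."""
    return (ts - EPOCH) // timedelta(milliseconds=1)


def to_epoch_micros(ts: datetime) -> int:
    """Whole microseconds since the Unix epoch."""
    return (ts - EPOCH) // timedelta(microseconds=1)


def start_time_range(start: datetime, end: datetime) -> dict:
    """Range clause on the long field (assumed to be in microseconds).

    Hypothetical helper illustrating the current behaviour described above.
    """
    return {
        "range": {
            "startTime": {
                "gte": to_epoch_micros(start),
                "lte": to_epoch_micros(end),
            }
        }
    }


def start_time_millis_range(start: datetime, end: datetime) -> dict:
    """Range clause on the date field, as proposed in this issue.

    Hypothetical helper; "format" tells Elasticsearch how to parse the bounds.
    """
    return {
        "range": {
            "startTimeMillis": {
                "gte": to_epoch_millis(start),
                "lte": to_epoch_millis(end),
                "format": "epoch_millis",
            }
        }
    }


if __name__ == "__main__":
    start = datetime(2021, 4, 1, tzinfo=timezone.utc)
    end = datetime(2021, 4, 2, tzinfo=timezone.utc)
    print(json.dumps(start_time_millis_range(start, end), indent=2))
```

Both clauses select the same spans. The difference is only in the field type being queried, which is what the shard-skipping behaviour described in the linked discussion depends on.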