Tempo queries are very slow when using some tags #2639

ogxd · 2023-07-11T21:44:08Z

ogxd
Jul 11, 2023

Hello,

I have been using Tempo for a few months now and I am having major performance issues when querying.

Context

I use Grafana open source (I might switch to Enterprise soon, but I don't think it makes any difference regarding performance)
All components are hosted in GCP and run in Kubernetes
I am currently the only active user (so there is no load, only the queries I run)
We receive about 10 traces per second. Each trace has about 5 to 20 spans.
Some span tags store large payloads (JSON payloads of 16k characters, for instance). I was thinking this could be important; maybe it is the reason Tempo is underperforming?
I have many different span tag names (some are autogenerated with numbers in them, such as span.somefield.1244.value="somevalue"). I was thinking this could also interfere with some kind of indexing?

Observations

Here are a few TraceQL queries and the time each one takes:

TraceQL Query	Time Range	Execution Time
{ span.browser_id = "will never match" }	30 days	timeout
{ span.browser_id = "will never match" }	1 day	timeout
{ span.browser_id = "will never match" }	6 hours	28s
{ span.browser_id = "will never match" }	1 hour	6s
{ span.http.method = "will never match" }	30 days	3s
{ span.http.method = "will never match" }	1 hour	94ms
{ name = "will never match" }	30 days	4s
{ name = "will never match" }	1 hour	120ms

As you can see, some tags are extremely slow to query, while others are very fast. I wonder what the logic behind this is. Maybe some tags are indexed and some are not? If so, could that be because of the large number of tag names I have?

Thanks in advance for your help

Answered by mapno

mapno · 2023-07-12T10:49:19Z

mapno
Jul 12, 2023
Maintainer

Hi @ogxd. To give a bit of context on how Tempo's read path works:

Tempo doesn’t do indexing of attributes, but rather uses a columnar storage format to support the requirements you’re referencing regarding search. We use Apache Parquet.

This lets Tempo pull individual columns (each mapped to a different attribute) from storage when searching with TraceQL, reading far less data than a more common row-wise model would. So, for the query { resource.namespace = "prod" }, Tempo will pull only the single column resource.namespace (roughly; it’s more complex than just that).

The problem you're experiencing comes from the static schema we work with in Parquet. In this schema, intrinsic fields, such as span name, service name or span kind, and special attributes such as namespace, get dedicated columns, whereas all other attributes go into a generic key-value slice. That's why you see those speed differences when you search by attributes that have dedicated columns vs. attributes that don't.
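To make the dedicated-column vs. generic-slice difference concrete, here is a minimal toy model in Python. It is only a sketch: the data layout, the function names (`build_spans`, `search_dedicated`, `search_generic`) and the "characters scanned" cost measure are all assumptions made for illustration, and none of it reflects Tempo's actual Parquet schema or reader.

```python
# Toy model only: NOT Tempo's schema or read path.
# Each span has a dedicated "name" field plus a generic list of key/value pairs.

def build_spans(n):
    """Create n hypothetical spans with a large payload attribute each."""
    return [
        {
            "name": f"op-{i % 5}",
            "attrs": [
                ("browser_id", f"b-{i}"),
                ("payload", "x" * 1000),                 # stand-in for a big JSON tag
                (f"somefield.{i}.value", "somevalue"),   # autogenerated tag name
            ],
        }
        for i in range(n)
    ]

def search_dedicated(column, wanted):
    """Scan a single dedicated column; cost is only that column's data."""
    scanned = sum(len(value) for value in column)
    matches = [i for i, value in enumerate(column) if value == wanted]
    return matches, scanned

def search_generic(rows, key, wanted):
    """Scan the generic key/value slice; every pair is visited (assumed)."""
    scanned = 0
    matches = []
    for i, attrs in enumerate(rows):
        for k, v in attrs:
            scanned += len(k) + len(v)
            if k == key and v == wanted:
                matches.append(i)
    return matches, scanned

spans = build_spans(1000)
name_column = [s["name"] for s in spans]      # dedicated column
generic_rows = [s["attrs"] for s in spans]    # generic key-value slice

_, dedicated_cost = search_dedicated(name_column, "will never match")
_, generic_cost = search_generic(generic_rows, "browser_id", "will never match")
print("dedicated column, characters scanned:", dedicated_cost)
print("generic slice, characters scanned:", generic_cost)
```

In this toy model the generic search touches every key and value, including the large payloads, even though the query only cares about `browser_id`, while the dedicated-column search touches only span names.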

Something that will improve this is the new block format we're working on, vParquet3. You can follow progress here and read how it works here. In short, you'll be able to dynamically define dedicated columns for any attribute.

2 replies

ogxd Jul 12, 2023
Author

Thanks a lot for this detailed answer! I really appreciate your help.
So vParquet3 will enable manual configuration of dedicated columns for specific attributes. For the other attributes that end up in this "generic key-value slice", do you think reducing the number of attributes can improve performance? (Not the amount of data, just limiting the cardinality of attribute names.) I would like your opinion on this, to see if it's worth running a benchmark on my end.

mapno Jul 13, 2023
Maintainer

Reducing attribute cardinality shouldn't have an impact on the read path. What would help is reducing the number of attributes, because it would make the generic attribute slice smaller and faster to iterate.
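The distinction between attribute-name cardinality and attribute count can be sketched with a few lines of Python. This is an illustrative model under the assumption that scan cost grows with the number of key/value pairs visited; the helper names and datasets are hypothetical and it does not measure Tempo itself.

```python
# Illustrative only: assumes scan cost is proportional to the number of
# key/value pairs in the generic slice, regardless of how many names are distinct.

def pairs_scanned(spans):
    """Number of key/value pairs a full scan of the generic slice would visit."""
    return sum(len(attrs) for attrs in spans)

def distinct_names(spans):
    """Cardinality of attribute names across all spans."""
    return len({key for attrs in spans for key, _ in attrs})

# Same number of attributes per span, very different name cardinality.
low_cardinality = [[("a", "1"), ("b", "2"), ("c", "3")] for _ in range(1000)]
high_cardinality = [[(f"field.{i}.{j}", "v") for j in range(3)] for i in range(1000)]

# Fewer attributes per span.
fewer_attributes = [[("a", "1")] for _ in range(1000)]

for label, data in [
    ("low cardinality", low_cardinality),
    ("high cardinality", high_cardinality),
    ("fewer attributes", fewer_attributes),
]:
    print(label, "names:", distinct_names(data), "pairs scanned:", pairs_scanned(data))
```

Under this assumption, the first two datasets cost the same to scan despite their very different numbers of distinct names, and only the dataset with fewer attributes per span is cheaper.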

joe-elliott · 2023-07-12T12:39:54Z

joe-elliott
Jul 12, 2023
Maintainer

We continue to make performance improvements in every release. What version of Tempo are you using? Some tunables that can help:

query_frontend:
  max_outstanding_per_tenant: # increasing this will increase the total jobs of any kind per tenant. needs to be bumped if you bump concurrent_jobs below
  search:
    concurrent_jobs: # increasing this number will increase the number of search jobs per query allowed to run at once
    target_bytes_per_job: # this is a balancing act. smaller values make more smaller jobs. tempo currently struggles to push through huge numbers of jobs, but 2.2 will have some improvements in this area. we use 200MB right now, but i want to go smaller with better job throughput

querier:
  max_concurrent_queries: # increasing this number will increase how many jobs each querier will take on at once

storage:
  trace:
    block:
      parquet_row_group_size_bytes: # the smallest a traceql job can be is one row group. decreasing this value will increase your footer size but will allow tempo to break a traceql query into smaller jobs. we use 50MB right now.
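As a rough way to reason about how these settings interact, the back-of-the-envelope sketch below estimates how many search jobs a query might be split into. It is a simplification built only on the comments above (smaller target sizes give more jobs, and a job cannot be smaller than one row group); the function names, the 10 GB data size and the concurrency of 20 are assumptions, and Tempo's real job sharding is more involved.

```python
import math

MB = 1_000_000  # decimal megabytes, chosen for simplicity

def estimate_jobs(total_bytes, target_bytes_per_job, row_group_size_bytes):
    """Estimate the job count, assuming a job is never smaller than one row group."""
    effective_job_size = max(target_bytes_per_job, row_group_size_bytes)
    return math.ceil(total_bytes / effective_job_size)

def estimate_waves(jobs, concurrent_jobs):
    """Estimate how many sequential batches are needed at a given concurrency."""
    return math.ceil(jobs / concurrent_jobs)

total = 10_000 * MB  # hypothetical amount of block data in the query's time range

# Values quoted in the thread: 200MB target job size, 50MB row groups.
jobs_default = estimate_jobs(total, 200 * MB, 50 * MB)

# A smaller target is capped by the row group size in this model.
jobs_small_target = estimate_jobs(total, 20 * MB, 50 * MB)

print("jobs at 200MB target:", jobs_default)
print("jobs at 20MB target:", jobs_small_target)
print("waves at concurrency 20:", estimate_waves(jobs_default, 20))
```

In this model, lowering target_bytes_per_job below the row group size yields no additional jobs, which is why the two settings are worth tuning together.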

1 reply

ogxd Jul 12, 2023
Author

Oh sorry, this is an important part of the context. I am on 2.1.1:

tempo, version 2.1.1 (branch: HEAD, revision: 4157d7620)
  build user:       
  build date:       
  go version:       go1.20.3
  platform:         linux/amd64
  tags:             unknown

Thanks a lot for the pointers to these configuration options. I'll see if I can improve performance by tuning them.

pingping95 · 2023-08-10T18:12:52Z

pingping95
Aug 10, 2023

@mapno Thanks for such a great reply!

I have a question: in which Tempo version do you expect vParquet3 to be released?

1 reply

mapno Aug 22, 2023
Maintainer

vParquet3 is already available in main, and will be part of the v2.3 release. We don't have a release date yet, but likely a couple of months.
