Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TraceQL] Multiple &&ed or ||ed should return unique spans #2254

Merged
merged 5 commits into from
Mar 23, 2023

Conversation

joe-elliott
Copy link
Member

What this PR does:
Forces the spanset operations && and || to always return a unique spanset. Before they would return the cross product of all spansets. For instance this query { true } && { true } would return a spanset that was double the input trace. Now it will return a spanset that is equal to the input trace.

Which issue(s) this PR fixes:
Fixes #2246

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: Joe Elliott <number101010@gmail.com>
@joe-elliott
Copy link
Member Author

joe-elliott commented Mar 23, 2023

@stoewer heads up that a small change to vparquet is in this PR:

https://github.com/grafana/tempo/pull/2254/files#diff-2cde8510d11a7de630f0fcbffae3a5419968cd3fa838206129f5c4ab40cf27bbR301

We had previously agreed to leave vparquet alone since vparquet2 is up, but I would like to go ahead and merge this since it's a bug fix and only a few lines. Are you ok with that?

tempodb/encoding/vparquet/block_traceql.go Outdated Show resolved Hide resolved

// BenchmarkUniquespans benchmarks the performance of the uniqueSpans function using
// different numbers of spansets and spans.
func BenchmarkUniqueSpans(b *testing.B) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat, did the benchmarks show a benefit from the current approach? It seems like it would be the same number of insertions and lookups at first glance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried building the dictionary using the first, largest and smallest spanset. Smallest outperformed the others. Here is first -> smallest:

name                       old time/op    new time/op    delta
UniqueSpans/10000|1-8         152µs ± 1%     169µs ±15%  +11.23%  (p=0.032 n=5+5)
UniqueSpans/100|1-8          2.20µs ±23%    2.25µs ± 5%     ~     (p=0.968 n=5+5)
UniqueSpans/1|1-8             112ns ±12%     112ns ±18%     ~     (p=0.738 n=5+5)
UniqueSpans/10000|100-8       364µs ±15%     345µs ±10%     ~     (p=0.548 n=5+5)
UniqueSpans/100|100-8        12.5µs ±21%    14.5µs ±17%     ~     (p=0.095 n=5+5)
UniqueSpans/1|100-8          9.80µs ±11%    2.01µs ± 7%  -79.47%  (p=0.008 n=5+5)
UniqueSpans/10000|10000-8    1.62ms ±12%    1.60ms ±13%     ~     (p=1.000 n=5+5)
UniqueSpans/100|10000-8      1.45ms ±13%    0.39ms ± 6%  -73.18%  (p=0.008 n=5+5)
UniqueSpans/1|10000-8        1.21ms ± 7%    0.23ms ±25%  -81.21%  (p=0.008 n=5+5)

name                       old alloc/op   new alloc/op   delta
UniqueSpans/10000|1-8         164kB ± 0%     164kB ± 0%     ~     (all equal)
UniqueSpans/100|1-8          1.79kB ± 0%    1.79kB ± 0%     ~     (all equal)
UniqueSpans/1|1-8             32.0B ± 0%     32.0B ± 0%     ~     (all equal)
UniqueSpans/10000|100-8       169kB ± 0%     169kB ± 0%     ~     (p=0.992 n=5+5)
UniqueSpans/100|100-8        8.43kB ± 0%    8.42kB ± 0%   -0.02%  (p=0.000 n=4+5)
UniqueSpans/1|100-8          7.02kB ± 0%    1.79kB ± 0%     ~     (p=0.079 n=4+5)
UniqueSpans/10000|10000-8    1.00MB ± 0%    1.00MB ± 0%     ~     (p=0.548 n=5+5)
UniqueSpans/100|10000-8       840kB ± 0%     169kB ± 0%  -79.87%  (p=0.008 n=5+5)
UniqueSpans/1|10000-8         840kB ± 0%     164kB ± 0%  -80.49%  (p=0.008 n=5+5)

name                       old allocs/op  new allocs/op  delta
UniqueSpans/10000|1-8          1.00 ± 0%      1.00 ± 0%     ~     (all equal)
UniqueSpans/100|1-8            1.00 ± 0%      1.00 ± 0%     ~     (all equal)
UniqueSpans/1|1-8              1.00 ± 0%      1.00 ± 0%     ~     (all equal)
UniqueSpans/10000|100-8        9.00 ± 0%      9.00 ± 0%     ~     (all equal)
UniqueSpans/100|100-8          9.00 ± 0%      9.00 ± 0%     ~     (all equal)
UniqueSpans/1|100-8            9.00 ± 0%      1.00 ± 0%  -88.89%  (p=0.008 n=5+5)
UniqueSpans/10000|10000-8       214 ± 0%       214 ± 0%     ~     (all equal)
UniqueSpans/100|10000-8         214 ± 0%         9 ± 0%     ~     (p=0.079 n=4+5)
UniqueSpans/1|10000-8           214 ± 0%         1 ± 0%  -99.53%  (p=0.008 n=5+5)

Signed-off-by: Joe Elliott <number101010@gmail.com>
@joe-elliott joe-elliott merged commit 0ed2362 into grafana:main Mar 23, 2023
mdisibio pushed a commit to mdisibio/tempo that referenced this pull request Apr 18, 2023
)

* added uniquespans bench and test

Signed-off-by: Joe Elliott <number101010@gmail.com>

* more tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* sort spans slice

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* clarify

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[TraceQL] &&ing spansets into count() will give erroneous results
2 participants