colexec: propagate the set of needed columns in table reader spec #56540

yuzefovich · 2020-11-11T03:12:45Z

This commit adds the propagation of the set of needed columns via the
table reader spec and that information is now used when setting up the
ColBatchScans. The row-by-row engine is not affected since it still
needs to set up the ProcOutputHelpers, but that is no longer needed in
the vectorized engine which gives us a couple of percent improvement on
KV microbenchmark.

Addresses: #53893

Release note: None

This commit adds the propagation of the set of needed columns via the table reader spec and that information is now used when setting up the ColBatchScans. The row-by-row engine is not affected since it still needs to set up the ProcOutputHelpers, but that is no longer needed in the vectorized engine which gives us a couple of percent improvement on KV microbenchmark. Release note: None

cockroach-teamcity · 2020-11-11T03:12:53Z

This change is

asubiotto · 2020-11-11T11:10:03Z

a couple of percent improvement on KV microbenchmark.

That's huge. How much?

asubiotto

Reviewed 11 of 11 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)

pkg/sql/execinfrapb/processors_sql.proto, line 144 at r1 (raw file):

  // Indicates the ordinals of the columns values for which are needed by the
  // post-processing stage and, therefore, are to be populated. It is ignored
  // if is_check is true.

What happens if none are specified? Can that happen?

yuzefovich

I might have called it out too early - further benchmarking showed no difference, and on actual KV95 workload there appears to be a slowdown of 0.2% or so. All the benchmarks I'm currently running have non-negligible degree of variance, so it might be noise since I expect this change to be beneficial because it removes one of the big sources of allocations from the heap profile. Anyway, I'll rerun them shortly.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)

pkg/sql/execinfrapb/processors_sql.proto, line 144 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

What happens if none are specified? Can that happen?

Then no values will be populated and all columns are projected out by the post-processing spec. This could happen for the query like SELECT count(*) FROM t.

asubiotto

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)

yuzefovich · 2020-11-11T18:56:17Z

Microbenchmarks:

full scan of an empty table with filter (what we have on master)

name                                          old time/op    new time/op    delta
FlowSetup/vectorize=true/distribute=true-24     22.5µs ± 6%    22.5µs ± 7%    ~     (p=1.000 n=19+20)
FlowSetup/vectorize=true/distribute=false-24    20.1µs ± 5%    21.4µs ±17%  +6.57%  (p=0.004 n=20+20)

name                                          old alloc/op   new alloc/op   delta
FlowSetup/vectorize=true/distribute=true-24     33.7kB ± 1%    32.8kB ± 0%  -2.82%  (p=0.000 n=20+19)
FlowSetup/vectorize=true/distribute=false-24    30.2kB ± 0%    29.3kB ± 0%  -3.08%  (p=0.000 n=19+18)

name                                          old allocs/op  new allocs/op  delta
FlowSetup/vectorize=true/distribute=true-24        316 ± 0%       312 ± 0%  -1.27%  (p=0.000 n=19+19)
FlowSetup/vectorize=true/distribute=false-24       285 ± 0%       281 ± 0%  -1.40%  (p=0.000 n=17+16)

modified benchmark to be a KV lookup with a single row in the table:

name                                           old time/op    new time/op    delta
FlowSetup/vectorize=true/distribute=true-24      21.9µs ± 9%    21.7µs ± 6%  -1.04%  (p=0.013 n=92+92)
FlowSetup/vectorize=true/distribute=false-24     20.5µs ± 6%    19.6µs ± 6%  -4.36%  (p=0.000 n=95+96)
FlowSetup/vectorize=false/distribute=true-24     20.9µs ± 8%    20.7µs ± 7%  -1.09%  (p=0.011 n=88+90)
FlowSetup/vectorize=false/distribute=false-24    19.6µs ± 5%    19.5µs ± 7%    ~     (p=0.057 n=95+98)

name                                           old alloc/op   new alloc/op   delta
FlowSetup/vectorize=true/distribute=true-24      33.7kB ± 0%    32.8kB ± 0%  -2.72%  (p=0.000 n=87+87)
FlowSetup/vectorize=true/distribute=false-24     30.3kB ± 0%    29.3kB ± 0%  -3.13%  (p=0.000 n=87+87)
FlowSetup/vectorize=false/distribute=true-24     34.3kB ± 0%    34.3kB ± 0%  -0.03%  (p=0.000 n=90+88)
FlowSetup/vectorize=false/distribute=false-24    30.9kB ± 0%    30.9kB ± 0%  +0.02%  (p=0.000 n=87+88)

name                                           old allocs/op  new allocs/op  delta
FlowSetup/vectorize=true/distribute=true-24         317 ± 1%       314 ± 1%  -0.98%  (p=0.000 n=100+100)
FlowSetup/vectorize=true/distribute=false-24        288 ± 0%       284 ± 0%  -1.39%  (p=0.000 n=100+100)
FlowSetup/vectorize=false/distribute=true-24        288 ± 0%       288 ± 0%  +0.11%  (p=0.000 n=100+100)
FlowSetup/vectorize=false/distribute=false-24       257 ± 0%       258 ± 0%  +0.58%  (p=0.000 n=100+100)

As I expected, the KV-like benchmark shows non-negligible improvement, but it has big variance, so I had to significantly increase the run count to filter the noise out.

I will run KV95 and TPCC benchmarks with this PR and see where we stand.

TFTR!

bors r+

craig · 2020-11-11T19:36:09Z

Build succeeded:

GitHub CI (Cockroach)

yuzefovich requested review from asubiotto and a team November 11, 2020 03:12

asubiotto suggested changes Nov 11, 2020

View reviewed changes

yuzefovich commented Nov 11, 2020

View reviewed changes

asubiotto approved these changes Nov 11, 2020

View reviewed changes

craig bot merged commit ba363c7 into cockroachdb:master Nov 11, 2020

yuzefovich deleted the vec-scan branch November 11, 2020 19:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

colexec: propagate the set of needed columns in table reader spec #56540

colexec: propagate the set of needed columns in table reader spec #56540

yuzefovich commented Nov 11, 2020

cockroach-teamcity commented Nov 11, 2020

asubiotto commented Nov 11, 2020

asubiotto left a comment

yuzefovich left a comment

asubiotto left a comment

yuzefovich commented Nov 11, 2020

craig bot commented Nov 11, 2020

colexec: propagate the set of needed columns in table reader spec #56540

colexec: propagate the set of needed columns in table reader spec #56540

Conversation

yuzefovich commented Nov 11, 2020

cockroach-teamcity commented Nov 11, 2020

asubiotto commented Nov 11, 2020

asubiotto left a comment

Choose a reason for hiding this comment

yuzefovich left a comment

Choose a reason for hiding this comment

asubiotto left a comment

Choose a reason for hiding this comment

yuzefovich commented Nov 11, 2020

craig bot commented Nov 11, 2020