colexec: propagate the set of needed columns in table reader spec #56540
This commit propagates the set of needed columns via the table reader spec, and that information is now used when setting up the ColBatchScans. The row-by-row engine is not affected since it still needs to set up the ProcOutputHelpers, but that is no longer needed in the vectorized engine, which gives us a couple of percent improvement on the KV microbenchmark.
Release note: None
That's huge. How much?
Reviewed 11 of 11 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/execinfrapb/processors_sql.proto, line 144 at r1 (raw file):
// Indicates the ordinals of the columns values for which are needed by the
// post-processing stage and, therefore, are to be populated. It is ignored
// if is_check is true.
What happens if none are specified? Can that happen?
I might have called it out too early: further benchmarking showed no difference, and on the actual KV95 workload there appears to be a slowdown of 0.2% or so. All the benchmarks I'm currently running have a non-negligible degree of variance, so it might be noise, since I expect this change to be beneficial because it removes one of the big sources of allocations from the heap profile. Anyway, I'll rerun them shortly.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)
pkg/sql/execinfrapb/processors_sql.proto, line 144 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
What happens if none are specified? Can that happen?
Then no values will be populated, and all columns are projected out by the post-processing spec. This could happen for a query like SELECT count(*) FROM t.
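To make the empty case concrete, here is a minimal Go sketch of the idea. It is not the actual ColBatchScan code: `tableReaderSpec`, `neededColumns`, and `scan` are hypothetical names used only for illustration, and rows are modeled as plain `[]int64` slices. It shows that with an empty set of needed columns no values are populated, yet the row count is still available, which is all a query like SELECT count(*) FROM t needs.

```go
package main

import "fmt"

// tableReaderSpec is a hypothetical, simplified stand-in for the real spec.
// Only the piece discussed in this thread is modeled.
type tableReaderSpec struct {
	// neededColumns holds the ordinals of the columns whose values the
	// post-processing stage needs (assumed name, for illustration only).
	neededColumns []uint32
}

// scan populates only the needed columns of each row and reports how many
// rows were seen. Columns that are not needed are never copied.
func scan(spec tableReaderSpec, rows [][]int64) ([][]int64, int) {
	projected := make([][]int64, 0, len(rows))
	for _, row := range rows {
		out := make([]int64, 0, len(spec.neededColumns))
		for _, ord := range spec.neededColumns {
			out = append(out, row[ord])
		}
		projected = append(projected, out)
	}
	return projected, len(rows)
}

func main() {
	rows := [][]int64{{1, 10, 100}, {2, 20, 200}, {3, 30, 300}}

	// Columns 0 and 2 are needed, so column 1 is never populated.
	vals, n := scan(tableReaderSpec{neededColumns: []uint32{0, 2}}, rows)
	fmt.Println(vals, n) // [[1 100] [2 200] [3 300]] 3

	// No columns are needed (the count(*) case): nothing is populated,
	// but the number of rows is still known.
	vals, n = scan(tableReaderSpec{}, rows)
	fmt.Println(vals, n) // [[] [] []] 3
}
```

In this sketch the cost of populating a row scales with the number of needed columns rather than with the table width, which is the kind of saving the change is after.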
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)
Microbenchmarks:
As I expected, the KV-like benchmark shows a non-negligible improvement, but it has high variance, so I had to significantly increase the run count to filter out the noise. I will run the KV95 and TPCC benchmarks with this PR and see where we stand. TFTR!
bors r+
Build succeeded:
Addresses: #53893
Release note: None