-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
colexec: implement vectorized ntile window function #64977
Conversation
b7cabf9
to
577461d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work! I have some suggestions and a bunch of nit comments, and I will definitely want to take another close look to follow the logic, but I think the PR is in a very good shape.
nit: in the commit title, exec
-> either colexec
or colexecwindow
.
Reviewed 11 of 11 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)
Makefile, line 913 at r1 (raw file):
pkg/sql/colexec/colexecwindow/row_number.eg.go \ pkg/sql/colexec/colexecwindow/window_peer_grouper.eg.go \ pkg/sql/colexec/colexecwindow/ntile.eg.go
nit: keep these lexicographically sorted as other execgen targets.
pkg/sql/colexec/colexecwindow/BUILD.bazel, line 72 at r1 (raw file):
("row_number.eg.go", "row_number_tmpl.go"), ("window_peer_grouper.eg.go", "window_peer_grouper_tmpl.go"), ("ntile.eg.go", "ntile_tmpl.go"),
ditto
pkg/sql/colexec/colexecwindow/ntile.eg.go, line 72 at r1 (raw file):
// nTileLoading is the state in which ntile operators load an additional // input batch into the currentBatch field. If necessary, the old value of // currentBatch will be pushed to the buffer queue.
It would be helpful to mention under which events one state is changed into another.
pkg/sql/colexec/colexecwindow/ntile.eg.go, line 188 at r1 (raw file):
&colexecutils.NewSpillingQueueArgs{ UnlimitedAllocator: r.allocator, Types: append(r.inputTypes, types.Int),
nit: it might be worth extracting this out and performing a deep copy of
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 85 at r1 (raw file):
nTileLoading nTileState = iota // nTileSeeking is the state in which ntile operators seek to the index of the // next partition. And retrieve the number of ntile buckets for the current
super nit: two sentences read a bit weird for me, maybe combine them into one?
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 192 at r1 (raw file):
// {{end}} if partitionCol[i] { // Don't break for the start of the current partition.
I'm a bit confused by this comment, could you expand a bit?
Also the comment above _GET_NUM_BUCKETS
says that if a new partition begins at index i, this code snippet records the size of the partition, how is that done? Is the comment stale.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 193 at r1 (raw file):
if partitionCol[i] { // Don't break for the start of the current partition. // {{if $.HasSel}}
Seems like the code in the if
case is the same as in else
case.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 204 at r1 (raw file):
} // {{end}} if !argNulls.NullAt(i) {
nit: we usually template out "definitely no nulls" vs "maybe has nulls" cases, I'm not sure how important will be to do it here, possibly leave a TODO.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 267 at r1 (raw file):
} // */}} var errInvalidArgumentForNtile = pgerror.Newf(
nit: we could export the error in window_builtins.go
and reuse it here.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 287 at r1 (raw file):
&colexecutils.NewSpillingQueueArgs{ UnlimitedAllocator: r.allocator, Types: append(r.inputTypes, types.Int),
It might be a bit safer to allocate a deep copy of inputTypes
and append types.Int
to it. It's possible that the caller's input types have extra capacity, so there will be no new allocation when we append a single types.Int
, yet the caller might pass the same input types to a next operator which might modify it. I don't think that this would happen in practice at the moment because of our limited support of window functions, but in the future it could.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 329 at r1 (raw file):
n := batch.Length() sel := batch.Selection() r.allocator.PerformOperation(r.currentBatch.ColVecs(), func() {
I just realized that here we're performing a deep copy of outputIdx
vector because it was appended to the input batch by the vectorTypeEnforcer (we have the same issue in other window functions). However, copying the output column seems pointless, it might be worth leaving a TODO about this.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 350 at r1 (raw file):
argVec := r.currentBatch.ColVec(r.argIdx) argNulls := argVec.Nulls() argCol := argVec.Int64()
It would be great to add logic tests where the argument is of INT2 and INT4 types to make sure this works ok. (I think the unit test in colexecwindow
package would crash, but that's probably ok.)
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 356 at r1 (raw file):
sel := r.currentBatch.Selection() if sel != nil {
Hm, this if
has quite a bit of code duplication. From what I can tell the crux is that we want i
to serve the same meaning (the index of the row we're looking at) in sel and no-sel cases, yet we have different ways to get it; also in the latter case i
serves the same goal as selIdx
. I think we could use the same for
loop in both cases like
for selIdx := r.seekIdx; selIdx < n; selIdx++
and introduce a bit more templating to get the index with something like
// {{if .HasSel}}
i = sel[selIdx]
// {{else}}
i = selIdx
// {{end}}
(take a look at vec_to_datum_tmpl.go
for inspiration).
This should allows us to remove the code duplication here.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 362 at r1 (raw file):
// We must first find the first row for which num_buckets is not null. for selIdx, i = range sel[selIdx:n] { _GET_NUM_BUCKETS(true)
I think I would get rid off _GET_NUM_BUCKETS
and _COMPUTE_PARTITION_SIZE
entirely and inline their logic if we are templating out this whole nTileSeeking
block.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 376 at r1 (raw file):
r.nextPartitionIdx = selIdx // {{else}} r.partitionSize += len(sel[:n])
nit: this is just + n
, right?
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 387 at r1 (raw file):
// We must first find the first row for which num_buckets is not null. // {{if .HasPartition}} _ = partitionCol[i]
Is this line needed? I would assume given the line above it is not.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 399 at r1 (raw file):
// partition. if i < n { _ = partitionCol[i]
ditto
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 486 at r1 (raw file):
r.state = nTileProcessing if !r.hasArg { return
nit: a quick comment why we're short-circuiting here would be helpful.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 532 at r1 (raw file):
} var lastErr error if err := r.bufferQueue.Close(ctx); err != nil {
nit: you can just return the result of Close
.
pkg/sql/colexec/execgen/cmd/execgen/ntile_gen.go, line 1 at r1 (raw file):
// Copyright 2019 The Cockroach Authors.
nit: s/2019/2021/
.
pkg/sql/logictest/testdata/logic_test/window, line 1061 at r1 (raw file):
query IIII SELECT k, w, v, ntile(v) OVER (PARTITION BY w ORDER BY v, k) FROM kv ORDER BY 1
nit: maybe order the output by w, v, k
so that it was easier to check the results visually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look! Your comments have been very helpful.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @yuzefovich)
Makefile, line 913 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: keep these lexicographically sorted as other execgen targets.
Done.
pkg/sql/colexec/colexecwindow/BUILD.bazel, line 72 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
ditto
Done
pkg/sql/colexec/colexecwindow/ntile.eg.go, line 72 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It would be helpful to mention under which events one state is changed into another.
Done.
pkg/sql/colexec/colexecwindow/ntile.eg.go, line 188 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: it might be worth extracting this out and performing a deep copy of
Done. (I think, the comment got cut off.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 85 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
super nit: two sentences read a bit weird for me, maybe combine them into one?
Oops, done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 192 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I'm a bit confused by this comment, could you expand a bit?
Also the comment above
_GET_NUM_BUCKETS
says that if a new partition begins at index i, this code snippet records the size of the partition, how is that done? Is the comment stale.
The comment's a bit stale, yeah. I'll change it.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 193 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Seems like the code in the
if
case is the same as inelse
case.
Ah, that first i
is supposed to be selIdx
. Fixed it.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 204 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: we usually template out "definitely no nulls" vs "maybe has nulls" cases, I'm not sure how important will be to do it here, possibly leave a TODO.
I decided not to do that because the loop exits immediately upon the first non-null anyway, so it probably wouldn't make things much more efficient in the "no nulls" case. Do you still want me to add that TODO?
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 267 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: we could export the error in
window_builtins.go
and reuse it here.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 287 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It might be a bit safer to allocate a deep copy of
inputTypes
and appendtypes.Int
to it. It's possible that the caller's input types have extra capacity, so there will be no new allocation when we append a singletypes.Int
, yet the caller might pass the same input types to a next operator which might modify it. I don't think that this would happen in practice at the moment because of our limited support of window functions, but in the future it could.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 329 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I just realized that here we're performing a deep copy of
outputIdx
vector because it was appended to the input batch by the vectorTypeEnforcer (we have the same issue in other window functions). However, copying the output column seems pointless, it might be worth leaving a TODO about this.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 350 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It would be great to add logic tests where the argument is of INT2 and INT4 types to make sure this works ok. (I think the unit test in
colexecwindow
package would crash, but that's probably ok.)
Hm, it crashes. I guess this means I'll have to template out the go type?
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 356 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Hm, this
if
has quite a bit of code duplication. From what I can tell the crux is that we wanti
to serve the same meaning (the index of the row we're looking at) in sel and no-sel cases, yet we have different ways to get it; also in the latter casei
serves the same goal asselIdx
. I think we could use the samefor
loop in both cases likefor selIdx := r.seekIdx; selIdx < n; selIdx++
and introduce a bit more templating to get the index with something like
// {{if .HasSel}} i = sel[selIdx] // {{else}} i = selIdx // {{end}}
(take a look at
vec_to_datum_tmpl.go
for inspiration).This should allows us to remove the code duplication here.
This is a really good idea. I'll push up the changes I've made for your other comments now, and then get to work on this (and the overloads for INT2
and INT4
) today or tomorrow.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 376 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: this is just
+ n
, right?
Right, done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 387 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Is this line needed? I would assume given the line above it is not.
When the iterated variable isn't assigned a constant initial value, I think it's necessary. Interestingly, just doing _ = argCol[i]
seems to allow the partitionCol
bounds check to be removed too... weird.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 399 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
ditto
Same sort of deal as the other one. You have to remind the compiler that i
is still in-bounds
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 486 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: a quick comment why we're short-circuiting here would be helpful.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 532 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: you can just return the result of
Close
.
Done.
pkg/sql/colexec/execgen/cmd/execgen/ntile_gen.go, line 1 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit:
s/2019/2021/
.
Done.
pkg/sql/logictest/testdata/logic_test/window, line 1061 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: maybe order the output by
w, v, k
so that it was easier to check the results visually.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushing the comments I have so far, will look later today.
Reviewed 3 of 8 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @yuzefovich)
pkg/sql/colexec/colexecwindow/BUILD.bazel, line 22 at r2 (raw file):
"//pkg/sql/colmem", # keep "//pkg/sql/execinfrapb", # keep "//pkg/sql/pgwire/pgcode", # keep
I think these are no longer needed given that we're importing the error from sem/builtins
.
pkg/sql/colexec/colexecwindow/ntile.eg.go, line 188 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Done. (I think, the comment got cut off.
Yeah, this was about types - I realized I was typing on the generated code and not on the template but forgot to delete this leftover.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 193 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Ah, that first
i
is supposed to beselIdx
. Fixed it.
It'd be great if you could come up with a unit test case when this bug would have been caught.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 204 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
I decided not to do that because the loop exits immediately upon the first non-null anyway, so it probably wouldn't make things much more efficient in the "no nulls" case. Do you still want me to add that TODO?
Makes sense, let's ignore that thought.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 287 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Done.
We should extract the append outside of this NewSpillingQueue
constructor and reuse the deep-copied types below for currentBatch
constructor.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 350 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Hm, it crashes. I guess this means I'll have to template out the go type?
Yeah, we'll need to template the type out - we'll need to generate separate nTileInt64
etc operators which would "encode" which method to use on argVec
to get the slice of integers.
Alternatively, we could plan a cast operator on top of input to the nTile in execplan
since we know the types then. It might be an easier and cleaner choice.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 84 at r2 (raw file):
// // nTileLoading transitions to nTileSeeking unless the end of the input has // been reached, in which the next state is nTileProcessing.
nit: s/in which the/in which case the/
.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 148 at r2 (raw file):
resetCurrentBatch bool // bufferQueue stores batches that are waiting to be fully processed and
nit: I think all batches in the queue belong to the same partition, it'd be great to mention that.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 153 at r2 (raw file):
// nextPartitionIdx is the index of the start of the next partition within the // current batch. If the next partition does not start in the current batch,
The second sentence here indicates that we're using nextPartitionIdx
for two purposes, so I wonder whether it would be better to introduce a special sentinel value (e.g. -1
) to indicate that the next partition doesn't start in the current batch. I imagine this would come with some perf hit, but I think it'd be acceptable if the resulting code is more clear, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some more comments, but I think the code makes sense to me. I'll wait for you to figure out the templating of nTileSeeking
state, but looks great overall.
Reviewed 5 of 8 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @yuzefovich)
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 312 at r2 (raw file):
switch r.state { case nTileLoading: if r.resetCurrentBatch {
Hm, I wonder whether we can remove resetCurrentBatch
boolean entirely and always reset the current batch after enqueueing it into the buffer queue if its length is greater than 0 (if it is zero, then the batch must be in reset state already). (I'm just trying to think how we can reduce the amount of state we're keeping track of.)
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 439 at r2 (raw file):
} if r.processingIdx >= output.Length() { // This batch has already been fully processed.
When can this occur?
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 473 at r2 (raw file):
break } // There are no tuples. This can happen if the first batch returned by the
nit: I think it might be a bit cleaner to detect this scenario in nTileLoading
state and transitioning to nTileFinished
state right away. My reasoning is that nTileProcessing
state is more complicated, so it's better to move edge cases elsewhere.
pkg/sql/logictest/testdata/logic_test/window, line 1061 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Done.
I meant that it might be worth ordering the output on PARTITION BY followed by ORDER BY columns of window functions (but probably not in all cases). I.e. in this case it would be ORDER BY w, v, k - this way it'll be very clear where the partition boundaries are and then what was the internal ordering for window function processing. Feel free to ignore this idea though.
pkg/sql/sem/builtins/window_builtins.go, line 593 at r2 (raw file):
} var ErrInvalidArgumentForNtile = pgerror.Newf(
nit: this will need a comment now to satisfy the linter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @yuzefovich)
pkg/sql/colexec/colexecwindow/BUILD.bazel, line 22 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I think these are no longer needed given that we're importing the error from
sem/builtins
.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 193 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It'd be great if you could come up with a unit test case when this bug would have been caught.
Hm, it doesn't seem like the sel != nil
branch is ever being reached, even for the other window functions. I'll have to investigate this.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 287 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
We should extract the append outside of this
NewSpillingQueue
constructor and reuse the deep-copied types below forcurrentBatch
constructor.
Ah, didn't even notice that use of inputTypes
. Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 350 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Yeah, we'll need to template the type out - we'll need to generate separate
nTileInt64
etc operators which would "encode" which method to use onargVec
to get the slice of integers.Alternatively, we could plan a cast operator on top of input to the nTile in
execplan
since we know the types then. It might be an easier and cleaner choice.
I don't think templating should be too hard in this case since I think argCol := argVec.Int64()
is the only line that needs to change. It is a bit unfortunate to have that much extra generated code for just a single line though... Do you think the performance gains will be significant from templating this / is it worth worrying about the INT2
and INT4
cases that much?
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 84 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit:
s/in which the/in which case the/
.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 148 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I think all batches in the queue belong to the same partition, it'd be great to mention that.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 153 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
The second sentence here indicates that we're using
nextPartitionIdx
for two purposes, so I wonder whether it would be better to introduce a special sentinel value (e.g.-1
) to indicate that the next partition doesn't start in the current batch. I imagine this would come with some perf hit, but I think it'd be acceptable if the resulting code is more clear, wdyt?
Yeah, I could do that if you think it'd be more clear. I do kind of like how it is, since you just have to set nextPartitionIdx
to i
after the conclusion of the loop to get this behavior. It would be pretty easy to use the sentinel value as well, though.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 312 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Hm, I wonder whether we can remove
resetCurrentBatch
boolean entirely and always reset the current batch after enqueueing it into the buffer queue if its length is greater than 0 (if it is zero, then the batch must be in reset state already). (I'm just trying to think how we can reduce the amount of state we're keeping track of.)
The issue is that we don't always want to push to the queue as things are right now. When the argument is null up to the end of a batch, we want to emit the batch immediately, which means we can't reset it until the next call to Next
.
I guess other options are to copy the batch before emitting or to not emit immediately and just increment processingIdx
accordingly. This second option would definitely require this bit of logic you asked about earlier:
if r.processingIdx >= output.Length() {
// This batch has already been fully processed.
r.processingIdx -= output.Length()
return output
}
That's because processingIdx
would then refer to a point beyond the end of the current batch.
I think I'll try and go for this second option. Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 439 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
When can this occur?
If the queue ever split up batches after they were put in, something like this could happen - say, the first row of a batch is a fully processed partition, and the next partition continues from the next row onto the next batch. If the first batch popped from of the queue only contained that first row, r.processing
would refer to an index beyond the length of the dequeued batch.
Can I rely on the queue never doing something like this?
Edit: Since I decided to just push batches that are null up to the end onto the queue, this branch can be hit when one of those batches is dequeued.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 473 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I think it might be a bit cleaner to detect this scenario in
nTileLoading
state and transitioning tonTileFinished
state right away. My reasoning is thatnTileProcessing
state is more complicated, so it's better to move edge cases elsewhere.
Done. I've added an assertion in that place instead.
pkg/sql/logictest/testdata/logic_test/window, line 1061 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I meant that it might be worth ordering the output on PARTITION BY followed by ORDER BY columns of window functions (but probably not in all cases). I.e. in this case it would be ORDER BY w, v, k - this way it'll be very clear where the partition boundaries are and then what was the internal ordering for window function processing. Feel free to ignore this idea though.
Oh, not sure why I read that the way I did. Done.
pkg/sql/sem/builtins/window_builtins.go, line 593 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: this will need a comment now to satisfy the linter.
Done.
1f30b15
to
56ba11d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have the last round of comments, the only concerning is the question in nTileProcessing
state, others seem minor or nits.
Reviewed 6 of 7 files at r3, 4 of 4 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)
pkg/sql/colexec/colbuilder/execplan.go, line 1181 at r4 (raw file):
"failed to cast window function argument to type %v", expectedType)) } typs = typs[:castIdx+1]
nit: these two lines could be replaced with typs = append(typs, expectedType)
which looks more Go-natural to me.
pkg/sql/colexec/colexecwindow/BUILD.bazel, line 52 at r4 (raw file):
"//pkg/sql/execinfra", "//pkg/sql/execinfrapb", "//pkg/sql/sem/builtins",
nit: I think the linter tells us that this line is not necessary.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 193 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Hm, it doesn't seem like the
sel != nil
branch is ever being reached, even for the other window functions. I'll have to investigate this.
Indeed, very nice find!
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 350 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
I don't think templating should be too hard in this case since I think
argCol := argVec.Int64()
is the only line that needs to change. It is a bit unfortunate to have that much extra generated code for just a single line though... Do you think the performance gains will be significant from templating this / is it worth worrying about theINT2
andINT4
cases that much?
I don't think it'll be worth it to template the type out - I imagine that INT2 and INT4 are edge cases, so it's ok if our performance is slightly suboptimal (still, I imagine the vectorized cast + vectorized ntile will be much faster than the row execution, so overall it'll be a nice speedup). Plus the cast operation between integers of different widths (especially when upcasting to larger width) should be very quick so that it could be considered negligible.
So I think the cleanliness of the ntile template and its maintainability is more important in this case.
Update: I see that you went with the cast, LGTM.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 84 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Done.
nit: I think in this line there is one more instance of this.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 153 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Yeah, I could do that if you think it'd be more clear. I do kind of like how it is, since you just have to set
nextPartitionIdx
toi
after the conclusion of the loop to get this behavior. It would be pretty easy to use the sentinel value as well, though.
Great, let's keep it as is.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 80 at r3 (raw file):
// Transition graph for nTileState: // ┌──────────────────┬───────────┐ // │ │TileLoading├────────────────┐
Awesome diagram!
nit: s/TileLoading/nTileLoading/
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 176 at r3 (raw file):
// processingIdx is an index into the tuples buffered in the queue, or into // currentBatch if the queue is empty). It is used to preserve state when the
nit: missing opening parenthesis.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 222 at r4 (raw file):
var _ colexecop.Operator = &_NTILE_STRINGOp{} func (r *_NTILE_STRINGOp) Init(ctx context.Context) {
nit: in order to reduce the amount of generated code, we can define all methods that don't care about the type of nTile operator (i.e. w/ or w/o partition) with nTileBase
receiver. I think only Next
method needs to be operator-specific. However, if you prefer the current way for clarity, feel free to ignore this idea.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 241 at r4 (raw file):
}, ) r.currentBatch = r.allocator.NewMemBatchWithFixedCapacity(outputTypes, coldata.BatchSize())
An idea on how to improve a bit on the initialization is to not allocate a current batch of maximum size originally (this will matter if the input data set is very small). We could refactor the code so that nothing is done to currentBatch
here in Init
and modify nTileLoading
state with something like:
// Load the next batch into currentBatch. If currentBatch still has data,
// move it into the queue.
if r.currentBatch != nil && r.currentBatch.Length() > 0 {
r.bufferQueue.Enqueue(r.Ctx, r.currentBatch)
}
r.currentBatch, _ = r.allocator.ResetMaybeReallocate(r.outputTypes, r.currentBatch, batch.Length(), math.MaxInt64)
This will allow us to allocate the current batch lazily of just the right size. However, this might not matter that much in the grand scheme of things, so feel free to ignore this idea.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 286 at r4 (raw file):
r.currentBatch.SetLength(n) }) if r.currentBatch.Selection() != nil {
It actually makes perfect sense that the current batch never has a selection vector - above, we're copying one vector at a time with deselection step - and we never set the selection on the current batch ourselves, so I think I would actually make sense to not have the assertion (it seems a bit like dead code).
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 379 at r4 (raw file):
// Set all the ntile bucket values that remain unset, then emit this // batch. Note that if a batch is in r.bufferQueue, the end of the // current partition must be located beyond it.
nit: "beyond it" - is it "beyond the batch" or "beyond the buffer queue"?
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 383 at r4 (raw file):
r.processingIdx -= output.Length() if r.processingIdx < 0 { // The first of the tuples to be processed is located in this batch.
I'm a bit confused by this comment - does it mean that the next partition begins in output
batch? Does it mean that still unprocessed tuples from the current partition start in output
batch? Do we need to deep-copy the batch then?
pkg/sql/colexec/colexecwindow/window_functions_util.go, line 17 at r4 (raw file):
"github.com/cockroachdb/cockroach/pkg/sql/execinfrapb" "github.com/cockroachdb/errors" "github.com/cockroachdb/cockroach/pkg/sql/types"
nit: ordering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/colexec/colbuilder/execplan.go, line 1181 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: these two lines could be replaced with
typs = append(typs, expectedType)
which looks more Go-natural to me.
Done.
pkg/sql/colexec/colexecwindow/BUILD.bazel, line 52 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I think the linter tells us that this line is not necessary.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 193 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Indeed, very nice find!
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 350 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I don't think it'll be worth it to template the type out - I imagine that INT2 and INT4 are edge cases, so it's ok if our performance is slightly suboptimal (still, I imagine the vectorized cast + vectorized ntile will be much faster than the row execution, so overall it'll be a nice speedup). Plus the cast operation between integers of different widths (especially when upcasting to larger width) should be very quick so that it could be considered negligible.
So I think the cleanliness of the ntile template and its maintainability is more important in this case.
Update: I see that you went with the cast, LGTM.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 84 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I think in this line there is one more instance of this.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 153 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Great, let's keep it as is.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 80 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Awesome diagram!
nit:
s/TileLoading/nTileLoading/
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 176 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: missing opening parenthesis.
Done. Changed the comment a bit for clarity as well.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 222 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: in order to reduce the amount of generated code, we can define all methods that don't care about the type of nTile operator (i.e. w/ or w/o partition) with
nTileBase
receiver. I think onlyNext
method needs to be operator-specific. However, if you prefer the current way for clarity, feel free to ignore this idea.
I like this idea, I'll do it. Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 241 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
An idea on how to improve a bit on the initialization is to not allocate a current batch of maximum size originally (this will matter if the input data set is very small). We could refactor the code so that nothing is done to
currentBatch
here inInit
and modifynTileLoading
state with something like:// Load the next batch into currentBatch. If currentBatch still has data, // move it into the queue. if r.currentBatch != nil && r.currentBatch.Length() > 0 { r.bufferQueue.Enqueue(r.Ctx, r.currentBatch) } r.currentBatch, _ = r.allocator.ResetMaybeReallocate(r.outputTypes, r.currentBatch, batch.Length(), math.MaxInt64)
This will allow us to allocate the current batch lazily of just the right size. However, this might not matter that much in the grand scheme of things, so feel free to ignore this idea.
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 286 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It actually makes perfect sense that the current batch never has a selection vector - above, we're copying one vector at a time with deselection step - and we never set the selection on the current batch ourselves, so I think I would actually make sense to not have the assertion (it seems a bit like dead code).
Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 379 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: "beyond it" - is it "beyond the batch" or "beyond the buffer queue"?
I've tried to clarify the comment. Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 383 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I'm a bit confused by this comment - does it mean that the next partition begins in
output
batch? Does it mean that still unprocessed tuples from the current partition start inoutput
batch? Do we need to deep-copy the batch then?
Yes to the second one, sort of. I need to make the comment past-tense - the first of the unprocessed tuples was located in this batch, but all tuples are now processed, so no copying necessary.
pkg/sql/colexec/colexecwindow/window_functions_util.go, line 17 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: ordering.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to update the commit message (to remove "WIP"), and it's probably worth including a release note (that the vectorized engine now supports ntile
window function).
Also, please update the PR description and PR title accordingly. It also would be good to link #37035 in the PR description too.
Reviewed 6 of 6 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 379 at r4 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
I've tried to clarify the comment. Done.
Makes sense, thanks!
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 100 at r5 (raw file):
const ( // nTileLoading is the state in which case the ntile operators load an
nit: sorry, I think my previous nit comments might have been a bit confusing. I think it makes sense that we're saying "the state in which the ntile operators ..." (in the first paragraphs of the state comments) and something like "unless the end of the input has been reached, in which case the next state ..." (in the second paragraphs of the state comments). Currently, nTileLoading is the state in which case the ntile operators load
sounds a bit weird to me. But English is not my native language, feel free to ignore.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 132 at r5 (raw file):
) // nTileBase extracts common initializations of two variations of nTile
nit: s/nTileBase/nTileInitFields/
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 100 at r5 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: sorry, I think my previous nit comments might have been a bit confusing. I think it makes sense that we're saying "the state in which the ntile operators ..." (in the first paragraphs of the state comments) and something like "unless the end of the input has been reached, in which case the next state ..." (in the second paragraphs of the state comments). Currently,
nTileLoading is the state in which case the ntile operators load
sounds a bit weird to me. But English is not my native language, feel free to ignore.
Ah, I see what you mean. Done.
pkg/sql/colexec/colexecwindow/ntile_tmpl.go, line 132 at r5 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit:
s/nTileBase/nTileInitFields/
.
Done.
This patch implements the `ntile` window function in the vectorized engine. `ntile` takes in an integer argument `num_buckets` and then distributes all rows in a partition equally between the buckets, outputting the bucket number for each row. In the vectorized implementation, batches are buffered until the end of a partition is reached, at which point the `ntile` bucket values can be calculated. The batches are emitted in a streaming fashion; as soon as a batch is fully processed, it is returned and work is paused until the next call to `Next()`. See cockroachdb#37035 Release note (sql change): the vectorized engine now supports the ntile window function.
Thanks for all the reviewing Yahor! |
bors r+ |
Build succeeded: |
This patch implements the
ntile
window function in the vectorizedengine.
ntile
takes in an integer argumentnum_buckets
and thendistributes all rows in a partition equally between the buckets,
outputting the bucket number for each row.
In the vectorized implementation, batches are buffered until the end
of a partition is reached, at which point the
ntile
bucket values canbe calculated. The batches are emitted in a streaming fashion; as soon
as a batch is fully processed, it is returned and work is paused until
the next call to
Next()
.See #37035
Release note (sql change): the vectorized engine now supports the ntile window function.