Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colexec: add support for percent_rank and cume_dist window functions #42137

Merged
merged 4 commits into from
Mar 3, 2020

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Nov 3, 2019

colexec: add window peer grouper

This commit adds an operator that populates a boolean column to signify
whether the corresponding tuple is the start of a new peer group. Peers
are such tuples that belong to the same partition and are equal on the
ordering columns. Some window functions must return the same output for
all peers in the peer group. Currently this operator is not used.

Release note: None

colexec: minor refactor of rank and denseRank operators

Previously, rank and denseRank operators contained the logic to figure
out the boundaries of the peer groups, but now that we have a window
peer grouper, it is no longer necessary which simplified the code a
little bit.

Release note: None

colexec: add support for PERCENT_RANK window function

This commit adds the support for PERCENT_RANK window function. This
function differs from two other rank variances in that it needs to know
the number of tuples in the partition. If there is no PARTITION BY
clause, then we have no other choice but to buffer the input fully. If
PARTITION BY clause is present, we need to buffer all tuples that belong
to each partition before we can populate the output. However, for
simplicity, the current implementation of the operator with PARTITION BY
clause also fully buffers the whole input before emitting any output.

This commit also adds a couple of "vec-on" configs to 'window' logic
test. This will increase the test coverage of window functions supported
by the vectorized engine.

Addresses: #37035.

Release note: None

colexec: add support for cume_dist window function

This commit adds support for CUME_DIST window function. This function is
quite similar to PERCENT_RANK, so it reuses the same template.
"percent_rank" things have been renamed to "relative_rank" things.

This commit also enables running of row_number, rank, and
dense_rank with vectorize=auto. The reason is that these window
functions are streaming and internally they might a sorter which can
fallback to disk if necessary.

Addresses: #37035.

Release note: None

@yuzefovich yuzefovich requested review from jordanlewis, asubiotto and a team November 3, 2019 00:39
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the vec-windows branch 3 times, most recently from c34e65d to 31ebca3 Compare February 28, 2020 18:26
@yuzefovich yuzefovich changed the title colexec: add support for PERCENT_RANK window function colexec: add support for percent_rank and cume_dist window functions Feb 28, 2020
@yuzefovich
Copy link
Member Author

Added support for cume_dist window function, it seemed easy enough.

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits.

Reviewed 6 of 6 files at r1, 2 of 2 files at r2, 10 of 10 files at r3, 10 of 10 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @yuzefovich)


pkg/sql/colexec/rank_tmpl.go, line 52 at r3 (raw file):

	if len(orderingCols) == 0 {
		if windowFn == execinfrapb.WindowerSpec_PERCENT_RANK {
			return NewConstOp(allocator, input, coltypes.Float64, float64(0), outputColIdx)

Out of curiosity, could you explain why we return a const op that returns a zero float?


pkg/sql/colexec/window_functions_util.go, line 17 at r1 (raw file):

const columnOmitted = -1

var windowFnNeedsPeersInfo map[execinfrapb.WindowerSpec_WindowFunc]bool

Could we add a comment to this? I personally have very little knowledge of window functions so some explanations are usually helpful.


pkg/sql/colexec/window_functions_util.go, line 24 at r4 (raw file):

	// WindowFnOutputType is a mapping from a window function to its output type.
	// Only window functions supported by the vectorized engine are present.
	WindowFnOutputType map[execinfrapb.WindowerSpec_WindowFunc]*types.T

What if instead of doing this you created a function called GetWindowFnInfo() that returns both the peer info an output type according to a switch statement? I think it would be easier to read. Maybe this should be shared across col and rowexec?


pkg/sql/colexec/window_peer_grouper_tmpl.go, line 33 at r1 (raw file):

// outputColIdx'th column (which is appended if needed) for every tuple that is
// the first within its peer group. Peers are tuples that belong to the same
// partition and are equal on the ordering columns. If ordCols is empty, than

nit s/than/then. I would also rename ordCols to orderingCols since ord makes me think of ordinal rather than ordering.


pkg/sql/colexec/window_peer_grouper_tmpl.go, line 34 at r1 (raw file):

// the first within its peer group. Peers are tuples that belong to the same
// partition and are equal on the ordering columns. If ordCols is empty, than
// all tuples within the partition are peers.

Window functions are confusing.


pkg/sql/distsql/columnar_operators_test.go, line 846 at r3 (raw file):

}

func TestWindowFunctionsAgainstProcessor(t *testing.T) {

You probably already did this but I want to make sure we stress this test for a bit before merging.


pkg/sql/distsql/columnar_operators_test.go, line 879 at r3 (raw file):

						continue
					}
					inputTypes := typs[:nCols:nCols]

Why the extra nCols?


pkg/sql/logictest/testdata/logic_test/window, line 1 at r3 (raw file):

# LogicTest: local local-vec fakedist fakedist-metadata fakedist-vec 5node-local 5node-dist 5node-dist-metadata 5node-dist-disk

Super cool!

@yuzefovich yuzefovich force-pushed the vec-windows branch 3 times, most recently from f896534 to edd730f Compare March 2, 2020 18:33
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @jordanlewis)


pkg/sql/colexec/rank_tmpl.go, line 52 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Out of curiosity, could you explain why we return a const op that returns a zero float?

When there are no columns in ORDER BY clause of window functions, then all rows are considered "peers". Percent rank is calculated as (rank - 1) / (total rows - 1), and in such scenario all rows' rank = 0, so percentRank will also be 0 for all rows.

yuzefovich=# create table temp (a int);
CREATE TABLE
yuzefovich=# insert into temp values (0), (1), (2);
INSERT 0 3
yuzefovich=# select a, percent_rank() over () from temp;
 a | percent_rank 
---+--------------
 0 |            0
 1 |            0
 2 |            0
(3 rows)

pkg/sql/colexec/window_functions_util.go, line 17 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Could we add a comment to this? I personally have very little knowledge of window functions so some explanations are usually helpful.

Added a comment.


pkg/sql/colexec/window_functions_util.go, line 24 at r4 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

What if instead of doing this you created a function called GetWindowFnInfo() that returns both the peer info an output type according to a switch statement? I think it would be easier to read. Maybe this should be shared across col and rowexec?

We actually already have such function, I moved it from rowexec intoexecinfrapb package. Done.


pkg/sql/colexec/window_peer_grouper_tmpl.go, line 33 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

nit s/than/then. I would also rename ordCols to orderingCols since ord makes me think of ordinal rather than ordering.

Done.


pkg/sql/colexec/window_peer_grouper_tmpl.go, line 34 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Window functions are confusing.

Indeed :)


pkg/sql/distsql/columnar_operators_test.go, line 846 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

You probably already did this but I want to make sure we stress this test for a bit before merging.

Yeah, I did 100 runs. Initially, this test confirmed the bug I realized I had in my original implementation of percentRank when we have more than a single batch of data. (That's why I bumped the number of rows here.)


pkg/sql/distsql/columnar_operators_test.go, line 879 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Why the extra nCols?

This is a very subtle issue that I spent a few minutes trying to debug. Here is what happens:

  • at the top of the test we instantiate typs up to maxCols
  • then, depending on the test spec, we might actually have less columns, so we slice it up, i.e. inputTypes := typs[:nCols]
  • later, we're appending to inputTypes the return type of the window function, and here is the catch - if we don't specify the capacity of inputTypes, then no new array is allocated and we, instead, overwrite one of the typs with returnType, but we assume that typs are immutable.

pkg/sql/logictest/testdata/logic_test/window, line 1 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Super cool!

Ack.

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 11 of 11 files at r5, 2 of 2 files at r6, 13 of 13 files at r7, 10 of 11 files at r8.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @jordanlewis, and @yuzefovich)


pkg/sql/colexec/window_functions_util.go, line 24 at r4 (raw file):

Previously, yuzefovich wrote…

We actually already have such function, I moved it from rowexec intoexecinfrapb package. Done.

Maybe we still want to do something like this for the peer info?


pkg/sql/colexec/window_functions_util.go, line 28 at r8 (raw file):

	// SupportedWindowFns contains all window functions supported by the
	// vectorized engine.
	SupportedWindowFns = []execinfrapb.WindowerSpec_WindowFunc{

I would make this a map[wfn]struct{}

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto and @jordanlewis)


pkg/sql/colexec/window_functions_util.go, line 24 at r4 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Maybe we still want to do something like this for the peer info?

rowexec.Windower doesn't care about this info (it has a separate PeerGroupIndicesHelper utility struct). In colexec we have slightly different model: we plan different operators on "top" each other, so we need to know this info during planning, in rowexec that info is "assessed" dynamically.

I also think (but not sure) that row_number might be the only window function that doesn't pay attention to "peers". If so, maybe we'll remove this map later.


pkg/sql/colexec/window_functions_util.go, line 28 at r8 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I would make this a map[wfn]struct{}

Done.

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r9.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis and @yuzefovich)


pkg/sql/colexec/window_functions_util.go, line 24 at r4 (raw file):

Previously, yuzefovich wrote…

rowexec.Windower doesn't care about this info (it has a separate PeerGroupIndicesHelper utility struct). In colexec we have slightly different model: we plan different operators on "top" each other, so we need to know this info during planning, in rowexec that info is "assessed" dynamically.

I also think (but not sure) that row_number might be the only window function that doesn't pay attention to "peers". If so, maybe we'll remove this map later.

I guess I meant just a private function used by colexec instead of a map. But up to you.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto and @jordanlewis)


pkg/sql/colexec/window_functions_util.go, line 24 at r4 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I guess I meant just a private function used by colexec instead of a map. But up to you.

Oh, I see, I guess a function does look better. Done.

This commit adds an operator that populates a boolean column to signify
whether the corresponding tuple is the start of a new peer group. Peers
are such tuples that belong to the same partition and are equal on the
ordering columns. Some window functions must return the same output for
all peers in the peer group. Currently this operator is not used.

Release note: None
Previously, rank and denseRank operators contained the logic to figure
out the boundaries of the peer groups, but now that we have a window
peer grouper, it is no longer necessary which simplified the code a
little bit.

Release note: None
This commit adds the support for PERCENT_RANK window function. This
function differs from two other rank variances in that it needs to know
the number of tuples in the partition. If there is no PARTITION BY
clause, then we have no other choice but to buffer the input fully. If
PARTITION BY clause is present, we need to buffer all tuples that belong
to each partition before we can populate the output. However, for
simplicity, the current implementation of the operator with PARTITION BY
clause also fully buffers the whole input before emitting any output.

This commit also adds a couple of "vec-on" configs to 'window' logic
test. This will increase the test coverage of window functions supported
by the vectorized engine.

Release note: None
@yuzefovich
Copy link
Member Author

I amended the last commit to plan three streaming window functions with auto setting as well. If the build is green, I'll go ahead and merge it.

This commit adds support for CUME_DIST window function. This function is
quite similar to PERCENT_RANK, so it reuses the same template.
"percent_rank" things have been renamed to "relative_rank" things.

This commit also enables running of `row_number`, `rank`, and
`dense_rank` with `vectorize=auto`. The reason is that these window
functions are streaming and internally they might a sorter which can
fallback to disk if necessary.

Release note: None
@yuzefovich
Copy link
Member Author

TFTR!

bors r+

@yuzefovich
Copy link
Member Author

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 3, 2020

Build succeeded

@craig craig bot merged commit c898b2d into cockroachdb:master Mar 3, 2020
@yuzefovich yuzefovich deleted the vec-windows branch March 3, 2020 02:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants