Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support set-returning functions within other render expressions #14369

Merged
merged 1 commit into from
Apr 25, 2017

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Mar 24, 2017

In 0903a36 @jordanlewis added support for set-returning functions (SRFs) within a render expression, like:

SELECT generate_series(1, 3);

This commit adds support for detecting SRFs in more complicated render expressions, like:

SELECT 1 + generate_series(1, 3);

Expressions that involve multiple SRFs, like

SELECT generate_series(generate_series(1, 3), 3);

are still unsupported. (Yes, that really is a query which PostgreSQL supports.) Such queries require lateral correlated subqueries, which are not yet supported, but it should be easy to adapt the infrastructure in this commit when lateral correlated subqueries land.

Fixes #13146.


This change is Reviewable

@jordanlewis
Copy link
Member

Thanks for taking on this issue! At first glance I'm not sure what happened with the tests, but I gave this a pass of style comments until you figure out what's going on.


Review status: 0 of 5 files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


pkg/sql/render.go, line 343 at r1 (raw file):

		// move it up to the sources list as a cross join and add a render for the
		// function's column in the join.
		var err error

nit: i think you can do without this extra var err error by using := below, right?


pkg/sql/render.go, line 372 at r1 (raw file):

// ivarHelper. The extracted SRF is retained in the srf field.
//
// When lateral correlated subqueries are supported, this visitor should be

I would rephrase this comment, since it's not clear at what point if ever we'll be working on lateral correlated subqueries. Maybe something like "This visitor is intentionally limited to extracting a single SRF, because we don't support lateral correlated subqueries."


pkg/sql/parser/generator_builtins.go, line 100 at r1 (raw file):

}

// Generators is a map from name to description for all built-in generators.

s/description/slice of Builtins/


pkg/sql/parser/indexed_vars.go, line 147 at r1 (raw file):

func (h *IndexedVarHelper) AppendSlot() int {
	h.vars = append(h.vars, IndexedVar{})
	return h.NumVars() - 1

I think it might make more sense for reader comprehension and anti-rot to just use len(h.vars) here - NumVars is the external API of this struct, so any kind of indirection that we might want to introduce later in NumVars, such as only exposing some of the vars or something like that, would break this.


Comments from Reviewable

@benesch
Copy link
Contributor Author

benesch commented Mar 26, 2017

Thanks for the review! Turns out you can't call Resolve on ResolvableFunctionReferences before generating the column name (or the column name will be the normalized function name instead of the original function name the user provided). The latest rev has a potential fix for this issue.


Review status: 0 of 10 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending.


pkg/sql/render.go, line 343 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

nit: i think you can do without this extra var err error by using := below, right?

The linter complains about target, err := because it shadows the earlier definition of target, so I changed the line to just overwrite target instead. I've updated this to use a newTarget variable to be more explicit.


pkg/sql/render.go, line 372 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I would rephrase this comment, since it's not clear at what point if ever we'll be working on lateral correlated subqueries. Maybe something like "This visitor is intentionally limited to extracting a single SRF, because we don't support lateral correlated subqueries."

Done.


pkg/sql/parser/generator_builtins.go, line 100 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

s/description/slice of Builtins/

Done.


pkg/sql/parser/indexed_vars.go, line 147 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I think it might make more sense for reader comprehension and anti-rot to just use len(h.vars) here - NumVars is the external API of this struct, so any kind of indirection that we might want to introduce later in NumVars, such as only exposing some of the vars or something like that, would break this.

Done.


Comments from Reviewable

@benesch benesch force-pushed the srf-expressions branch 3 times, most recently from d8c38a1 to 5e461e7 Compare March 26, 2017 20:12
@knz
Copy link
Contributor

knz commented Mar 28, 2017

I would recommend here and in future PRs that make a significant contribution to our SQL semantics that you outline in the commit message how this works.


Reviewed 2 of 5 files at r1, 8 of 8 files at r2.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


pkg/sql/render.go, line 456 at r2 (raw file):

	}
	s.source = src
	s.sourceInfo = multiSourceInfo{s.source.info}

I saw that you removed the (previously incorrect) explicit reset of the ivarHelper. I still would like an explanatory comment here: "although the source was modified and now supports more columns, we do not change the ivarHelper here because any subsequent column references should be checked against the original column list, not the column list extended with the new SRF."

That said, I think your patch still contains an error as-is. Right now the source name of a SRF will be the name of the function. Consider the following statement:

SELECT 1 + generate_series(1, 3) FROM t WHERE generate_series > 3

This should fail with error: column generate_series does not exist
However with your patch I believe that it will (erroneously) succeed.

Can you check this (we can also discuss the issues around that.)


Comments from Reviewable

@benesch
Copy link
Contributor Author

benesch commented Apr 24, 2017

Done.


Review status: 2 of 11 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


pkg/sql/render.go, line 456 at r2 (raw file):

Previously, knz (kena) wrote…

I saw that you removed the (previously incorrect) explicit reset of the ivarHelper. I still would like an explanatory comment here: "although the source was modified and now supports more columns, we do not change the ivarHelper here because any subsequent column references should be checked against the original column list, not the column list extended with the new SRF."

That said, I think your patch still contains an error as-is. Right now the source name of a SRF will be the name of the function. Consider the following statement:

SELECT 1 + generate_series(1, 3) FROM t WHERE generate_series > 3

This should fail with error: column generate_series does not exist
However with your patch I believe that it will (erroneously) succeed.

Can you check this (we can also discuss the issues around that.)

I'm a bit confused, @knz, since I added a mutating AppendSlot method to IndexedVarHelper. So we do change the ivarHelper. I don't think that's problematic, though: there's no way for column references to refer to this new column because it doesn't have a name that's visible to SQL. This also means the query you proposed properly returns an error; I've added it to the logic tests below.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Apr 24, 2017

:lgtm: but you need to change the all the relevant tests to account for the new normalization of function names...


Reviewed 9 of 9 files at r3.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


pkg/sql/render.go, line 456 at r2 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

I'm a bit confused, @knz, since I added a mutating AppendSlot method to IndexedVarHelper. So we do change the ivarHelper. I don't think that's problematic, though: there's no way for column references to refer to this new column because it doesn't have a name that's visible to SQL. This also means the query you proposed properly returns an error; I've added it to the logic tests below.

Thanks for pointing it out; I had overlooked what AppendSlot really does. Yes, this works.


Comments from Reviewable

@benesch
Copy link
Contributor Author

benesch commented Apr 24, 2017

Oops, no, I didn't mean to impact column names anywhere else. Can you sanity check the last update? I've reproduced the diff against the rev you last reviewed below, since Reviewable seems confused.

diff --git a/pkg/sql/render.go b/pkg/sql/render.go
index e08e9cea4..17d55e5a5 100644
--- a/pkg/sql/render.go
+++ b/pkg/sql/render.go
@@ -505,6 +505,14 @@ func getRenderColName(searchPath parser.SearchPath, target parser.SelectExpr) (s
 	if err := target.NormalizeTopLevelVarName(); err != nil {
 		return "", err
 	}
+
+	// If target.Expr is a funcExpr, resolving the function within will normalize
+	// target.Expr's string representation. We want the output column name to be
+	// unnormalized, so we compute target.Expr's string representation now, even
+	// though we may choose to return something other than exprStr in the switch
+	// below.
+	exprStr := target.Expr.String()
+
 	switch t := target.Expr.(type) {
 	case *parser.ColumnItem:
 		// We only shorten the name of the result column to become the
@@ -525,7 +533,8 @@ func getRenderColName(searchPath parser.SearchPath, target parser.SelectExpr) (s
 			return fd.Name, nil
 		}
 	}
-	return target.Expr.String(), nil
+
+	return exprStr, nil
 }
 
 // appendRenderColumn adds a new render expression at the end of the current list.

In 0903a36 @jordanlewis added support for set-returning functions
(SRFs) within a render expression, like:

    SELECT generate_series(1, 3);

This commit adds support for detecting SRFs in more complicated render
expressions, like:

    SELECT 1 + generate_series(1, 3);

Expressions that involve multiple SRFs, like

    SELECT generate_series(generate_series(1, 3), 3);

are still unsupported. (Yes, that really is a query which PostgreSQL
supports.) Such queries require lateral correlated subqueries, which are
not yet supported, but it should be easy to adapt the infrastructure in
this commit when lateral correlated subqueries land.

The approach is as follows: for each SRF that appears in a target
expression, hoist the SRF into a CROSS JOIN and replace the SRF with a
reference to the cross-joined table. For example, the query

    SELECT 1 + generate_series(1, 3) FROM generate_series(1, 2);

is rewritten to

    SELECT 1 + t1.c1 FROM generate_series(1, 2) CROSS JOIN generate_series(1, 3) AS t1 (c1);

except the rewritten table does not need to be explicitly named `t1.c1`,
since we can refer to it internally using an IndexedVar with the
appropriate index.

Fixes cockroachdb#13146.
@benesch benesch merged commit 922f206 into cockroachdb:master Apr 25, 2017
@benesch benesch deleted the srf-expressions branch April 25, 2017 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants