Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support * in udf bodies #95710

Merged
merged 1 commit into from
Feb 3, 2023
Merged

Conversation

rharding6373
Copy link
Collaborator

@rharding6373 rharding6373 commented Jan 24, 2023

This change allows * usage in UDF bodies. We rewrite UDF ASTs in place
to expand *s into the columns they reference.

Informs: #90080

Epic: CRDB-19496
Release note (sql change): Allow * expressions in UDFs.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rharding6373
Copy link
Collaborator Author

There are a handful of UDF tests I need to clean up, but I think it's ready for a first round of review.

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 14 of 15 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rharding6373)


pkg/sql/opt/optbuilder/builder.go line 148 at r2 (raw file):

	// If set, the data source names in the AST are rewritten to the table ID and
	// all the visible column IDs.
	useIDsInAST bool

[nit] Could this be replaced by insideFuncDef?


pkg/sql/opt/optbuilder/select.go line 112 at r2 (raw file):

		ds, depName, resName := b.resolveDataSource(tn, privilege.SELECT)
		if b.useIDsInAST {

Why replace the data source instead of the *?


pkg/sql/opt/optbuilder/util.go line 65 at r2 (raw file):

) (aliases []string, exprs []tree.TypedExpr) {
	if b.insideViewDef {
		panic(unimplemented.NewWithIssue(10028, "views do not currently support * expressions"))

Can we also support * in views the same way?


pkg/sql/logictest/testdata/logic_test/udf line 127 at r2 (raw file):

  LANGUAGE SQL
  AS $$
  SELECT a FROM [113(1, 2, 3) AS t];

For outputs like this one, are we sure we want to show the rewritten version instead of the original?


pkg/sql/logictest/testdata/logic_test/udf_star line 95 at r2 (raw file):


statement ok
ALTER TABLE t_twocol ADD COLUMN c INT DEFAULT 5;

What happens if you change the column type (both to something compatible and something not compatible)? Or the column name?

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rharding6373)


pkg/sql/logictest/testdata/logic_test/udf_star line 95 at r2 (raw file):

Or the column name?

Oops, didn't notice you had this case already.

Also, could you add one where f_allcolsel is called after the table is altered? Postgres actually seems ok with that case:

postgres=# create table t (a int);
CREATE TABLE
postgres=# create function f() returns t language sql as 'select * from t';
CREATE FUNCTION
postgres=# alter table t add column b int;
ALTER TABLE
postgres=# select f();
 f
---

(1 row)

postgres=# insert into t values(1);
INSERT 0 1
postgres=# select f();
  f
------
 (1,)
(1 row)

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rharding6373)


pkg/sql/logictest/testdata/logic_test/udf_star line 95 at r2 (raw file):

Postgres actually seems ok with that case

Nevermind, I didn't realize we wanted to default to early binding.

Copy link
Collaborator Author

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @mgartner)


pkg/sql/opt/optbuilder/builder.go line 148 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Could this be replaced by insideFuncDef?

Done.


pkg/sql/opt/optbuilder/select.go line 112 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Why replace the data source instead of the *?

Using table and column IDs should make handling schema changes like column and table renaming easier, since the underlying IDs don't change. Otherwise when we rename something we'll need to rewrite the UDF body, I think.


pkg/sql/opt/optbuilder/util.go line 65 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Can we also support * in views the same way?

Maybe, thanks for pointing that out. Let me look into that after this PR goes out. It looks like there's a long history about views and schema changes.


pkg/sql/logictest/testdata/logic_test/udf line 127 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

For outputs like this one, are we sure we want to show the rewritten version instead of the original?

I discussed this with @mgartner offline, and we decided that ultimately we want to support both pretty printing with the table names and printing the rewritten query (like postgres does with \df+ and pg_catalog.pg_proc's prosqlbody, respectively). But now that I'm looking at this again, we are using pg_catalog.pg_proc here, though we seem to be using prosrc and prosqlbody differently than postgres given our different binding treatment for the same syntax... I will solicit opinions in the #udf channel about where we expect pretty versions of the udf vs not.

@mgartner is probably right that we need to reconstruct a pretty version of the body as opposed to the original source. This is what postgres appears to do for early binding UDFs, since altering the schema results in changes to what is output by \df+.


pkg/sql/logictest/testdata/logic_test/udf_star line 95 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Postgres actually seems ok with that case

Nevermind, I didn't realize we wanted to default to early binding.

My understanding is that we only support early binding right now. I added a comment to clarify that this behavior is only expected for early binding. Added more ALTER examples in the test, which work as expected.

@rharding6373 rharding6373 requested a review from a team as a code owner January 25, 2023 00:01
@rharding6373 rharding6373 requested review from a team and dt and removed request for a team January 25, 2023 00:01
Copy link
Collaborator Author

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @dt, and @mgartner)


pkg/sql/logictest/testdata/logic_test/udf line 127 at r2 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

I discussed this with @mgartner offline, and we decided that ultimately we want to support both pretty printing with the table names and printing the rewritten query (like postgres does with \df+ and pg_catalog.pg_proc's prosqlbody, respectively). But now that I'm looking at this again, we are using pg_catalog.pg_proc here, though we seem to be using prosrc and prosqlbody differently than postgres given our different binding treatment for the same syntax... I will solicit opinions in the #udf channel about where we expect pretty versions of the udf vs not.

@mgartner is probably right that we need to reconstruct a pretty version of the body as opposed to the original source. This is what postgres appears to do for early binding UDFs, since altering the schema results in changes to what is output by \df+.

Actually, using the rewritten version here may break backups. Might need to reinterpret the udf body here.

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this :lgtm: sans the possible backport issue.

Reviewed 1 of 1 files at r2, 9 of 9 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt, @mgartner, and @rharding6373)


pkg/sql/opt/optbuilder/select.go line 112 at r2 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

Using table and column IDs should make handling schema changes like column and table renaming easier, since the underlying IDs don't change. Otherwise when we rename something we'll need to rewrite the UDF body, I think.

Makes sense, it looks like it would be very invasive to make that change on second glance.


pkg/sql/opt/optbuilder/select.go line 46 at r3 (raw file):

// See Builder.buildStmt for a description of the remaining input and
// return values.
func (b *Builder) buildDataSource(

[nit] I think we should either expand the comment here to mention that it can mutate texpr, or maybe change its contract to return the replacement. Do you know if we do this sort of thing anywhere else in the optbuilder?


pkg/sql/logictest/testdata/logic_test/udf_star line 105 at r3 (raw file):

# ok for late binding.
query T
SELECT f_unqualified_twocol()

I wonder if we should consider this a valid departure from postgres like defaulting to early binding - it seems reasonable to also early-bind the return type. Do you know what happens if a UDT is used as the return type and later altered?

Copy link
Contributor

@chengxiong-ruan chengxiong-ruan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on this! I learned a lot here. But as we chatted through slack that we'd prefer expanding * into column names instead of rewriting the data source. Not sure how much more work that would be to do the expansion, but it would save a lot of work on deserialization for things like SHOW CREATE FUNCTION.

@rharding6373 rharding6373 force-pushed the 20230110_udf branch 3 times, most recently from 60f6999 to bee5bf0 Compare January 26, 2023 20:25
Copy link
Collaborator Author

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reimplemented to modify the AST in place to do * expansion instead of datasource replacement. RFAL, thanks!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @dt, and @mgartner)


pkg/sql/opt/optbuilder/select.go line 112 at r2 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Makes sense, it looks like it would be very invasive to make that change on second glance.

Reimplemented to expand the *, PTAL


pkg/sql/opt/optbuilder/select.go line 46 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] I think we should either expand the comment here to mention that it can mutate texpr, or maybe change its contract to return the replacement. Do you know if we do this sort of thing anywhere else in the optbuilder?

We do some minor tweaks to the AST, such as fully qualifying table names, but afaict we don't replace expressions often. I've added a comment to analyzeSelectList, which is where the expansion happens in the reimplementation.


pkg/sql/opt/optbuilder/util.go line 65 at r2 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

Maybe, thanks for pointing that out. Let me look into that after this PR goes out. It looks like there's a long history about views and schema changes.

With the reimplementation it's not really relevant anymore, but the work slated to #83233 is more closely aligned with that goal, it seems.


pkg/sql/logictest/testdata/logic_test/udf line 127 at r2 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

Actually, using the rewritten version here may break backups. Might need to reinterpret the udf body here.

No longer relevant with the rewrite


pkg/sql/logictest/testdata/logic_test/udf_star line 105 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

I wonder if we should consider this a valid departure from postgres like defaulting to early binding - it seems reasonable to also early-bind the return type. Do you know what happens if a UDT is used as the return type and later altered?

Renaming the return type is fine, but adding attributes results in a return type mismatch error in postgres:

rharding=# create type typ as (a int);
CREATE TYPE
rharding=# create function f() returns typ language sql begin atomic; select 1; end;
CREATE FUNCTION
rharding=# select f();
  f  
-----
 (1)
(1 row)
rharding=# alter type typ rename to rtyp;
ALTER TYPE
rharding=# select f();
  f  
-----
 (1)
(1 row)
rharding=# alter type rtyp add attribute b int;
ALTER TYPE
rharding=# select f();
ERROR:  return type mismatch in function declared to return rtyp
DETAIL:  Final statement returns too few columns.
CONTEXT:  SQL function "f" during inlining

Copy link
Contributor

@chengxiong-ruan chengxiong-ruan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

ALTER TABLE t_twocol RENAME TO t_twocol_prime;

# Dropping a column a UDF depends on is not allowed.
statement error pq: cannot drop column "b" because function "f_unqualified_twocol" depends on it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add test cases with DROP CASCADE ? like the ones in test TestDropCascadeRemoveFunction. Just want to make sure the dependencies are handled well (I'm sure it does).

@@ -42,7 +42,7 @@ create-function
├── CREATE FUNCTION f()
│ RETURNS workday
│ LANGUAGE SQL
│ AS $$SELECT w FROM t.public.workdays;$$
│ AS $$SELECT w FROM [55(1) AS workdays];$$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these test changes need to be reverted?

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 15 files at r1, 23 of 23 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @dt, and @rharding6373)


pkg/sql/opt/optbuilder/project.go line 153 at r4 (raw file):

						expanded = true
						for _, expr := range exprs {
							col := expr.(*scopeColumn)

I don't think the exprs are guaranteed to be *scopeColumns:

exprs[i] = tree.NewTypedColumnAccessExpr(texpr, tree.Name(colName), i)

Is there a way to exercise that case in a UDF?


pkg/sql/logictest/testdata/logic_test/udf_star line 18 at r4 (raw file):

CREATE FUNCTION f_subquery() RETURNS INT AS
$$
  SELECT * FROM (SELECT a FROM (SELECT * FROM t_onecol) AS foo) AS bar;

Can you add a test like this without the subquery aliases foo and bar? If we can't support it because we the column names can't be fully qualified, then we can return a user error if there are unaliased subqueries in a UDF.


pkg/sql/opt/optbuilder/testdata/udf line 1181 at r4 (raw file):


exec-ddl
CREATE FUNCTION fn_star() RETURNS INT LANGUAGE SQL AS 'SELECT * FROM tstar'

Might be nice to have another, more complicated test case here, like with a subquery where the column qualification is important. It'll make sure that testcat respects the rewriting correctly, too.


pkg/ccl/backupccl/backup_test.go line 10630 at r4 (raw file):

$$;
`)

nit: revert this change

@mgartner
Copy link
Collaborator

This looks great! Nice work! I just left a couple minor comments.

@mgartner
Copy link
Collaborator

pkg/sql/logictest/testdata/logic_test/udf_star line 61 at r4 (raw file):

query TTT
SELECT oid, proname, prosrc
FROM pg_catalog.pg_proc WHERE proname LIKE 'f\_%' ORDER BY oid;

It might be good to add some tests with SHOW CREATE FUNCTION here too.

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 23 of 23 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @mgartner, and @rharding6373)


pkg/sql/opt/optbuilder/project.go line 129 at r4 (raw file):

) {
	var expansions tree.SelectExprs
	expanded := false

[nit] Since we're filling out expansions unconditionally when b.insideFuncDef, checking expanded doesn't seem useful.


pkg/sql/opt/optbuilder/select.go line 47 at r4 (raw file):

// return values.
func (b *Builder) buildDataSource(
	texpr *tree.TableExpr, indexFlags *tree.IndexFlags, locking lockingSpec, inScope *scope,

[nit] Is the change here still necessary?

Copy link
Collaborator Author

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @chengxiong-ruan, @DrewKimball, @dt, and @mgartner)


pkg/sql/opt/optbuilder/project.go line 129 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Since we're filling out expansions unconditionally when b.insideFuncDef, checking expanded doesn't seem useful.

Done.


pkg/sql/opt/optbuilder/project.go line 153 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I don't think the exprs are guaranteed to be *scopeColumns:

exprs[i] = tree.NewTypedColumnAccessExpr(texpr, tree.Name(colName), i)

Is there a way to exercise that case in a UDF?

Good catch. Fixed this and added a test case.


pkg/sql/opt/optbuilder/select.go line 47 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Is the change here still necessary?

Done. Thanks for the catch!


pkg/sql/logictest/testdata/logic_test/udf_star line 18 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Can you add a test like this without the subquery aliases foo and bar? If we can't support it because we the column names can't be fully qualified, then we can return a user error if there are unaliased subqueries in a UDF.

Doesn't seem like we do this correctly outside of UDFs, but maybe this is intentional. Thoughts?

In postgres:

rharding=# create table t (a int);
CREATE TABLE
rharding=# select * from (select a from (select * from t));
ERROR:  subquery in FROM must have an alias
LINE 1: select * from (select a from (select * from t));
                                     ^
HINT:  For example, FROM (SELECT ...) [AS] foo.

In CRDB we don't error:

root@127.0.0.1:26257/defaultdb> create table t (a int);                                               
CREATE TABLE

root@127.0.0.1:26257/defaultdb> select * from (select a from (select * from t));                      
  a
-----
(0 rows)
root@127.0.0.1:26257/defaultdb> explain (opt, verbose) select * from (select a from (select * from t));                                                                  
             info
-------------------------------
  scan t
   ├── columns: a:1
   ├── stats: [rows=1]
   ├── cost: 25.68
   ├── distribution: us-east1
   └── prune: (1)
(6 rows)


pkg/sql/logictest/testdata/logic_test/udf_star line 61 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

It might be good to add some tests with SHOW CREATE FUNCTION here too.

Done.


pkg/sql/logictest/testdata/logic_test/udf_star line 140 at r4 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

Could you also add test cases with DROP CASCADE ? like the ones in test TestDropCascadeRemoveFunction. Just want to make sure the dependencies are handled well (I'm sure it does).

Done. Let me know if you think we need to test more cases than dropping columns and tables, in which case it might make more sense to expand the unit test to include a function with select *.


pkg/sql/opt/optbuilder/testdata/create_function line 45 at r4 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

these test changes need to be reverted?

Yes, done!


pkg/sql/opt/optbuilder/testdata/udf line 1181 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Might be nice to have another, more complicated test case here, like with a subquery where the column qualification is important. It'll make sure that testcat respects the rewriting correctly, too.

Done.

Copy link
Contributor

@chengxiong-ruan chengxiong-ruan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @dt, and @mgartner)

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 10 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @chengxiong-ruan, @DrewKimball, @dt, and @rharding6373)


pkg/sql/opt/optbuilder/project.go line 158 at r6 (raw file):

							case *tree.ColumnAccessExpr:
								expansions = append(expansions, tree.SelectExpr{Expr: col})
							}

I think we should add a default case that panics with an assertion failed error rather than continuing on successfully.


pkg/sql/logictest/testdata/logic_test/udf_star line 18 at r4 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

Doesn't seem like we do this correctly outside of UDFs, but maybe this is intentional. Thoughts?

In postgres:

rharding=# create table t (a int);
CREATE TABLE
rharding=# select * from (select a from (select * from t));
ERROR:  subquery in FROM must have an alias
LINE 1: select * from (select a from (select * from t));
                                     ^
HINT:  For example, FROM (SELECT ...) [AS] foo.

In CRDB we don't error:

root@127.0.0.1:26257/defaultdb> create table t (a int);                                               
CREATE TABLE

root@127.0.0.1:26257/defaultdb> select * from (select a from (select * from t));                      
  a
-----
(0 rows)
root@127.0.0.1:26257/defaultdb> explain (opt, verbose) select * from (select a from (select * from t));                                                                  
             info
-------------------------------
  scan t
   ├── columns: a:1
   ├── stats: [rows=1]
   ├── cost: 25.68
   ├── distribution: us-east1
   └── prune: (1)
(6 rows)

Right, CRDB has allowed unaliased subqueries for some time. It's probably a relic from some early implementation rather than a specific decision we made to allow it. Since we allow it, we need to figure out what to do when they are present in UDFs with star expansions. If you add a test with a UDF body of: SELECT * FROM (SELECT a FROM (SELECT * FROM t_onecol));, what do the stars get expanded to?


pkg/sql/logictest/testdata/logic_test/udf_star line 79 at r6 (raw file):

100115  f_unqualified_doublestar  SELECT t_onecol.a, t_onecol.a FROM test.public.t_onecol;
                                  SELECT a FROM test.public.t_onecol;
100116  f_exprstar                SELECT word FROM (SELECT (pg_get_keywords()).word, (pg_get_keywords()).catcode, (pg_get_keywords()).catdesc ORDER BY word LIMIT 1);

This rewrite is a bit unfortunate - now we're evaluating the pg_get_keywords() function multiple times. I wonder if there is a more efficient rewrite we can do.

Copy link
Collaborator Author

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @chengxiong-ruan, @DrewKimball, @dt, and @mgartner)


pkg/sql/logictest/testdata/logic_test/udf_star line 18 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Right, CRDB has allowed unaliased subqueries for some time. It's probably a relic from some early implementation rather than a specific decision we made to allow it. Since we allow it, we need to figure out what to do when they are present in UDFs with star expansions. If you add a test with a UDF body of: SELECT * FROM (SELECT a FROM (SELECT * FROM t_onecol));, what do the stars get expanded to?

The expansion (in the current implementation) is SELECT a FROM (SELECT a FROM (SELECT t_onecol.a FROM test.public.t_onecol))

If we allow it outside of UDFs, wouldn't it make sense to allow it inside UDFs even if it means they both differ in comparison to postgres? Why do we prefer different behavior in UDFs than outside of UDFs?


pkg/sql/logictest/testdata/logic_test/udf_star line 79 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

This rewrite is a bit unfortunate - now we're evaluating the pg_get_keywords() function multiple times. I wonder if there is a more efficient rewrite we can do.

Not very efficient, but happens to be what postgres does in its rewrite so I decided to leave it for now.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @chengxiong-ruan, @DrewKimball, @dt, and @rharding6373)


pkg/sql/logictest/testdata/logic_test/udf_star line 18 at r4 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

The expansion (in the current implementation) is SELECT a FROM (SELECT a FROM (SELECT t_onecol.a FROM test.public.t_onecol))

If we allow it outside of UDFs, wouldn't it make sense to allow it inside UDFs even if it means they both differ in comparison to postgres? Why do we prefer different behavior in UDFs than outside of UDFs?

I think we should allow it, if possible, but I'm worried that without an alias for the subquery, this rewrite will produce a query that's either ambiguous or not equivalent to the original. I can see my last example wasn't interesting, so let me try again. Consider the query:

CREATE TABLE t1 (a INT);
CREATE TABLE t2 (a INT);

SELECT * FROM (SELECT a FROM t1) JOIN (SELECT a FROM t2) ON true;

This should produce two columns, one from each side of the join. Without an alias for the subqueries, I'm assuming we'd rewrite this as:

SELECT a, a FROM (SELECT a FROM t1) JOIN (SELECT a FROM t2) ON true;

Both as are ambiguous, and if we try to run this we get the error: column reference "a" is ambiguous.


pkg/sql/logictest/testdata/logic_test/udf_star line 79 at r6 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

Not very efficient, but happens to be what postgres does in its rewrite so I decided to leave it for now.

👍

This change allows `*` usage in UDF bodies. We rewrite UDF ASTs in place
to expand `*`s into the columns they reference.

Informs: cockroachdb#90080

Epic: CRDB-19496
Release note (sql change): Allow `*` expressions in UDFs.
Copy link
Collaborator Author

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the careful reviews!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @chengxiong-ruan, @DrewKimball, @dt, and @mgartner)


pkg/sql/opt/optbuilder/project.go line 158 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I think we should add a default case that panics with an assertion failed error rather than continuing on successfully.

Done.


pkg/sql/logictest/testdata/logic_test/udf_star line 18 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I think we should allow it, if possible, but I'm worried that without an alias for the subquery, this rewrite will produce a query that's either ambiguous or not equivalent to the original. I can see my last example wasn't interesting, so let me try again. Consider the query:

CREATE TABLE t1 (a INT);
CREATE TABLE t2 (a INT);

SELECT * FROM (SELECT a FROM t1) JOIN (SELECT a FROM t2) ON true;

This should produce two columns, one from each side of the join. Without an alias for the subqueries, I'm assuming we'd rewrite this as:

SELECT a, a FROM (SELECT a FROM t1) JOIN (SELECT a FROM t2) ON true;

Both as are ambiguous, and if we try to run this we get the error: column reference "a" is ambiguous.

I opened an issue to address ambiguous columns in subqueries here: #96375

We now throw an unimplemented error for unaliased subqueries, could you take a look at optbuilder/select.go? TY!

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice work!

Reviewed 3 of 3 files at r7, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @chengxiong-ruan, @DrewKimball, and @dt)


pkg/sql/logictest/testdata/logic_test/udf_star line 18 at r4 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

I opened an issue to address ambiguous columns in subqueries here: #96375

We now throw an unimplemented error for unaliased subqueries, could you take a look at optbuilder/select.go? TY!

👍

@rharding6373
Copy link
Collaborator Author

TFTRs everyone!

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 2, 2023

This PR was included in a batch that was canceled, it will be automatically retried

@craig craig bot merged commit dd97d0c into cockroachdb:master Feb 3, 2023
@craig
Copy link
Contributor

craig bot commented Feb 3, 2023

Build succeeded:

@rharding6373 rharding6373 deleted the 20230110_udf branch February 4, 2023 05:34
mgartner added a commit to mgartner/cockroach that referenced this pull request Jun 15, 2023
Star expressions have been allowed in UDFs since cockroachdb#95710 and in views
since cockroachdb#97515. This commit lifts a restriction that prevented table
references from being resolved as TupleStars in SELECT lists inside UDFs
and views. For example, an expression like `SELECT tbl FROM tbl` is now
allowed inside views and UDFs.

Fixes cockroachdb#104927

Release note (sql change): Table names are now allowed in SELECT lists
inside view and UDF definitions.
craig bot pushed a commit that referenced this pull request Jun 15, 2023
104677: ui: Redirect Range Report page to Hot Ranges page r=gtr a=gtr

Fixes: #102377.

Previously, when a user selected a range ID from the Hot Ranges page,
the left side menu would switch to Advanced Debug and the back button
would also redirect to the Advanced Dedug page. This commit ensures that
when a range ID is selected, the left side menu will stay on the Hot
Ranges page and also changes the back button to redirect back to the
Hot Ranges page.

<img width="791" alt="image" src="https://github.com/cockroachdb/cockroach/assets/35943354/41afa924-7395-4101-a14c-bb4cdeccec0c">

Release note (ui change): The Range Report page (route
`/reports/range/:rangeID`) shows the "Hot Ranges" menu item as
highlighted in the left side menu. The back button in the Range Report
page redirects back to the Hot Ranges page.

104929: opt: resolve TupleStars in UDFs and views r=mgartner a=mgartner

Star expressions have been allowed in UDFs since #95710 and in views
since #97515. This commit lifts a restriction that prevented table
references from being resolved as TupleStars in SELECT lists inside UDFs
and views. For example, an expression like `SELECT tbl FROM tbl` is now
allowed inside views and UDFs.

Fixes #104927
Fixes #97602 

Release note (sql change): Table names are now allowed in SELECT lists
inside view and UDF definitions.


104946: sql: update tenant_span builtins for consistent output r=arulajmani a=stevendanna

The codec's TenantSpan() function was changed to avoid PrefixEnd(). Here, we update overloads of crdb_internal.tenant_span to all use TenantSpan() to avoid inconsistent output from different overloads.

Note, I don't believe that the use of PrefixEnd() in this function constituted a problem in our current usage as the only real use of this function was in calls to crdb_internal.fingerprint() or crdb_internal.scan() where the EndKey is only used as an upper-bound for iteration.

Informs #104928

Epic: none

Release note: None

Co-authored-by: gtr <gerardo@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants