Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: allow unaliased subqueries within UDFs #96375

Closed
rharding6373 opened this issue Feb 1, 2023 · 8 comments · Fixed by #97544
Closed

sql: allow unaliased subqueries within UDFs #96375

rharding6373 opened this issue Feb 1, 2023 · 8 comments · Fixed by #97544
Assignees
Labels
A-sql-routine UDFs and Stored Procedures C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@rharding6373
Copy link
Collaborator

rharding6373 commented Feb 1, 2023

There is a compatibility difference with postgres for queries with ambiguous columns.

Examples that fail in postgres with "subquery in FROM must have an alias" but not crdb:

CREATE TABLE t1 (a INT);
CREATE TABLE t2 (a INT);
SELECT * FROM (SELECT a FROM t1) JOIN (SELECT a FROM t2) ON true;
SELECT * FROM (SELECT * FROM (SELECT a FROM t1));
SELECT a FROM (SELECT * FROM (SELECT a FROM t1));

The following example fails correctly in crdb with "column reference "a" is ambiguous":

SELECT a FROM (SELECT * FROM (SELECT a FROM t1) JOIN (SELECT a FROM  t2) ON true);

Jira issue: CRDB-24094

@rharding6373 rharding6373 added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Feb 1, 2023
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Feb 1, 2023
@mgartner
Copy link
Collaborator

mgartner commented Feb 2, 2023

Postgres doesn't allow unaliased subqueries in FROM at all. I don't think we can support these within UDFs easily, because we have to qualify column names expanded from *.

So the only decision that I can see us needing to make is whether we remove support for unaliased subqueries in FROM clauses entirely, or live with this oddity where they are allowed only outside a UDF definition.

@mgartner
Copy link
Collaborator

mgartner commented Feb 2, 2023

Actually, it might be possible to "assign" a unique alias for unaliased subqueries when the UDF is created, removing any ambiguity from the rewritten function body.

@mgartner mgartner added the A-sql-routine UDFs and Stored Procedures label Feb 2, 2023
@mgartner mgartner changed the title sql: ambiguous columns in subqueries not always identified correctly. sql: allow unaliased subqueries within UDFs Feb 2, 2023
@michae2
Copy link
Collaborator

michae2 commented Feb 7, 2023

[idea from triage] could we come up with a unique alias and use that to fix the UDF problem?

@mgartner
Copy link
Collaborator

Here's an explanation of why we currently disallow unaliased subqueries in UDFs: #95710 (review)

@knz
Copy link
Contributor

knz commented Feb 22, 2023

Actually, it might be possible to "assign" a unique alias for unaliased subqueries when the UDF is created, removing any ambiguity from the rewritten function body.

Yes this seems to be the preferable approach.

@DrewKimball
Copy link
Collaborator

It looks like postgres doesn't qualify the column names during star expansion for views, but it does for UDFs:

postgres=# CREATE VIEW v AS SELECT * FROM (SELECT a FROM t1) foo JOIN (SELECT a FROM t2) bar ON true;
ERROR:  column "a" specified more than once

postgres=# CREATE FUNCTION f3() RETURNS RECORD LANGUAGE SQL BEGIN ATOMIC; SELECT * FROM (SELECT a FROM t1) foo JOIN (SELECT a FROM t2) bar ON true; END;
CREATE FUNCTION

@knz
Copy link
Contributor

knz commented Feb 23, 2023

It looks like postgres doesn't qualify the column names during star expansion for views, but it does for UDFs:

In both cases, the column names are not qualified. It's always valid for a pg projection to contain multple columns with the same name, for example SELECT 1 AS a, 2 AS a is valid/works.

The extra thing that happens with CREATE VIEW is that the DDL code validates that the resulting table descriptor has unique column names.

This is not needed in your UDF example because it returns a RECORD type, which doesn't include column labels. There's no additional uniqueness check for the column names in that case.

@DrewKimball
Copy link
Collaborator

The output of \df+ looks like this for that function:

 Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel |     Owner     | Security | Access privileges | Language |             Source code              | Description
--------+------+------------------+---------------------+------+------------+----------+---------------+----------+-------------------+----------+--------------------------------------+-------------
 public | f3   | record           |                     | func | volatile   | unsafe   | andrewkimball | invoker  |                   | sql      | BEGIN ATOMIC                        +|
        |      |                  |                     |      |            |          |               |          |                   |          |  SELECT foo.a,                      +|
        |      |                  |                     |      |            |          |               |          |                   |          |      bar.a                          +|
        |      |                  |                     |      |            |          |               |          |                   |          |     FROM (( SELECT t1.a             +|
        |      |                  |                     |      |            |          |               |          |                   |          |             FROM t1) foo            +|
        |      |                  |                     |      |            |          |               |          |                   |          |       JOIN ( SELECT t2.a            +|
        |      |                  |                     |      |            |          |               |          |                   |          |             FROM t2) bar ON (true));+|
        |      |                  |                     |      |            |          |               |          |                   |          | END                                  |
(1 row)

@craig craig bot closed this as completed in #97544 Feb 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-routine UDFs and Stored Procedures C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants