Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support redshift's columns definition list for system information functions #769

Merged

Conversation

mskrzypkows
Copy link
Contributor

pg_get_late_binding_view_cols
pg_get_cols
pg_get_grantee_by_iam_role
pg_get_iam_role_by_user

pg_get_late_binding_view_cols
pg_get_cols
pg_get_grantee_by_iam_role
pg_get_iam_role_by_user
@mskrzypkows mskrzypkows changed the title parsing of redshift's column definition list for Parsing of redshift's columns definition list for system information functions Dec 23, 2022
@coveralls
Copy link

coveralls commented Dec 23, 2022

Pull Request Test Coverage Report for Build 4206736323

  • 109 of 111 (98.2%) changed or added relevant lines in 12 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.09%) to 86.142%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/parser.rs 25 26 96.15%
tests/sqlparser_redshift.rs 42 43 97.67%
Totals Coverage Status
Change from base Build 4206701096: 0.09%
Covered Lines: 13371
Relevant Lines: 15522

💛 - Coveralls

@alamb alamb changed the title Parsing of redshift's columns definition list for system information functions Support redshift's columns definition list for system information functions Dec 28, 2022
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution @mskrzypkows !

src/ast/mod.rs Outdated
/// when used with redshift pg_get_late_binding_view_cols/pg_get_cols)
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub struct ColsDefinition {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to use the existing ColumnDef?

https://github.com/sqlparser-rs/sqlparser-rs/blob/61c661c234518ae02ae2c50d317e4640e4b90e26/src/ast/ddl.rs#L479-L486

I think that would make this PR significantly shorter as well as supporting other options

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the name is misleading. I should have named it e.g. TableAliasDefinition, because it defines what will be a result table, what column names it will have.
Like in the example:

select * from pg_get_late_binding_view_cols() cols(view_schema name, view_name name, col_name name, col_type varchar, col_num int);
view_schema | view_name | col_name   | col_type                    | col_num
------------+-----------+------------+-----------------------------+--------
public      | sales_lbv | salesid    | integer                     |       1
public      | sales_lbv | listid     | integer                     |       2
...

But I think it's a good idea to use ColumnDef inside the vector, instead of vector of IdentPairs.

What's your opinion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the name is misleading. I should have named it e.g. TableAliasDefinition, because it defines what will be a result table, what column names it will have.

Sounds like a good idea to me

But I think it's a good idea to use ColumnDef inside the vector, instead of vector of IdentPairs.

👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried using ColumnDef and failed because name is not a column type. It's strange, redshift doesn't define it as a keyword nor a type, but it's used in such a case. I think it's not a good idea to add it as additional type, so I'll rather back to IdentPair, or is there a better way to handle it?

Copy link
Contributor

@alamb alamb Jan 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I looked at the docs for

https://docs.aws.amazon.com/redshift/latest/dg/PG_GET_GRANTEE_BY_IAMROLE.html

I see in the example it has something like

select grantee, grantee_type, cmd_type 
FROM 
  pg_get_grantee_by_iam_role('arn:aws:iam::123456789012:role/Redshift-S3-Write') 
  res_grantee(grantee text, grantee_type text, cmd_type text) 
ORDER BY
 1,2,3;

I think this PR is related to this bit of the query (note there is no comma between the call to pg_get_grantee_by_iam_role)

  res_grantee(grantee text, grantee_type text, cmd_type text) 

I have several confusions:

  1. Are you sure this construct is specific to the system information functions, or is a more general mechanism (it look like a more general mechanism that expands out a tuple to a table to me)
  2. If it is a view definition, I would expect it to be parsed as ident, variable type not as just generic idents

Maybe someone could do some more research into what exactly this syntax construct in Redshift is called and what its rules are

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You're right, it looks like a more general mechanism. I've checked the documentation, but haven't found any general syntax description for it.
  2. So if we have an additional name type, then it has to be added to the DataType, right?

src/parser.rs Outdated
&mut self,
name: &ObjectName,
) -> Result<Option<ColsDefinition>, ParserError> {
if !dialect_of!(self is RedshiftSqlDialect) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mskrzypkows I guess this could consider the GenericDialect as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions are specific for the Redshift: https://docs.aws.amazon.com/redshift/latest/dg/r_System_information_functions.html so I think it's not needed for GenericDialect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but the Generic dialect is supposed to be the most permissive one. Unless there are syntax conflicts (e.g., accepting those would break another part of the code), I can't see why not.

src/parser.rs Outdated
.value
.to_lowercase();

if fname == "pg_get_late_binding_view_cols"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb isn't this too much related to semantic logic? it isn't just a database metadata function? Is it really necessary to have that deep information?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree -- I would not expect this list to be hard coded. Instead I would expect any identifier to be accepted here and then the downstream crate would check against a list if it wanted.

This design goal is explained in https://github.com/sqlparser-rs/sqlparser-rs#syntax-vs-semantics

Maciej Skrzypkowski added 2 commits January 9, 2023 12:04
Copy link
Contributor

@AugustoFKL AugustoFKL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mskrzypkows there are some actions failing, could you please take a look?

@alamb
Copy link
Contributor

alamb commented Jan 9, 2023

I am sorry I am a bit behind on reviews / merging in sqlparser-rs. I hope to be able to catch up later this week

@mskrzypkows
Copy link
Contributor Author

@mskrzypkows there are some actions failing, could you please take a look?

Sure, I'll fix it.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good to me now. Thank you @mskrzypkows

This is one of the more unique syntaxes I have seen in SQL 🤯

@alamb
Copy link
Contributor

alamb commented Feb 17, 2023

I took the liberty of merging up from main and fixing a (new) clippy lint on this PR

@alamb alamb merged commit c35dcc9 into apache:main Feb 17, 2023
alamb added a commit to alamb/sqlparser-rs that referenced this pull request Mar 6, 2023
alamb added a commit to alamb/sqlparser-rs that referenced this pull request Mar 6, 2023
@alamb
Copy link
Contributor

alamb commented Mar 6, 2023

This change seems to have caused a regression in 0.31.0 -- see #826. Here is a PR to back it out: #827

Looking more at the example https://docs.aws.amazon.com/redshift/latest/dg/PG_GET_GRANTEE_BY_IAMROLE.html

select grantee, grantee_type, cmd_type 
FROM 
  pg_get_grantee_by_iam_role('arn:aws:iam::123456789012:role/Redshift-S3-Write') 
  res_grantee(grantee text, grantee_type text, cmd_type text) 
ORDER BY
 1,2,3;

Rather than a special table definition syntax I think this is actually

FROM <table_function> <table_alias>(<col_alias1>, <col_alias2>, ...)

So

select grantee, grantee_type, cmd_type 
FROM 
  pg_get_grantee_by_iam_role('arn:aws:iam::123456789012:role/Redshift-S3-Write') <-- this is "table function"
  res_grantee(grantee text, grantee_type text, cmd_type text) <-- `res_grantee` is the table alias, and the other things are column aliases
ORDER BY
 1,2,3;

I see perhaps the sqlparser column alias syntax doesn't support datatypes, maybe that could be extended

alamb added a commit that referenced this pull request Mar 6, 2023
…column definition list (#827)

* Fix table alias parsing regression

* Revert "Support redshift's columns definition list for system information functions (#769)"

This reverts commit c35dcc9.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants