Support redshift's columns definition list for system information functions #769
pg_get_late_binding_view_cols, pg_get_cols, pg_get_grantee_by_iam_role, pg_get_iam_role_by_user
Thank you for the contribution, @mskrzypkows!
src/ast/mod.rs
/// when used with redshift pg_get_late_binding_view_cols/pg_get_cols)
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub struct ColsDefinition {
Is there any reason not to use the existing ColumnDef?
I think that would make this PR significantly shorter as well as supporting other options
What do you think?
I think that the name is misleading. I should have named it e.g. TableAliasDefinition, because it defines what the result table will be and what column names it will have.
Like in the example:
select * from pg_get_late_binding_view_cols() cols(view_schema name, view_name name, col_name name, col_type varchar, col_num int);
view_schema | view_name | col_name   | col_type                    | col_num
------------+-----------+------------+-----------------------------+--------
public      | sales_lbv | salesid    | integer                     | 1
public      | sales_lbv | listid     | integer                     | 2
...
But I think it's a good idea to use ColumnDef inside the vector, instead of a vector of IdentPairs.
What's your opinion?
I think that the name is misleading. I should have named it e.g. TableAliasDefinition, because it defines what will be a result table, what column names it will have.
Sounds like a good idea to me
But I think it's a good idea to use ColumnDef inside the vector, instead of vector of IdentPairs.
👍
I've tried using ColumnDef and failed because name is not a column type. It's strange: Redshift doesn't define it as a keyword or a type, but it's used in this case. I don't think it's a good idea to add it as an additional type, so I'd rather go back to IdentPair. Or is there a better way to handle it?
So I looked at the docs for https://docs.aws.amazon.com/redshift/latest/dg/PG_GET_GRANTEE_BY_IAMROLE.html
I see in the example it has something like
select grantee, grantee_type, cmd_type
FROM
pg_get_grantee_by_iam_role('arn:aws:iam::123456789012:role/Redshift-S3-Write')
res_grantee(grantee text, grantee_type text, cmd_type text)
ORDER BY
1,2,3;
I think this PR is related to this bit of the query (note there is no comma after the call to pg_get_grantee_by_iam_role):
res_grantee(grantee text, grantee_type text, cmd_type text)
I am confused about several things:
- Are you sure this construct is specific to the system information functions, or is it a more general mechanism? (It looks to me like a more general mechanism that expands a tuple out into a table.)
- If it is a view definition, I would expect it to be parsed as ident, variable type, not as just generic idents.
Maybe someone could do some more research into what exactly this syntax construct in Redshift is called and what its rules are.
- You're right, it looks like a more general mechanism. I've checked the documentation, but haven't found any general syntax description for it.
- So if we have an additional name type, then it has to be added to the DataType, right?
src/parser.rs
&mut self,
name: &ObjectName,
) -> Result<Option<ColsDefinition>, ParserError> {
if !dialect_of!(self is RedshiftSqlDialect) {
@mskrzypkows I guess this could consider the GenericDialect as well.
These functions are specific to Redshift: https://docs.aws.amazon.com/redshift/latest/dg/r_System_information_functions.html so I don't think it's needed for GenericDialect.
I know, but the Generic dialect is supposed to be the most permissive one. Unless there are syntax conflicts (e.g., accepting those would break another part of the code), I can't see why not.
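As a rough illustration of that rule, the gate could accept the generic dialect alongside Redshift. This is only a sketch: the `Dialect` enum and `accepts_typed_alias_columns` are hypothetical stand-ins, not the crate's actual dialect types or the `dialect_of!` macro.

```rust
/// Hypothetical stand-in for the parser's dialect types (not the real API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Dialect {
    Generic,
    Redshift,
    MySql,
}

/// Accept the construct for Redshift, and also for the generic dialect,
/// which is meant to be the most permissive one.
pub fn accepts_typed_alias_columns(dialect: Dialect) -> bool {
    matches!(dialect, Dialect::Redshift | Dialect::Generic)
}
```

A dialect would be left out of the match only if accepting the syntax conflicted with something else it already parses.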
src/parser.rs
.value
.to_lowercase();

if fname == "pg_get_late_binding_view_cols"
@alamb isn't this too closely tied to semantic logic? Isn't it just a database metadata function? Is it really necessary to have such detailed information in the parser?
Yes I agree -- I would not expect this list to be hard coded. Instead I would expect any identifier to be accepted here and then the downstream crate would check against a list if it wanted.
This design goal is explained in https://github.com/sqlparser-rs/sqlparser-rs#syntax-vs-semantics
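To make the split concrete, here is a minimal sketch of the downstream side, assuming the parser accepts any identifier in this position. The constant and function names are hypothetical; only the four function names come from this PR.

```rust
/// Hypothetical allow-list kept by a downstream crate, not by the parser.
const REDSHIFT_SYSINFO_FUNCTIONS: [&str; 4] = [
    "pg_get_late_binding_view_cols",
    "pg_get_cols",
    "pg_get_grantee_by_iam_role",
    "pg_get_iam_role_by_user",
];

/// Returns true when `name` matches one of the listed functions,
/// ignoring ASCII case. This is a semantic check done after parsing.
pub fn is_redshift_sysinfo_function(name: &str) -> bool {
    REDSHIFT_SYSINFO_FUNCTIONS
        .iter()
        .any(|known| known.eq_ignore_ascii_case(name))
}
```

With this shape the parser stays purely syntactic, and a consumer that cares about the specific functions can reject anything else itself.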
…hift_special_system_information_functions
added generic dialect
@mskrzypkows there are some actions failing, could you please take a look?
I am sorry I am a bit behind on reviews / merging in sqlparser-rs. I hope to be able to catch up later this week
Sure, I'll fix it.
…hift_special_system_information_functions
parentheses instead of function name
I think this looks good to me now. Thank you @mskrzypkows
This is one of the more unique syntaxes I have seen in SQL 🤯
…l_system_information_functions
I took the liberty of merging up from main and fixing a (new) clippy lint on this PR
…tion functions (apache#769)" This reverts commit c35dcc9.
This change seems to have caused a regression in 0.31.0 -- see #826. Here is a PR to back it out: #827
Looking more at the example https://docs.aws.amazon.com/redshift/latest/dg/PG_GET_GRANTEE_BY_IAMROLE.html
select grantee, grantee_type, cmd_type
FROM
pg_get_grantee_by_iam_role('arn:aws:iam::123456789012:role/Redshift-S3-Write')
res_grantee(grantee text, grantee_type text, cmd_type text)
ORDER BY
1,2,3;
Rather than a special table definition syntax I think this is actually
FROM <table_function> <table_alias>(<col_alias1>, <col_alias2>, ...)
So
select grantee, grantee_type, cmd_type
FROM
pg_get_grantee_by_iam_role('arn:aws:iam::123456789012:role/Redshift-S3-Write') <-- this is "table function"
res_grantee(grantee text, grantee_type text, cmd_type text) <-- `res_grantee` is the table alias, and the other things are column aliases
ORDER BY
1,2,3;
I see perhaps the sqlparser column alias syntax doesn't support datatypes, maybe that could be extended
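One way such an extension could look, as a sketch only: a table alias whose columns carry an optional data type. `AliasColumn` and `TypedTableAlias` are hypothetical names for illustration, and the data type is kept as raw text here rather than as the crate's DataType.

```rust
use std::fmt;

/// Hypothetical column alias with an optional data type (raw text here).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AliasColumn {
    pub name: String,
    pub data_type: Option<String>,
}

/// Hypothetical table alias of the form `alias(col [type], ...)`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TypedTableAlias {
    pub name: String,
    pub columns: Vec<AliasColumn>,
}

impl fmt::Display for AliasColumn {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.name)?;
        // The type is printed only when one was given.
        if let Some(data_type) = &self.data_type {
            write!(f, " {}", data_type)?;
        }
        Ok(())
    }
}

impl fmt::Display for TypedTableAlias {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.name)?;
        // A bare alias has no parenthesized column list.
        if !self.columns.is_empty() {
            let cols: Vec<String> = self.columns.iter().map(|c| c.to_string()).collect();
            write!(f, "({})", cols.join(", "))?;
        }
        Ok(())
    }
}
```

Because the type is optional, plain column aliases such as `t(a, b)` and typed ones such as `res_grantee(grantee text, ...)` would share one representation, and a type like `name` would not need to be a known keyword at this level.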