sql: use IDs for columns instead of names in serialized expressions/queries #22212
Comments
@BramGruneir can you remind me what prompted this issue? Where do we store expressions in a place that would benefit from storing IDs instead?
This was prompted by my exploring some of the column renaming logic. It seemed to me that we optimized for the wrong part, the pretty printing, instead of the more common case of evaluation. Evaluating an expression should never rely on column names; they should be converted to IDs as soon as possible. It's been a while since I looked at that code, but I remember being confused by it.
FYI the column names are converted to numeric IDs very early on (during name resolution), so for the purpose of query execution we already do what you suggest. The one area where this is not yet true is when storing an expression in a descriptor, e.g. for view queries, DEFAULT or CHECK. Was this what you had in mind?
Yep. I was dealing with both of those during my deep dive into FK land.
Ok then this issue greatly overlaps with #10083. What we need is a serialization format which doesn't embed names, just IDs. Use that both for views and for other things that store queries/expressions.
For reference, how Postgres does it: when a SQL query/expression gets serialized, it translates the names of things (table names, column names etc.) to OIDs, and it stores the OID. The OID is translated back to a name when pretty-printing. We may be able to do something similar.
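To make the idea concrete, here is a minimal Go sketch of serializing an expression with column IDs and resolving names only when pretty-printing. Everything in it is hypothetical and for illustration only: the `ColumnID`, `TableDesc`, `ColRef`, `Const` and `BinOp` types, the `Serialize` and `Pretty` helpers, and the `@<id>` notation for the stored form are assumptions of this sketch, not the actual descriptor format or any existing API.

```go
package main

import "fmt"

// ColumnID is a stable numeric column identifier (hypothetical type).
type ColumnID int

// TableDesc is a toy stand-in for a table descriptor: it maps
// column IDs to their current names.
type TableDesc struct {
	Names map[ColumnID]string
}

// Expr is a minimal expression tree used only for this sketch.
type Expr interface{}

// ColRef refers to a column by ID, never by name.
type ColRef struct{ ID ColumnID }

// Const is an integer literal.
type Const struct{ Val int }

// BinOp is a binary operator applied to two sub-expressions.
type BinOp struct {
	Op          string
	Left, Right Expr
}

// format walks the tree and renders column references with col.
func format(e Expr, col func(ColumnID) string) string {
	switch n := e.(type) {
	case ColRef:
		return col(n.ID)
	case Const:
		return fmt.Sprint(n.Val)
	case BinOp:
		return "(" + format(n.Left, col) + " " + n.Op + " " + format(n.Right, col) + ")"
	default:
		return "<unknown>"
	}
}

// Serialize renders the stored form, which embeds only column IDs.
// The "@<id>" spelling is made up for this sketch.
func Serialize(e Expr) string {
	return format(e, func(id ColumnID) string { return fmt.Sprintf("@%d", id) })
}

// Pretty renders the human-readable form, looking up the current
// column names in the descriptor.
func Pretty(e Expr, desc *TableDesc) string {
	return format(e, func(id ColumnID) string { return desc.Names[id] })
}

func main() {
	desc := &TableDesc{Names: map[ColumnID]string{1: "price", 2: "qty"}}
	// A CHECK-style expression: price * qty > 0.
	check := BinOp{
		Op:    ">",
		Left:  BinOp{Op: "*", Left: ColRef{ID: 1}, Right: ColRef{ID: 2}},
		Right: Const{Val: 0},
	}
	stored := Serialize(check)
	fmt.Println(stored)
	fmt.Println(Pretty(check, desc))

	// Renaming a column only touches the descriptor's name map.
	desc.Names[2] = "quantity"
	fmt.Println(Serialize(check) == stored)
	fmt.Println(Pretty(check, desc))
}
```

In this sketch a rename changes only the name map; the stored expression is byte-for-byte the same before and after, and the new name shows up the next time the expression is pretty-printed.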
We have marked this issue as stale because it has been inactive for |
Duplicates #10083.
Instead of storing column names in expressions, we should store column IDs. This makes renaming a column very quick, and it should speed up evaluation since no string matching needs to occur.
Applications: