Fix normalization of columns in JOIN ... USING. #16560
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Closes #16120
In SqlToRel::parse_join(), when handling JoinContraint::Using, the
identifiers are normalized using IdentNormalizer::normalize().
That normalization lower-cases unquoted identifiers, and keeps the case
otherwise (but not the quotes).
Before this PR, the normalized column names were passed to
LogicalPlanBuilder::join_using() as strings. When each goes through
LogicalPlanBuilder::normalize(), Column::From() is called,
leading to Column::from_qualified_named(). As it gets an unqualified
column, it lower-cases it.
This means that if a join is USING("SOME_COLUMN_NAME"), we end up with a
Column { name: "some_column_name", ..}. In the end, the join fails, as
that lower-case column does not exist.
With this PR, SqlToRel::parse_join() calls Column::from_name() on
each normalized column and passed those to
LogicalPlanBuilder::join_using(). Downstream, in
LogicalPlanBuilder::normalize(), there is no need to create the Column
objects from strings, and the bug does not happen.
This fixes a regression introduced in 304488d#diff-0762df7208dad0e830a8f0b389945d53ef011cac958582963ab58579caa038bd -- before that commit, Column::from_name() was called on each column name.
Additionally, I remove the genericity on Columns from LogicalPlanBuilder::join_using(). I believe that genericity is bug-prone, while not providing much value.