Fix wrong results of Left Anti Semi (Not-In) Join #130
This PR has 2 commits; please review commit by commit.
See more details and discussion in https://github.com/greenplum-db/gpdb/pull/15663, https://github.com/greenplum-db/gpdb/issues/15662.
Most of the code is the same, with some refinements and conflicts resolved for CBDB.
Fix wrong results of Left Anti Semi (Not-In) Join
CBDB tries to convert a NOT IN query into a Left Anti Semi (Not-In) Join,
using cdb_find_nonnullable_vars_walker() to determine whether the inner or
outer side might produce nullable values.
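To illustrate why nullability matters here, the following is a minimal Python sketch of standard SQL three-valued logic for NOT IN. It is a model only, not CBDB code: the helper names sql_ne and not_in are hypothetical, and None stands in for NULL.

```python
def sql_ne(a, b):
    """Model of SQL `a <> b`: returns True, False, or None (UNKNOWN)."""
    if a is None or b is None:
        return None
    return a != b


def not_in(outer, inner):
    """Model of `select c1 from t1 where c1 NOT IN (select c2 from t2)`.

    An outer value is kept only if `c1 <> c2` is TRUE for every inner
    value; FALSE or UNKNOWN for any inner value rejects the row.
    """
    return [c1 for c1 in outer
            if all(sql_ne(c1, c2) is True for c2 in inner)]


# No NULLs on either side: behaves like a plain anti join.
print(not_in([1, 2, 3], [2]))
# A NULL on the outer side is rejected once the inner side is non-empty.
print(not_in([1, 2, None], [2]))
# A single NULL on the inner side rejects every outer row.
print(not_in([1, 2, 3], [2, None]))
```

Because a single NULL on either side changes the result, the planner may only use the cheaper join form when it can prove which vars are non-nullable.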
If there is a NullTest, expression_tree_walker will walk the tree through
(NullTest*)Node->arg.
Example:
The expression c1n is null is represented as:
NullTest [nulltesttype=IS_NULL argisrow=false location=102]
[arg] Var [varno=1 varattno=1 vartype=23 varnoold=1 varoattno=1]
The recursive cdb_find_nonnullable_vars_walker first visits the node->arg Var and
inserts it into nonNullableVars.
That is incorrect for a NullTest of type IS_NULL.
We should also consider a NullTest under an OR expression, including nested OR expressions.
Add a field nullableVars to identify the vars that might be nullable; these must
finally be removed from nonNullableVars.
This is stricter, but at least it ensures correct results.
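The idea of tracking nullableVars separately and subtracting them at the end can be sketched in Python as follows. This is a simplified model under stated assumptions, not the actual walker: the node classes, the collect function, and the rule that OR branches intersect their non-nullable sets are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    varno: int
    varattno: int


@dataclass(frozen=True)
class NullTest:
    arg: object
    is_null: bool  # True models IS_NULL, False models IS_NOT_NULL


@dataclass(frozen=True)
class StrictOp:
    args: tuple  # operands of a strict operator such as '='


@dataclass(frozen=True)
class BoolExpr:
    op: str  # "and" or "or"
    args: tuple


def collect(node):
    """Return (non_nullable, nullable) sets of Vars for a qual tree."""
    if isinstance(node, NullTest):
        if not isinstance(node.arg, Var):
            return set(), set()
        # An IS NULL test must never mark its Var as non-nullable.
        return (set(), {node.arg}) if node.is_null else ({node.arg}, set())
    if isinstance(node, StrictOp):
        return {a for a in node.args if isinstance(a, Var)}, set()
    if isinstance(node, BoolExpr):
        results = [collect(a) for a in node.args]
        nullable = set().union(*(r[1] for r in results))
        non_sets = [r[0] for r in results]
        if node.op == "and":
            non = set().union(*non_sets)
        else:
            # Assumption: under OR, a Var is proven only if every branch proves it.
            non = set.intersection(*non_sets) if non_sets else set()
        return non, nullable
    return set(), set()


def find_nonnullable_vars(qual):
    """Vars proven non-nullable, minus any Var seen under an IS NULL test."""
    non, nullable = collect(qual)
    return non - nullable
```

For example, with c1 = Var(1, 1) and c2 = Var(1, 2), the qual modelling c1 = c2 AND (c1 IS NULL OR c2 = c2) yields only c2, because c1 appears under an IS NULL test inside the OR and is removed at the end.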
Correct comments in convert_IN_to_antijoin()
Incorrect example comments:
The transformation is to rewrite a query of the form:
Correct it to:
The transformation is to rewrite a query of the form:
SQL NOT IN should be converted to a Left Anti Semi (Not-In) Join
with join condition c1 != c2.
Any non-null values from t1 that don't match values from t2 should be kept, and
IS NOT FALSE will return TRUE.
GDB checks after function make_lasj_quals()
join_expr->quals:
And oid 518 is a '<>' operator in pg_operator.
closes: #50
Change logs
Describe your change clearly, including what problem is being solved or what feature is being added.
If it breaks backward or forward compatibility, please clarify.
Why are the changes needed?
Describe why the changes are necessary.
Does this PR introduce any user-facing change?
If yes, please clarify the previous behavior and the change this PR proposes.
How was this patch tested?
Please detail how the changes were tested, including manual tests and any relevant unit or integration tests.
Contributor's Checklist
Here are some reminders and checklists before/when submitting your pull request, please check them:
make installcheck
make -C src/test installcheck-cbdb-parallel