-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gen4: Semi join #9039
Gen4: Semi join #9039
Conversation
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Andres Taylor <andres@planetscale.com> Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
sqltypes.MakeTestResultNoFields( | ||
rightFields, | ||
), | ||
sqltypes.MakeTestResultNoFields( | ||
rightFields, | ||
"4|d|dd", | ||
), | ||
sqltypes.MakeTestResultNoFields( | ||
rightFields, | ||
), | ||
sqltypes.MakeTestResultNoFields( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use MakeTestStreamingResults
method for streaming tests.
// MakeTestResultNoFields builds a *sqltypes.Result object for testing. | ||
// result := sqltypes.MakeTestResult( | ||
// fields, | ||
// " 1|a", | ||
// "10|abcd", | ||
// ) | ||
// The field type values are set as the types for the rows built. | ||
// Spaces are trimmed from row values. "null" is treated as NULL. | ||
func MakeTestResultNoFields(fields []*querypb.Field, rows ...string) *Result { | ||
result := &Result{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might not be required if using MakeTestStreamingResults
func (c *concatenateTree) pushPredicate(ctx *planningContext, expr sqlparser.Expr) error { | ||
return vterrors.New(vtrpc.Code_UNIMPLEMENTED, "pushPredicate does not work on concatenate trees") | ||
} | ||
|
||
func (c *concatenateTree) removePredicate(ctx *planningContext, expr sqlparser.Expr) error { | ||
return vterrors.New(vtrpc.Code_UNIMPLEMENTED, "removePredicate does not work on concatenate trees") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like an INTERNAL error to me than an UNIMPLEMENTED error.
func (d *derivedTree) pushPredicate(ctx *planningContext, expr sqlparser.Expr) error { | ||
return vterrors.New(vtrpcpb.Code_UNIMPLEMENTED, "pushPredicate does not work on derivedTrees") | ||
} | ||
|
||
func (d *derivedTree) removePredicate(ctx *planningContext, expr sqlparser.Expr) error { | ||
return vterrors.New(vtrpcpb.Code_UNIMPLEMENTED, "removePredicate does not work on derivedTrees") | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these needs to be marked as INTERNAL error
passDownReuseCol := reuseCol | ||
if !reuseCol { | ||
passDownReuseCol = expr.As.IsEmpty() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure when we would need this special handling, we can just pass reuseCol as is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the same as we do for joins. What this code accomplishes is that even if the reuseCol is set false, we do not need to add it the children if the alias is empty. We can just add it to the output columns from this semi-join
return vterrors.New(vtrpc.Code_UNIMPLEMENTED, "pushPredicate does not work on joinTrees with predicates having dependencies from both the sides") | ||
} | ||
|
||
func (jp *joinTree) removePredicate(ctx *planningContext, expr sqlparser.Expr) error { | ||
if ctx.semTable.RecursiveDeps(expr).IsSolvedBy(jp.lhs.tableID()) { | ||
return jp.lhs.removePredicate(ctx, expr) | ||
} else if ctx.semTable.RecursiveDeps(expr).IsSolvedBy(jp.rhs.tableID()) { | ||
return jp.rhs.removePredicate(ctx, expr) | ||
} | ||
return vterrors.New(vtrpc.Code_UNIMPLEMENTED, "removePredicate does not work on joinTrees with predicates having dependencies from both the sides") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we should re-phrase this error message to something like add/remove '%s' predicate not supported on cross-shard join query
// get the bindVariable for that column name and replace it in the predicate | ||
bindVar := ctx.reservedVars.ReserveColName(node) | ||
cursor.Replace(sqlparser.NewArgument(bindVar)) | ||
// check whether the bindVariable already exists in the map | ||
_, alreadyExists := vars[bindVar] | ||
if alreadyExists { | ||
return false | ||
} | ||
// if it does not exist, then push this as an output column in the outerTree and add it to the joinVars | ||
columnIndexes, err := outerTree.pushOutputColumns([]*sqlparser.ColName{node}, ctx.semTable) | ||
if err != nil { | ||
rewriteError = err | ||
return false | ||
} | ||
columnIndex := columnIndexes[0] | ||
vars[bindVar] = columnIndex | ||
return false | ||
} | ||
} | ||
return true | ||
}, nil) | ||
if rewriteError != nil { | ||
return nil, rewriteError | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should add cases for multiple co-related predicates
like
-
select 1 from user u1, user u2 where exists (select 1 from user_extra ue where ue.col = u1.col and ue.col = u2.col
-
select 1 from user u where exists (select 1 from user_extra ue where ue.col = u.col and u.col = ue.col2)
the best plan is to supply u.col once but looks like it will be supplied twice as we reserve the column at the top and do not check.
Review comments handled in #9083 |
Description
This PR adds a new primitive SemiJoin which is used to execute correlated subqueries occurring in the exists clause. An example of a query which will work now is
select id from t1 where exists(select 1 from t2 where t1.col = t2.tcol1) order by id
The new primitive SemiJoin has two children, called LHS and RHS. It executes the LHS first. For each result of LHS, it runs the RHS. If there are any returned rows from RHS then it adds the LHS row to the output.
Essentially, SemiJoin filters rows from LHS based on whether there were any returned rows from the right side.
Related Issue(s)
Checklist
Deployment Notes