planner: add the functional dependency for Datasource, proj, select, agg #33071

Merged (29 commits) on Mar 17, 2022

winoros
Member

@winoros winoros commented Mar 15, 2022

What problem does this PR solve?

Issue Number: ref #29766

Problem Summary:

What is changed and how it works?

This PR adds functional dependency maintenance for the datasource, projection, selection, aggregation, inner join, and semi join operators.

The next PR will add outer join support and enable the check.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@winoros winoros requested a review from a team as a code owner March 15, 2022 06:58
@ti-chi-bot
Member

ti-chi-bot commented Mar 15, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • AilinKid
  • time-and-fate

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 15, 2022
@sre-bot
Contributor

sre-bot commented Mar 15, 2022

planner/core/stringer.go (outdated)
planner/funcdep/fd_graph.go (outdated)
planner/funcdep/fd_graph.go (outdated)
expression/util.go
// which can upgrade lax FDs to strict ones.
func (s *FDSet) MakeNotNull(notNullCols FastIntSet) {
	notNullCols.UnionWith(s.NotNullCols)
	notNullColsSet := s.closureOfEquivalence(notNullCols)
Member

Do we need to exclude lax equivalences here?

Member Author

For now, we only maintain strict equivalence classes.

if fd.from.SubsetOf(notNullColsSet) && fd.to.SubsetOf(notNullColsSet) {
	// we don't need to clean the old lax FD because when adding the corresponding strict one, the lax
	// one will be implied by that and itself is removed.
	s.AddStrictFunctionalDependency(fd.from, fd.to)
Member

It seems possible that this is an equivalence rather than a plain FD.

Member Author

Since we don't maintain lax equivalence classes, we can be sure that anything reaching this point is a functional dependency and not an equivalence class.
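
For readers following this thread, the upgrade rule under discussion can be sketched in isolation. This is a minimal illustration, not the PR's implementation: `colSet`, `fd`, and `upgradeLaxFDs` are hypothetical names standing in for `FastIntSet` and the `FDSet` internals, and the sketch skips the equivalence closure that `MakeNotNull` computes before the subset check.

```go
package main

import "fmt"

// colSet is a hypothetical stand-in for FastIntSet: a set of column IDs.
type colSet map[int]struct{}

func newColSet(cols ...int) colSet {
	s := make(colSet, len(cols))
	for _, c := range cols {
		s[c] = struct{}{}
	}
	return s
}

// subsetOf reports whether every column in s is also in other.
func (s colSet) subsetOf(other colSet) bool {
	for c := range s {
		if _, ok := other[c]; !ok {
			return false
		}
	}
	return true
}

// fd is a simplified functional dependency from -> to (hypothetical type).
type fd struct {
	from, to colSet
	strict   bool
}

// upgradeLaxFDs returns a copy of fds in which every lax FD whose determinant
// and dependency columns are all known to be not-null is marked strict.
func upgradeLaxFDs(fds []fd, notNull colSet) []fd {
	out := make([]fd, 0, len(fds))
	for _, f := range fds {
		if !f.strict && f.from.subsetOf(notNull) && f.to.subsetOf(notNull) {
			f.strict = true
		}
		out = append(out, f)
	}
	return out
}

func main() {
	fds := []fd{
		{from: newColSet(1), to: newColSet(2)}, // lax: both sides become not-null
		{from: newColSet(3), to: newColSet(4)}, // lax: column 4 may still be NULL
	}
	for _, f := range upgradeLaxFDs(fds, newColSet(1, 2, 3)) {
		fmt.Println("strict:", f.strict)
	}
}
```

Only the first dependency is upgraded, because column 4 is not in the not-null set.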

} else {
	for k, v := range rightFD.HashCodeToUniqueID {
		if _, ok := fds.HashCodeToUniqueID[k]; ok {
			panic("shouldn't be here, children has same expr while registered not only once")
Member

Please avoid panic.
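
One possible shape for a panic-free version of this merge is sketched below. It is only an assumption about how the request could be addressed, not what the PR ended up doing: `mergeHashCodes` is a hypothetical helper, and the plain `map[string]int` stands in for `HashCodeToUniqueID`.

```go
package main

import "fmt"

// mergeHashCodes copies every entry of src into dst. If any hash code is
// already present in dst it returns an error instead of panicking, and it
// leaves dst untouched in that case.
func mergeHashCodes(dst, src map[string]int) error {
	// First pass: detect conflicts before mutating dst.
	for k := range src {
		if _, ok := dst[k]; ok {
			return fmt.Errorf("expression hash code %q is registered by more than one child", k)
		}
	}
	// Second pass: no conflicts, so the copy is safe.
	for k, v := range src {
		dst[k] = v
	}
	return nil
}

func main() {
	left := map[string]int{"a+b": 1}
	if err := mergeHashCodes(left, map[string]int{"c*2": 2}); err != nil {
		fmt.Println("unexpected error:", err)
	}
	// The second merge conflicts on "a+b" and is reported as an error.
	if err := mergeHashCodes(left, map[string]int{"a+b": 3}); err != nil {
		fmt.Println("merge rejected:", err)
	}
}
```

Returning an error lets the caller decide whether to abort planning or fall back, at the cost of threading the error through the FD extraction path.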

	ok               bool
	constantUniqueID int
)
if constantUniqueID, ok = fds.IsHashCodeRegistered(hashCode); !ok {
Member

What happens in the ok case?
If the same constant appears in different operators but corresponds to different UniqueIDs, could that cause incorrect results?

Member Author

The same constant always produces the same hash code, so it maps to the same unique ID here.

This is a temporary solution until we refactor how constants are stored.
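
The behavior described in this reply can be sketched as a small registry keyed by hash code. The names here (`constRegistry`, `uniqueIDFor`) are hypothetical and the ID allocation is simplified; the PR itself works through `IsHashCodeRegistered` and `HashCodeToUniqueID` on the FD set.

```go
package main

import "fmt"

// constRegistry is a hypothetical, simplified registry that hands out one
// unique ID per distinct expression hash code.
type constRegistry struct {
	hashCodeToUniqueID map[string]int
	nextID             int
}

func newConstRegistry() *constRegistry {
	return &constRegistry{hashCodeToUniqueID: make(map[string]int)}
}

// uniqueIDFor returns the ID already registered for hashCode, or allocates
// and registers a fresh one on first sight.
func (r *constRegistry) uniqueIDFor(hashCode string) int {
	if id, ok := r.hashCodeToUniqueID[hashCode]; ok {
		return id
	}
	r.nextID++
	r.hashCodeToUniqueID[hashCode] = r.nextID
	return r.nextID
}

func main() {
	r := newConstRegistry()
	// The same constant seen in two operators resolves to a single ID.
	fmt.Println(r.uniqueIDFor("const:1") == r.uniqueIDFor("const:1"))
	// A different constant gets a different ID.
	fmt.Println(r.uniqueIDFor("const:1") == r.uniqueIDFor("const:2"))
}
```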

Comment on lines 461 to 467
result := expression.EvaluateExprWithNull(p.ctx, p.schema, x)
con, ok := result.(*expression.Constant)
if !ok || con.Value.IsNull() {
	// if x can be nullable when referred columns are null, the extended column can be nullable.
	nullable = true
}
if !nullable || determinants.SubsetOf(fds.NotNullCols) {
Member

I think it's the same as isNullRejected.

case *expression.ScalarFunction:
	scalarUniqueID, ok := fds.IsHashCodeRegistered(string(hack.String(x.HashCode(p.SCtx().GetSessionVars().StmtCtx))))
	if !ok {
		panic("selected expr must have been registered, shouldn't be here")
Member

Please avoid panic.

// 1: normal value can be multiple
// 2: null value can be multiple
// for this kind of lax to be strict, we need to make both the determinant and dependency not-null.
fds.AddLaxFunctionalDependency(keyCols, allCols, false)
Member

Could a non-unique index produce a functional dependency?

Member Author

No. Consider this table with a non-unique index on (a):

a b
1 2
1 3
2 2

The value a = 1 maps to both b = 2 and b = 3, so we cannot build either a strict or a lax functional dependency from (a).

Contributor

Makes sense; we should remove this.
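
The counterexample in this thread can be checked mechanically. The sketch below is illustrative only and makes two assumptions: a strict FD a -> b requires rows that agree on a (with NULL treated as equal to NULL) to agree on b, and a lax FD only constrains rows whose determinant is non-NULL. The types `nullableInt` and `row` and the function `fdHolds` are hypothetical.

```go
package main

import "fmt"

// nullableInt models a nullable integer column value.
type nullableInt struct {
	val  int
	null bool
}

// row is one (a, b) tuple of the example table.
type row struct{ a, b nullableInt }

func v(i int) nullableInt { return nullableInt{val: i} }

var sqlNull = nullableInt{null: true}

// fdHolds reports whether a -> b holds over rows.
// Strict: rows that agree on a (NULL equals NULL here) must agree on b.
// Lax: rows whose determinant a is NULL are ignored.
func fdHolds(rows []row, lax bool) bool {
	seen := make(map[nullableInt]nullableInt)
	for _, r := range rows {
		if lax && r.a.null {
			continue
		}
		if prev, ok := seen[r.a]; ok && prev != r.b {
			return false
		}
		seen[r.a] = r.b
	}
	return true
}

func main() {
	// The table from the thread: a non-unique index on (a).
	nonUnique := []row{{v(1), v(2)}, {v(1), v(3)}, {v(2), v(2)}}
	fmt.Println(fdHolds(nonUnique, false), fdHolds(nonUnique, true)) // false false

	// For contrast: unique non-NULL values of a, with repeated NULLs.
	uniqueWithNulls := []row{{v(1), v(2)}, {sqlNull, v(3)}, {sqlNull, v(4)}}
	fmt.Println(fdHolds(uniqueWithNulls, false), fdHolds(uniqueWithNulls, true)) // false true
}
```

The second data set shows the case where only a lax dependency survives: duplicates exist only among NULL determinants.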

Comment on lines 236 to 250
fds.NotNullCols.UnionWith(rightFD.NotNullCols)
if fds.HashCodeToUniqueID == nil {
	fds.HashCodeToUniqueID = rightFD.HashCodeToUniqueID
} else {
	for k, v := range rightFD.HashCodeToUniqueID {
		if _, ok := fds.HashCodeToUniqueID[k]; ok {
			panic("shouldn't be here, children has same expr while registered not only once")
		}
		fds.HashCodeToUniqueID[k] = v
	}
}
for i, ok := rightFD.GroupByCols.Next(0); ok; i, ok = rightFD.GroupByCols.Next(i + 1) {
	fds.GroupByCols.Insert(i)
}
fds.HasAggBuilt = fds.HasAggBuilt || rightFD.HasAggBuilt
Member

I think they should be put into MakeCartesianProduct.

Member Author

MakeCartesianProduct is also used for outer joins, so we cannot move this logic into it.

Comment on lines +220 to +222
eqCondSlice := expression.ScalarFuncs2Exprs(p.EqualConditions)
// some join eq conditions are stored in the OtherConditions.
allConds := append(eqCondSlice, p.OtherConditions...)
Member

Do we need to consider LeftConditions and RightConditions?

Member Author

We only need to consider those for outer joins.

@winoros
Member Author

winoros commented Mar 17, 2022

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: faa629c

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Mar 17, 2022
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2022
@ti-chi-bot
Member

@winoros: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Mar 17, 2022
@winoros
Member Author

winoros commented Mar 17, 2022

/merge

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: 9c1d963

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Mar 17, 2022
@winoros winoros removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 17, 2022
@winoros
Member Author

winoros commented Mar 17, 2022

@ti-chi-bot

@winoros
Member Author

winoros commented Mar 17, 2022

@ti-chi-bot

@winoros
Member Author

winoros commented Mar 17, 2022

/merge

@winoros winoros closed this Mar 17, 2022
@winoros winoros reopened this Mar 17, 2022
@ti-chi-bot
Member

@winoros: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot merged commit 9bc9572 into pingcap:master Mar 17, 2022
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.

6 participants