planner: unify OR type IndexMerge code paths #58396

Merged

Conversation

time-and-fate
Member

@time-and-fate time-and-fate commented Dec 18, 2024

What problem does this PR solve?

Issue Number: ref #58361

What changed and how does it work?

generateORIndexMerge() in indexmerge_unfinished_path.go is now the single entry point for generating all OR type IndexMerge paths. All previous code paths are merged into it.

Entry points

  • The previous entry in generateIndexMergeOrPaths() (code path 1 in the issue) is modified, moved to indexmerge_unfinished_path.go, and becomes the new generateORIndexMerge().
  • The previous entries in generateIndexMergeOnDNF4MVIndex() and generateIndexMerge4ComposedIndex() (code paths 2 and 3) are deleted and replaced by the new entry. Their implementations were almost the same as the new generateORIndexMerge().
  • Related function names and comments are updated to reflect this change. See the changes in generateIndexMergePath() for a simple overview.

Some details in unifying the code paths

  • As the old generateIndexMergeOrPaths() becomes the new generateORIndexMerge(), some of its logic is removed or relocated:
    • The CanExprsPushDown() check is now covered partly by the existing equivalent check in generateNormalIndexPartialPath() and partly by a newly added check in initUnfinishedPathsFromExpr().
    • The "don't generate the IndexMerge path if all its partial paths use the same non-MV index" check is moved to buildIntoAccessPath().
    • The calculation of AccessPath.CountAfterAccess is replaced by estimateCountAfterAccessForIndexMergeOR(), which was introduced in the previous PR.
  • The checks on AccessPath.TableFilters and AccessPath.IndexFilters and the MaybeOverOptimized4PlanCache() check in matchPropForIndexMergeAlternatives() and generateNormalIndexPartialPath() were almost the same. They are unified and moved to buildIntoAccessPath().
  • For accessPathsForConds():
    • Previously, code path 1 called it with a slice as the input candidatePaths. That usage is deleted, so accessPathsForConds() is simplified to receive one *util.AccessPath and return one *util.AccessPath.
    • The pruning logic for empty/point ranges in it is moved to cmpAlternatives().
  • The needConsiderIndexMerge logic in generateIndexMerge4NormalIndex() (now renamed generateOtherIndexMerge()) is modified.
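
As a rough illustration of the "all partial paths use the same non-MV index" check mentioned in the list above, here is a minimal self-contained sketch. The types indexInfo and partialPath and the function allUseSameNonMVIndex are hypothetical stand-ins invented for this example; they are not the planner's real types, and the real check in buildIntoAccessPath() may treat table paths and other cases differently.

```go
package main

import "fmt"

// indexInfo and partialPath are hypothetical stand-ins for illustration only;
// they are NOT the real TiDB planner types.
type indexInfo struct {
	ID      int64
	MVIndex bool // true for a multi-valued index
}

type partialPath struct {
	Index *indexInfo // assumption: nil means the partial path scans the table directly
}

// allUseSameNonMVIndex reports whether every partial path uses one and the
// same index and that index is not multi-valued. In that situation an OR type
// IndexMerge brings no benefit, so the caller would skip generating it.
func allUseSameNonMVIndex(paths []partialPath) bool {
	if len(paths) == 0 {
		return false
	}
	first := paths[0].Index
	if first == nil || first.MVIndex {
		return false
	}
	for _, p := range paths[1:] {
		if p.Index == nil || p.Index.ID != first.ID {
			return false
		}
	}
	return true
}

func main() {
	normal := &indexInfo{ID: 1}
	other := &indexInfo{ID: 2}
	mv := &indexInfo{ID: 3, MVIndex: true}

	fmt.Println(allUseSameNonMVIndex([]partialPath{{Index: normal}, {Index: normal}}))
	fmt.Println(allUseSameNonMVIndex([]partialPath{{Index: normal}, {Index: other}}))
	fmt.Println(allUseSameNonMVIndex([]partialPath{{Index: mv}, {Index: mv}}))
}
```

In this sketch, two partial paths on the same multi-valued index are still allowed to form an IndexMerge, which matches the "non-MV" wording of the check.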

Utils

  • In pkg/planner/util/misc.go, a util function SliceRecursiveFlattenIter() is added to iterate over multi-dimensional slices more elegantly. Without it, this PR would contain some 5-level nested code blocks.
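
To show the kind of nesting such a helper avoids, below is a minimal sketch of a recursive flattening visitor. It is not the actual SliceRecursiveFlattenIter() implementation, whose signature is not shown on this page; flattenEach and walk are hypothetical names, and the sketch uses a plain callback with reflection rather than an iterator type.

```go
package main

import (
	"fmt"
	"reflect"
)

// flattenEach is a hypothetical helper (not the function added in this PR).
// It visits every element of type E inside a slice of any nesting depth, in
// order, passing a running index. Returning false from visit stops the walk.
func flattenEach[E any](s any, visit func(idx int, elem E) bool) {
	idx := 0
	walk[E](reflect.ValueOf(s), &idx, visit)
}

// walk recurses into nested slices; it returns false once visit asked to stop.
func walk[E any](v reflect.Value, idx *int, visit func(int, E) bool) bool {
	if !v.IsValid() {
		return true // nothing to visit, e.g. a nil input
	}
	if e, ok := v.Interface().(E); ok {
		cont := visit(*idx, e)
		*idx++
		return cont
	}
	if v.Kind() != reflect.Slice {
		return true // neither a leaf of type E nor a slice: skip it
	}
	for i := 0; i < v.Len(); i++ {
		if !walk[E](v.Index(i), idx, visit) {
			return false
		}
	}
	return true
}

func main() {
	nested := [][][]int{{{1, 2}, {3}}, {}, {{4}}}
	// One flat loop body instead of three nested for-range loops.
	flattenEach[int](nested, func(i int, x int) bool {
		fmt.Println(i, x)
		return true
	})
}
```

The design trade-off is that reflection gives up some type safety and speed in exchange for handling any nesting depth with one call site.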

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added the release-note-none, size/XXL, and sig/planner labels Dec 18, 2024
codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 92.10526% with 15 lines in your changes missing coverage. Please review.

Project coverage is 73.5860%. Comparing base (2a72e7f) to head (3bed336).
Report is 13 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #58396        +/-   ##
================================================
+ Coverage   73.5209%   73.5860%   +0.0650%     
================================================
  Files          1681       1680         -1     
  Lines        464398     466965      +2567     
================================================
+ Hits         341430     343621      +2191     
- Misses       102138     102455       +317     
- Partials      20830      20889        +59     
Flag Coverage Δ
integration 43.0607% <92.1052%> (?)
unit 72.3003% <92.1052%> (+0.0324%) ⬆️

Components Coverage Δ
dumpling 52.6910% <ø> (ø)
parser ∅ <ø> (∅)
br 45.7894% <ø> (+0.0029%) ⬆️

@time-and-fate time-and-fate changed the title planner: [WIP] planner: unify OR type IndexMerge code paths Dec 25, 2024
@time-and-fate
Member Author

/retest

Comment on lines -747 to -749
if !needConsiderIndexMerge {
	return "IndexMerge is inapplicable or disabled. ", nil // IndexMerge is inapplicable
}
Member Author

Since this limitation only applies to non-MV indexes, and now the MV index path is also generated here, as said in the design doc, we modify this limitation a little and move it to the end of this function.

Member Author

@time-and-fate time-and-fate Dec 25, 2024

It causes some extra unnecessary work now. This is also why there are new records in pkg/planner/cardinality/testdata/cardinality_suite_out.json and tests/integrationtest/r/imdbload.result.

continue
}
// in this loop we do two things.
// 1: If all the partialPaths use the same index, we will not use the indexMerge.
Member Author

Moved to buildIntoAccessPath() in indexmerge_unfinished_path.go.

}
// in this loop we do two things.
// 1: If all the partialPaths use the same index, we will not use the indexMerge.
// 2: Compute a theoretical best countAfterAccess(pick its accessConds) for every alternative path(s).
Member Author

Already moved into indexmerge_unfinished_path.go as estimateCountAfterAccessForIndexMergeOR() in the previous PR.

Comment on lines -166 to -181
// identify whether all pushedDownCNFItems are fully used.
// If any partial path contains table filters, we need to keep the whole DNF filter in the Selection.
if len(partialPath.TableFilters) > 0 {
	needSelection = true
	partialPath.TableFilters = nil
}
// If any partial path's index filter cannot be pushed to TiKV, we should keep the whole DNF filter.
if len(partialPath.IndexFilters) != 0 && !expression.CanExprsPushDown(pushDownCtx, partialPath.IndexFilters, kv.TiKV) {
	needSelection = true
	// Clear IndexFilter, the whole filter will be put in indexMergePath.TableFilters.
	partialPath.IndexFilters = nil
}
// Keep this filter as a part of table filters for safety if it has any parameter.
if expression.MaybeOverOptimized4PlanCache(ds.SCtx().GetExprCtx(), cnfItems) {
	needSelection = true
}
Member Author

Moved to buildIntoAccessPath() in indexmerge_unfinished_path.go.
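
For readers without the planner packages at hand, the three checks in the removed snippet above can be sketched in a self-contained form. Everything here is hypothetical: the filter and partialPath types, the Pushable and HasParam fields (standing in for the CanExprsPushDown() and MaybeOverOptimized4PlanCache() results), and the needSelection function are invented for illustration and do not reflect the real code in buildIntoAccessPath().

```go
package main

import "fmt"

// filter and partialPath are hypothetical stand-ins, not real planner types.
type filter struct {
	Expr     string
	Pushable bool // assumption: stands in for the CanExprsPushDown() result
	HasParam bool // assumption: stands in for the MaybeOverOptimized4PlanCache() result
}

type partialPath struct {
	TableFilters []filter
	IndexFilters []filter
}

func allPushable(fs []filter) bool {
	for _, f := range fs {
		if !f.Pushable {
			return false
		}
	}
	return true
}

// needSelection decides whether the whole DNF filter must be kept in a
// Selection above the IndexMerge. It clears partial-path filters that the
// kept Selection will cover, following the removed snippet.
func needSelection(paths []*partialPath, cnfItems []filter) bool {
	need := false
	for _, p := range paths {
		// Any table filter on a partial path forces the Selection.
		if len(p.TableFilters) > 0 {
			need = true
			p.TableFilters = nil
		}
		// Index filters that cannot be pushed down force it as well.
		if len(p.IndexFilters) > 0 && !allPushable(p.IndexFilters) {
			need = true
			p.IndexFilters = nil
		}
	}
	// Keep the filter for safety if it contains any plan-cache parameter.
	for _, f := range cnfItems {
		if f.HasParam {
			need = true
		}
	}
	return need
}

func main() {
	p := &partialPath{IndexFilters: []filter{{Expr: "f(a)", Pushable: false}}}
	fmt.Println(needSelection([]*partialPath{p}, nil), len(p.IndexFilters))
}
```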

Comment on lines -223 to -227
if expression.CanExprsPushDown(pushDownCtx, []expression.Expression{cnfItem}, kv.TiKV) {
	pushedDownCNFItems = append(pushedDownCNFItems, cnfItem)
} else {
	shouldKeepCurrentFilter = true
}
Member Author

In initUnfinishedPathsFromExpr():

  • For "case 1": It's already in generateNormalIndexPartialPath() so we don't need to add it again.
  • For "case 2" and "case 3": I added a similar check. Though I'm not sure if it's really needed, I added it anyway.

Comment on lines -353 to +387
- needSelection = len(remainingFilters) > 0 || len(unfinishedPath.idxColHasUsableFilter) > 0
+ needSelection = len(remainingFilters) > 0
Member Author

Moved to the check at L211. If you look at it carefully, they are essentially the same.

Member

@Rustin170506 Rustin170506 left a comment

Thanks! :shipit:

- // generateIndexMerge4ComposedIndex generates index path composed of multi indexes including multivalued index from
- // (json_member_of / json_overlaps / json_contains) and single-valued index from normal indexes.
+ // generateANDIndexMerge4ComposedIndex tries to generate AND type index merge AccessPath for (
+ //json_member_of / json_overlaps / json_contains) on multiple multi-valued or normal indexes.
Member

Is this formatted by the go fmt? I thought it would always keep a space here.

Member Author

@time-and-fate time-and-fate Dec 26, 2024

Updated.
This is formatted by GoLand's "wrap on typing", which is not very clever sometimes. It won't add a space in such places.
Actually, go fmt also won't add this space.

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm and approved labels Dec 26, 2024
@Rustin170506
Member

What changed and how does it work?

Maybe you fill it in as well for future archaeology.

@time-and-fate
Member Author

> What changed and how does it work?
>
> Maybe you fill it in as well for future archaeology.

Updated.

Contributor

@AilinKid AilinKid left a comment

Rest LGTM

	results = append(results, newPath)
} else {
	results[0] = newPath
	results = results[:1]
Contributor

any notes for removing this?

Member Author

Good catch.
My original idea is that the row count estimation logic should reflect the advantage of the empty range.
Anyway, I added this logic to the new implementation in the latest commit.
Probably it's better to keep things unchanged as much as possible in such a refactor.

if finishedIndexMergePath != nil {
	mvIndexPaths = append(mvIndexPaths, finishedIndexMergePath)
}
if !containMVPath {
Contributor

seems tricky, but fine for now

tiprow bot commented Dec 26, 2024

@time-and-fate: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
fast_test_tiprow 3bed336 link true /test fast_test_tiprow

ti-chi-bot bot commented Dec 27, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AilinKid, Rustin170506

@ti-chi-bot ti-chi-bot bot added the lgtm label and removed the needs-1-more-lgtm label Dec 27, 2024
ti-chi-bot bot commented Dec 27, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-12-26 07:00:15 UTC: ☑️ agreed by Rustin170506.
  • 2024-12-27 05:55:07 UTC: ☑️ agreed by AilinKid.

@@ -16,6 +16,7 @@ package core

import (
	"cmp"
	"github.com/pingcap/tidb/pkg/sessionctx/variable"
Contributor

group imports

@ti-chi-bot ti-chi-bot bot merged commit e44c60c into pingcap:master Dec 27, 2024
18 of 24 checks passed
Labels
approved, lgtm, release-note-none, sig/planner, size/XXL

3 participants