planner: disable tidb_prefer_broadcast_join_by_exchange_data_size by default; set scale factor to optimize estimating broadcast join; #42915

Merged on Apr 12, 2023 (17 commits)
13 changes: 9 additions & 4 deletions planner/core/exhaust_physical_plans.go
Expand Up @@ -2129,23 +2129,28 @@ func calcHashExchangeSizeByChild(p1 Plan, p2 Plan, mppStoreCnt int) (float64, fl
return row1 + row2, 0, false
}

+ // The size of `Build` hash table when using broadcast join is `X`.
+ // The size of `Build` hash table when using shuffle join is `X / (mppStoreCnt)`.
+ // It will cost more time to search `Probe` data in hash table.
+ // Set a scale factor (`mppStoreCnt^*`) when estimating broadcast join in `isJoinFitMPPBCJ` and `isJoinChildFitMPPBCJ` (based on TPCH benchmark, it has been verified in Q9).

func isJoinFitMPPBCJ(p *LogicalJoin, mppStoreCnt int) bool {
rowBC, szBC, hasSizeBC := calcBroadcastExchangeSizeByChild(p.children[0], p.children[1], mppStoreCnt)
rowHash, szHash, hasSizeHash := calcHashExchangeSizeByChild(p.children[0], p.children[1], mppStoreCnt)
if hasSizeBC && hasSizeHash {
- return szBC <= szHash
+ return szBC*float64(mppStoreCnt) <= szHash
}
- return rowBC <= rowHash
+ return rowBC*float64(mppStoreCnt) <= rowHash
}

func isJoinChildFitMPPBCJ(p *LogicalJoin, childIndexToBC int, mppStoreCnt int) bool {
rowBC, szBC, hasSizeBC := calcBroadcastExchangeSize(p.children[childIndexToBC], mppStoreCnt)
rowHash, szHash, hasSizeHash := calcHashExchangeSizeByChild(p.children[0], p.children[1], mppStoreCnt)

if hasSizeBC && hasSizeHash {
- return szBC <= szHash
+ return szBC*float64(mppStoreCnt) <= szHash
}
- return rowBC <= rowHash
+ return rowBC*float64(mppStoreCnt) <= rowHash
}

// If we can use mpp broadcast join, that's our first choice.
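
Note on the scale factor: the change above multiplies the broadcast-side estimate by `mppStoreCnt` before comparing it with the hash (shuffle) estimate, so broadcast join is chosen less readily as the number of MPP stores grows. The standalone sketch below only illustrates that comparison. The helper names `fitsBroadcastUnscaled` and `fitsBroadcastScaled` and all numbers are hypothetical and are not part of the TiDB code base; the real functions derive their sizes from plan statistics.

```go
package main

import "fmt"

// fitsBroadcastUnscaled mirrors the comparison before this change:
// broadcast join is preferred whenever its estimated exchange size
// does not exceed the hash (shuffle) exchange size.
// Hypothetical helper, for illustration only.
func fitsBroadcastUnscaled(szBC, szHash float64) bool {
	return szBC <= szHash
}

// fitsBroadcastScaled mirrors the comparison after this change:
// the broadcast estimate is multiplied by the store count first,
// which penalizes the larger per-node build hash table.
// Hypothetical helper, for illustration only.
func fitsBroadcastScaled(szBC, szHash float64, mppStoreCnt int) bool {
	return szBC*float64(mppStoreCnt) <= szHash
}

func main() {
	const storeCnt = 3 // assumed number of MPP stores

	// Case 1: broadcast looks cheaper unscaled, but not once scaled (100*3 > 150).
	fmt.Println(fitsBroadcastUnscaled(100, 150), fitsBroadcastScaled(100, 150, storeCnt))

	// Case 2: broadcast is still preferred after scaling (40*3 <= 150).
	fmt.Println(fitsBroadcastUnscaled(40, 150), fitsBroadcastScaled(40, 150, storeCnt))
}
```

With these assumed inputs, the first case flips from broadcast to shuffle once the factor is applied, while the second case stays on broadcast.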
2 changes: 1 addition & 1 deletion sessionctx/variable/tidb_vars.go
Expand Up @@ -1045,7 +1045,7 @@ const (
DefTiDBProjectionConcurrency = ConcurrencyUnset
DefBroadcastJoinThresholdSize = 100 * 1024 * 1024
DefBroadcastJoinThresholdCount = 10 * 1024
- DefPreferBCJByExchangeDataSize = true
+ DefPreferBCJByExchangeDataSize = false
DefTiDBOptimizerSelectivityLevel = 0
DefTiDBOptimizerEnableNewOFGB = false
DefTiDBEnableOuterJoinReorder = true