Gen4 Planner: AxB vs BxA #7274

systay · 2021-01-09T07:09:42Z

This PR teaches the Gen4 planner how to re-evaluate routes after a predicate has been pushed during the join tree comparisons.

Better plans

I'll use an example to show what the planner does:

SELECT user.col 
FROM user JOIN user_extra ON user.id = user_extra.col

There is a primary vindex on user.id, but for user_extra, there are no applicable vindexes.
This means first of all that we can't merge the two tables into a single route - for certain user.id values, the matching user_extra.col might well be placed in a different shard.

So, we are forced to do the join on the vtgate - we can't delegate this to mysql. Now the question is: should we first query the user table and push down the user.id to the user_extra route, or the other way around?

Given the available vindexes, we have to evaluate user_extra using a scatter query no matter if we have user.id available or not. For the user table on the other hand, if we have user_extra.col available, we could use a SelectEqualUnique which is way better than a scatter query.

Implementation comments

Because the planner now takes the same predicate (user.id = user_extra.col) and rewrites it in two different ways depending on which route is on the LHS, I needed to make it possible to clone expressions. I have hand-written the clone method for expressions, but I think it makes sense to use the visitorgen to generate this code, and do it for all AST structs and not just Expr.

I also had to move the pushing of predicates to the joinTree phase instead of waiting until we have logical plans.

Planner performance

I was afraid that all this new logic that is being done during joinTree selection would lead to much slower planning, but the benchmarks don't look too scary. 😅

BenchmarkPlanner/from_cases.txt-v3-16       	  	    	    1198	    853119 ns/op	  393968 B/op	    7319 allocs/op
BenchmarkPlanner/from_cases.txt-v4-16         	    		    1436	    796068 ns/op	  366321 B/op	    7553 allocs/op
BenchmarkPlanner/from_cases.txt-v4left2right-16         	    1545	    747779 ns/op	  342363 B/op	    6999 allocs/op
BenchmarkPlanner/filter_cases.txt-v3-16                 	    1544	    763470 ns/op	  340849 B/op	    6820 allocs/op
BenchmarkPlanner/filter_cases.txt-v4-16                 	    1749	    656536 ns/op	  291623 B/op	    5779 allocs/op
BenchmarkPlanner/filter_cases.txt-v4left2right-16       	    1764	    658813 ns/op	  288869 B/op	    5705 allocs/op
BenchmarkPlanner/large_cases.txt-v3-16                  	    8616	    126044 ns/op	   54688 B/op	    1189 allocs/op
BenchmarkPlanner/large_cases.txt-v4-16                  	    6630	    163335 ns/op	   72650 B/op	    1564 allocs/op
BenchmarkPlanner/large_cases.txt-v4left2right-16        	   13299	     88614 ns/op	   41088 B/op	    1003 allocs/op
BenchmarkPlanner/aggr_cases.txt-v3-16                   	    2864	    433544 ns/op	  148570 B/op	    4068 allocs/op
BenchmarkPlanner/aggr_cases.txt-v4-16                   	    3622	    337122 ns/op	  110093 B/op	    3157 allocs/op
BenchmarkPlanner/aggr_cases.txt-v4left2right-16         	    3675	    336505 ns/op	  110174 B/op	    3157 allocs/op
BenchmarkPlanner/select_cases.txt-v3-16                 	    1092	    967935 ns/op	  328761 B/op	    8525 allocs/op
BenchmarkPlanner/select_cases.txt-v4-16                 	    1335	    911256 ns/op	  285991 B/op	    7859 allocs/op
BenchmarkPlanner/select_cases.txt-v4left2right-16       	    1269	    900011 ns/op	  279665 B/op	    7769 allocs/op
BenchmarkPlanner/union_cases.txt-v3-16                  	    1315	    948248 ns/op	  338814 B/op	    9025 allocs/op
BenchmarkPlanner/union_cases.txt-v4-16                  	    1221	    925400 ns/op	  338622 B/op	    9025 allocs/op
BenchmarkPlanner/union_cases.txt-v4left2right-16        	    1280	    899773 ns/op	  338773 B/op	    9025 allocs/op
BenchmarkPlanner/wireup_cases.txt-v3-16                 	   19597	     58601 ns/op	   24786 B/op	     586 allocs/op
BenchmarkPlanner/wireup_cases.txt-v4-16                 	   25574	     45954 ns/op	   18512 B/op	     470 allocs/op
BenchmarkPlanner/wireup_cases.txt-v4left2right-16       	   27835	     42890 ns/op	   15732 B/op	     429 allocs/op

Signed-off-by: Andres Taylor <andres@planetscale.com>

When picking the vindex to use, we must also check that the value it is reading from can be used by vtgate for this purpose - the PlanValue must support the expression type. Signed-off-by: Andres Taylor <andres@planetscale.com>

Signed-off-by: Andres Taylor <andres@planetscale.com>

harshit-gangal · 2021-01-19T18:40:08Z

go/vt/sqlparser/ast_funcs.go

+	if node == nil {
+		return nil
+	}
+	panic(1)


nit: can be changed to "subquery clone not supported"

harshit-gangal · 2021-01-19T18:43:27Z

go/vt/sqlparser/ast_funcs.go

+	if node == nil {
+		return nil
+	}
+	panic(1)


same as above

harshit-gangal · 2021-01-19T18:43:39Z

go/vt/sqlparser/ast_funcs.go

+	if node == nil {
+		return nil
+	}
+	panic(1)


same as above

harshit-gangal · 2021-01-19T18:43:50Z

go/vt/sqlparser/ast_funcs.go

+	if node == nil {
+		return nil
+	}
+	panic(1)


same as above

harshit-gangal · 2021-01-19T18:43:56Z

go/vt/sqlparser/ast_funcs.go

+	if node == nil {
+		return nil
+	}
+	panic(1)


same as above

harshit-gangal · 2021-01-19T18:44:02Z

go/vt/sqlparser/ast_funcs.go

+		return nil
+	}
+
+	panic(1)


same as above

harshit-gangal · 2021-01-19T18:52:02Z

go/vt/vtgate/planbuilder/testdata/filter_cases.txt

+      "Sharded": true
+    },
+    "FieldQuery": "select id from music where 1 != 1",
+    "Query": "select id from music",


Either we add the query to the plan with where clause that shows it will select 0 results or we do not add query to the plan to avoid confusion.

harshit-gangal · 2021-01-19T18:53:54Z

go/vt/vtgate/planbuilder/testdata/filter_cases.txt

+  "QueryType": "SELECT",
+  "Original": "select unsharded.id from user join unsharded where unsharded.id = user.id",
+  "Instructions": {
+    "OperatorType": "Join",
+    "Variant": "Join",
+    "JoinColumnIndexes": "-2",
+    "TableName": "unsharded_user",
+    "Inputs": [
+      {
+        "OperatorType": "Route",
+        "Variant": "SelectUnsharded",
+        "Keyspace": {
+          "Name": "main",
+          "Sharded": false
+        },
+        "FieldQuery": "select unsharded.id, unsharded.id from unsharded where 1 != 1",
+        "Query": "select unsharded.id, unsharded.id from unsharded",
+        "Table": "unsharded"
+      },
+      {
+        "OperatorType": "Route",
+        "Variant": "SelectEqualUnique",
+        "Keyspace": {
+          "Name": "user",
+          "Sharded": true
+        },
+        "FieldQuery": "select 1 from user where 1 != 1",
+        "Query": "select 1 from user where user.id = :unsharded_id",
+        "Table": "user",
+        "Values": [
+          ":unsharded_id"
+        ],
+        "Vindex": "user_index"
+      }


for this query the better plan would be select scatter followed by query on selectunsharded.

maybe. would you mind if we look deeper into this later, when a larger percentage of the gen4 planner is done, and we can get a better overview of the options available?

harshit-gangal

code looks good to be. I think there is need to document the logic for the important methods involved in the planning.

harshit-gangal · 2021-01-22T18:13:20Z

go/vt/vtgate/planbuilder/route_planning_test.go

@@ -104,3 +104,23 @@ func TestMergeJoins(t *testing.T) {
 		})
 	}
 }
+


I checked this file and I see the default planner is greedy. Is it intentional to miss the V4Left2Right planner test?

should probably add to it. it was not a conscious decision, no

go/vt/vtgate/planbuilder/route_planning.go

harshit-gangal · 2021-01-22T19:12:00Z

go/vt/vtgate/planbuilder/route_planning.go

+		engine.SelectEqualUnique,
+		engine.SelectNext,
+		engine.SelectNone,
+		engine.SelectReference,
+		engine.SelectUnsharded:
+		return 1
+	case engine.SelectEqual:
+		return 5
+	case engine.SelectIN:
+		return 10
+	case engine.SelectMultiEqual:
+		return 10
+	case engine.SelectScatter:
+		return 20


add a todo to revisit these cost.

harshit-gangal · 2021-01-22T19:27:40Z

go/vt/vtgate/planbuilder/route_planning.go

+	if newVindexFound {
+		rp.pickBestAvailableVindex()
+	}
+
+	// any predicates that cover more than a single table need to be added here
+	rp.predicates = append(rp.predicates, predicates...)
+
+	return nil
+}
+
+// pickBestAvailableVindex goes over the available vindexes for this route and picks the best one available.
+func (rp *routePlan) pickBestAvailableVindex() {
 	//TODO (Manan,Andres): Improve cost metric for vindexes
-	for _, v := range vindexPreds {
+	for _, v := range rp.vindexPreds {


I think we do not need to loop over on rp.vindexPreds.
On a given routePlan there will only we one vindex used so we can just check the cost of the best selected vindex with the current matched vindex.

the idea is that when we add a table to a routePlan, we also add all possible vindexes. later, when predicates are added, new vindexes can become available (in other words, covered as the struct field).

I tried removing vindexes that had already lost and did not need rechecking, but that didn't make any diff in the benchmarks, so I didn't keep that complexity in here

harshit-gangal · 2021-01-22T19:29:21Z

go/vt/vtgate/planbuilder/route_planning.go

+}
+
+// pickBestAvailableVindex goes over the available vindexes for this route and picks the best one available.
+func (rp *routePlan) pickBestAvailableVindex() {
 	//TODO (Manan,Andres): Improve cost metric for vindexes


move this TODO to the place where cost metric is present :)

Signed-off-by: Andres Taylor <andres@planetscale.com>

Before optimisation: BenchmarkPlanner/large_cases.txt-v4-16 70786 167749 ns/op 74315 B/op 1618 allocs/op BenchmarkPlanner/large_cases.txt-v4-16 70998 168734 ns/op 74303 B/op 1618 allocs/op After: BenchmarkPlanner/large_cases.txt-v4-16 76375 156411 ns/op 67752 B/op 1376 allocs/op BenchmarkPlanner/large_cases.txt-v4-16 76207 157553 ns/op 67816 B/op 1376 allocs/op Signed-off-by: Andres Taylor <andres@planetscale.com>

Signed-off-by: Andres Taylor <andres@planetscale.com>

systay added 6 commits January 9, 2021 08:08

moved out jointree transformations into its own file

bdfb4b9

Signed-off-by: Andres Taylor <andres@planetscale.com>

Merge remote-tracking branch 'upstream/master' into v4

67cadbe

Signed-off-by: Andres Taylor <andres@planetscale.com>

start of making it possible to push predicates on the joinTree

6b44402

Signed-off-by: Andres Taylor <andres@planetscale.com>

Added Clone() method to sqlparser.Expr

9fedb90

Signed-off-by: Andres Taylor <andres@planetscale.com>

clean up predicate adding to joinTrees

22973c1

Signed-off-by: Andres Taylor <andres@planetscale.com>

updated test expectations

fca46a7

Signed-off-by: Andres Taylor <andres@planetscale.com>

systay mentioned this pull request Jan 11, 2021

Gen4 Tracking #7280

Closed

13 tasks

Merge branch 'horizon-planning' into v4

0ee68a0

Signed-off-by: Andres Taylor <andres@planetscale.com>

systay force-pushed the v4 branch from ec4dee6 to 0ee68a0 Compare January 13, 2021 06:15

last touch ups of AxB vs BxA

2c86695

Signed-off-by: Andres Taylor <andres@planetscale.com>

systay changed the title V4 Gen4 Planner: AxB vs BxA Jan 16, 2021

clean up predicates

f38b2af

Signed-off-by: Andres Taylor <andres@planetscale.com>

systay marked this pull request as ready for review January 16, 2021 13:57

systay requested a review from harshit-gangal as a code owner January 16, 2021 13:57

systay added 2 commits January 16, 2021 17:01

Only use vindex for valid values

a46dbdb

When picking the vindex to use, we must also check that the value it is reading from can be used by vtgate for this purpose - the PlanValue must support the expression type. Signed-off-by: Andres Taylor <andres@planetscale.com>

Merge remote-tracking branch 'upstream/master' into v4

87ee3ff

Signed-off-by: Andres Taylor <andres@planetscale.com>

systay force-pushed the v4 branch from 240904c to 87ee3ff Compare January 18, 2021 08:42

harshit-gangal reviewed Jan 19, 2021

View reviewed changes

harshit-gangal reviewed Jan 22, 2021

View reviewed changes

systay added 5 commits January 25, 2021 13:39

nicer panic messages

19c3803

Signed-off-by: Andres Taylor <andres@planetscale.com>

tweaked planner costs and added TODO

dfc53d5

Signed-off-by: Andres Taylor <andres@planetscale.com>

add predicates no matter the opcode

b062f53

Signed-off-by: Andres Taylor <andres@planetscale.com>

optimise bitmask operation

7fa31be

Signed-off-by: Andres Taylor <andres@planetscale.com>

harshit-gangal approved these changes Jan 28, 2021

View reviewed changes

harshit-gangal merged commit 51f1c14 into vitessio:master Jan 28, 2021

harshit-gangal deleted the v4 branch January 28, 2021 11:30

askdba added the P3 label Feb 8, 2021

askdba added this to the v10.0 milestone Feb 8, 2021

ajm188 mentioned this pull request Jul 13, 2021

slack vitess v10.pre tinyspeck/vitess#228

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Gen4 Planner: AxB vs BxA #7274

Gen4 Planner: AxB vs BxA #7274

systay commented Jan 9, 2021 •

edited

Loading

harshit-gangal Jan 19, 2021

harshit-gangal Jan 19, 2021

harshit-gangal Jan 19, 2021

harshit-gangal Jan 19, 2021

harshit-gangal Jan 19, 2021

harshit-gangal Jan 19, 2021

harshit-gangal Jan 19, 2021

harshit-gangal Jan 19, 2021

systay Jan 25, 2021

harshit-gangal left a comment

harshit-gangal Jan 22, 2021

systay Jan 25, 2021

harshit-gangal Jan 22, 2021

harshit-gangal Jan 22, 2021

systay Jan 26, 2021

harshit-gangal Jan 22, 2021

@@ @@ -104,3 +104,23 @@ func TestMergeJoins(t *testing.T) { @@
               		})
               	}
               }

Gen4 Planner: AxB vs BxA #7274

Gen4 Planner: AxB vs BxA #7274

Conversation

systay commented Jan 9, 2021 • edited Loading

Better plans

Implementation comments

Planner performance

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

harshit-gangal left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

systay commented Jan 9, 2021 •

edited

Loading