optimizer: add docs for shared cte and explain enhancement #14634

Merged
merged 6 commits into from
Aug 3, 2023
52 changes: 52 additions & 0 deletions explain-walkthrough.md
@@ -210,3 +210,55 @@ EXPLAIN ANALYZE SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01
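> **Note:**
>
> Another optimization available for the preceding example is the [coprocessor cache](/coprocessor-cache.md). If you cannot add an index, consider enabling the coprocessor cache. When it is enabled, TiKV returns the value from the cache as long as the Region has not been modified since the operator was last executed. This also helps to reduce most of the computation cost of the `TableFullScan` and `Selection` operators.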

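## Disable the early execution of subqueries

During query optimization, TiDB executes in advance any subquery that can be computed directly at the optimization stage. For example: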

```sql
CREATE TABLE t1(a int);
INSERT INTO t1 VALUES(1);
CREATE TABLE t2(a int);
EXPLAIN SELECT * FROM t2 WHERE a = (SELECT a FROM t1);
```

```sql
+--------------------------+----------+-----------+---------------+--------------------------------+
| id | estRows | task | access object | operator info |
+--------------------------+----------+-----------+---------------+--------------------------------+
| TableReader_14 | 10.00 | root | | data:Selection_13 |
| └─Selection_13 | 10.00 | cop[tikv] | | eq(test.t2.a, 1) |
| └─TableFullScan_12 | 10000.00 | cop[tikv] | table:t2 | keep order:false, stats:pseudo |
+--------------------------+----------+-----------+---------------+--------------------------------+
3 rows in set (0.00 sec)
```

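In the preceding example, the subquery in `a = (SELECT a FROM t1)` is evaluated at the optimization stage, and the expression is rewritten to `t2.a=1`. Evaluating the subquery early allows more constant propagation and constant folding during optimization, but it increases the execution time of the `EXPLAIN` statement. When the subquery itself takes a long time to run, the `EXPLAIN` statement might not complete, which can hinder the troubleshooting of online issues.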
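
As a rough illustration of that rewrite, the following statement spells out the folded predicate by hand. It is only a sketch that assumes `t1` still contains the single row `1` inserted earlier; under that assumption, its plan would be expected to show the same `eq(test.t2.a, 1)` filter as the plan above.

```sql
-- Hand-written form of the predicate after the subquery has been folded.
-- Assumption: t1 holds exactly one row with a = 1, as in the setup above.
EXPLAIN SELECT * FROM t2 WHERE a = 1;
```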

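Starting from v7.3.0, TiDB introduces the [`tidb_opt_enable_non_eval_scalar_subquery`](/system-variables.md#tidb_opt_enable_non_eval_scalar_subquery-从-v730-版本开始引入) system variable, which controls whether the `EXPLAIN` statement skips the early evaluation of such subqueries. The default value is `OFF`, which means that subqueries are evaluated in advance. To disable the early execution of subqueries, set this variable to `ON`: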

```sql
SET @@tidb_opt_enable_non_eval_scalar_subquery = ON;
EXPLAIN SELECT * FROM t2 WHERE a = (SELECT a FROM t1);
```

```sql
+---------------------------+----------+-----------+---------------+---------------------------------+
| id | estRows | task | access object | operator info |
+---------------------------+----------+-----------+---------------+---------------------------------+
| Selection_13 | 8000.00 | root | | eq(test.t2.a, ScalarQueryCol#5) |
| └─TableReader_15 | 10000.00 | root | | data:TableFullScan_14 |
| └─TableFullScan_14 | 10000.00 | cop[tikv] | table:t2 | keep order:false, stats:pseudo |
| ScalarSubQuery_10 | N/A | root | | Output: ScalarQueryCol#5 |
| └─MaxOneRow_6 | 1.00 | root | | |
| └─TableReader_9 | 1.00 | root | | data:TableFullScan_8 |
| └─TableFullScan_8 | 1.00 | cop[tikv] | table:t1 | keep order:false, stats:pseudo |
+---------------------------+----------+-----------+---------------+---------------------------------+
7 rows in set (0.00 sec)
```

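As you can see, the scalar subquery is no longer evaluated and expanded in advance. It appears in the plan as a separate `ScalarSubQuery_10` operator tree, which makes it easier to understand how this kind of SQL statement is actually executed.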

> **Note:**
>
> [`tidb_opt_enable_non_eval_scalar_subquery`](/system-variables.md#tidb_opt_enable_non_eval_scalar_subquery-从-v730-版本开始引入) currently only affects the behavior of the `EXPLAIN` statement. The `EXPLAIN ANALYZE` statement still evaluates and expands subqueries in advance.
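
The difference between the two statements can be checked in a single session. The following is a minimal sketch that reuses the `t1` and `t2` tables from the example above; the exact plans depend on your data and statistics, so no output is shown.

```sql
-- Check the current session value of the variable.
SELECT @@tidb_opt_enable_non_eval_scalar_subquery;

-- With the variable ON, EXPLAIN keeps the subquery as its own operator tree.
SET @@tidb_opt_enable_non_eval_scalar_subquery = ON;
EXPLAIN SELECT * FROM t2 WHERE a = (SELECT a FROM t1);

-- EXPLAIN ANALYZE is not affected by the variable: the subquery is still
-- evaluated in advance, and the statement actually runs the query.
EXPLAIN ANALYZE SELECT * FROM t2 WHERE a = (SELECT a FROM t1);

-- Restore the default behavior.
SET @@tidb_opt_enable_non_eval_scalar_subquery = OFF;
```
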
12 changes: 10 additions & 2 deletions system-variables.md
@@ -2998,6 +2998,14 @@ mysql> desc select count(distinct a) from test.t;
- Default value: `ON`
- This variable controls whether the optimizer enables cross estimation.

### `tidb_opt_enable_non_eval_scalar_subquery` <span class="version-mark">New in v7.3.0</span>

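- Scope: SESSION | GLOBAL
- Persists to cluster: Yes
- Type: Boolean
- Default value: `OFF`
- This variable controls whether the `EXPLAIN` statement disables the early execution of constant subqueries that can be expanded at the optimization stage. When this variable is set to `OFF`, the `EXPLAIN` statement expands subqueries in advance at the optimization stage. When this variable is set to `ON`, the `EXPLAIN` statement does not expand subqueries at the optimization stage. For more information, see [Disable the early execution of subqueries](/explain-walkthrough.md#禁止子查询提前执行).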
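
Because the variable has both SESSION and GLOBAL scope, it can be set at either level. The statements below are a minimal sketch using standard `SET` syntax; as with other global system variables, the assumption here is that a global change applies to sessions created afterwards rather than to the current one.

```sql
-- Change the value for the current session only.
SET SESSION tidb_opt_enable_non_eval_scalar_subquery = ON;

-- Change the cluster-wide value.
SET GLOBAL tidb_opt_enable_non_eval_scalar_subquery = ON;

-- Compare the two scopes.
SELECT @@SESSION.tidb_opt_enable_non_eval_scalar_subquery,
       @@GLOBAL.tidb_opt_enable_non_eval_scalar_subquery;
```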

### `tidb_opt_enable_late_materialization` <span class="version-mark">New in v7.0.0</span>

- Scope: SESSION | GLOBAL
@@ -3011,13 +3019,13 @@ mysql> desc select count(distinct a) from test.t;

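> **Warning:**
>
> In the current version, the feature controlled by this variable does not fully take effect yet. Keep the default value.
> The feature controlled by this variable is experimental. It is not recommended that you use it in production environments. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub.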

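- Scope: SESSION | GLOBAL
- Persists to cluster: Yes
- Type: Boolean
- Default value: `OFF`
- This variable controls whether non-recursive [Common Table Expressions (CTEs)](/sql-statements/sql-statement-with.md) can be executed directly on TiFlash MPP rather than on TiDB.
- This variable controls whether non-recursive [Common Table Expressions (CTEs)](/sql-statements/sql-statement-with.md) can be executed on TiFlash MPP. By default, when this variable is not enabled, CTEs are executed on TiDB, which performs considerably worse than with this feature enabled.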

### `tidb_opt_fix_control` <span class="version-mark">New in v7.1.0</span>
