Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inaccurate display of estimated row count in index join innner side #44802

Closed
pcqz opened this issue Jun 19, 2023 · 3 comments
Closed

Inaccurate display of estimated row count in index join innner side #44802

pcqz opened this issue Jun 19, 2023 · 3 comments
Labels
may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 may-affects-7.1 severity/major sig/planner SIG: Planner type/bug The issue is confirmed as a bug.

Comments

@pcqz
Copy link

pcqz commented Jun 19, 2023

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Prepare 100 thousand rows into the tables sbtest1 and sbtest2 using sysbench
  2. Execute the following statements:
mysql> update sbtest1 set k=id;
mysql> update sbtest2 set k=id;
mysql> analyze table sbtest1;
mysql> analyze table sbtest2;
mysql> explain analyze select /*+ inl_join(t1) */ count(*) from sbtest1 t1 join sbtest2 t2 where t1.k=t2.k and t2.k < 100;
+-------------------------------+-----------+---------+-----------+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
| id                            | estRows   | actRows | task      | access object          | execution info                                                                                                                                                                                                                                                                                                                                                 | operator info                                                                                                                                   | memory    | disk |
+-------------------------------+-----------+---------+-----------+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
| HashAgg_9                     | 1.00      | 1       | root      |                        | time:6.56ms, loops:2, RU:2.082186, partial_worker:{wall_time:6.501073ms, concurrency:5, task_num:1, tot_wait:31.956004ms, tot_exec:53.09µs, tot_time:32.020633ms, max:6.45798ms, p95:6.45798ms}, final_worker:{wall_time:6.55662ms, concurrency:5, task_num:1, tot_wait:32.476808ms, tot_exec:44.91µs, tot_time:32.528199ms, max:6.524459ms, p95:6.524459ms}   | funcs:count(1)->Column#9                                                                                                                        | 9.00 KB   | N/A  |
| └─IndexJoin_26                | 99.00     | 99      | root      |                        | time:6.39ms, loops:2, inner:{total:6.6ms, concurrency:5, task:2, construct:155.8µs, fetch:6.32ms, build:102.5µs}, probe:51µs                                                                                                                                                                                                                                   | inner join, inner:IndexReader_25, outer key:sbtest1.sbtest2.k, inner key:sbtest1.sbtest1.k, equal cond:eq(sbtest1.sbtest2.k, sbtest1.sbtest1.k) | 57.1 KB   | N/A  |
|   ├─IndexReader_35(Build)     | 99.00     | 99      | root      |                        | time:1.1ms, loops:5, cop_task: {num: 1, max: 1.32ms, proc_keys: 99, tot_proc: 402.4µs, tot_wait: 257.8µs, rpc_num: 1, rpc_time: 1.27ms, copr_cache_hit_ratio: 0.00, build_task_duration: 9.32µs, max_distsql_concurrency: 1}                                                                                                                                   | index:IndexRangeScan_34                                                                                                                         | 1.05 KB   | N/A  |
|   │ └─IndexRangeScan_34       | 99.00     | 99      | cop[tikv] | table:t2, index:k_2(k) | tikv_task:{time:0s, loops:3}, scan_detail: {total_process_keys: 99, total_process_keys_size: 4554, total_keys: 100, get_snapshot_time: 200.3µs, rocksdb: {key_skipped_count: 99, block: {cache_hit_count: 3}}}                                                                                                                                                 | range:[-inf,100), keep order:false                                                                                                              | N/A       | N/A  |
|   └─IndexReader_25(Probe)     | 99.00     | 99      | root      |                        | time:5.77ms, loops:4, cop_task: {num: 2, max: 4.25ms, min: 1.24ms, avg: 2.74ms, p95: 4.25ms, max_proc_keys: 96, p95_proc_keys: 96, tot_proc: 3.18ms, tot_wait: 337.9µs, rpc_num: 2, rpc_time: 5.41ms, copr_cache_hit_ratio: 0.00, build_task_duration: 28.1µs, max_distsql_concurrency: 1}                                                                     | index:Selection_24                                                                                                                              | 306 Bytes | N/A  |
|     └─Selection_24            | 99.00     | 99      | cop[tikv] |                        | tikv_task:{proc max:3ms, min:1ms, avg: 2ms, p80:3ms, p95:3ms, iters:4, tasks:2}, scan_detail: {total_process_keys: 99, total_process_keys_size: 4554, total_keys: 198, get_snapshot_time: 195.8µs, rocksdb: {key_skipped_count: 99, block: {cache_hit_count: 297}}}                                                                                            | lt(sbtest1.sbtest1.k, 100)                                                                                                                      | N/A       | N/A  |
|       └─IndexRangeScan_23     | 100000.00 | 99      | cop[tikv] | table:t1, index:k_1(k) | tikv_task:{proc max:3ms, min:1ms, avg: 2ms, p80:3ms, p95:3ms, iters:4, tasks:2}                                                                                                                                                                                                                                                                                | range: decided by [eq(sbtest1.sbtest1.k, sbtest1.sbtest2.k)], keep order:false                                                                  | N/A       | N/A  |
+-------------------------------+-----------+---------+-----------+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
7 rows in set (0.01 sec)

2. What did you expect to see? (Required)

The estRows of IndexRangeScan_23 is 99 rows as actRows.

3. What did you see instead (Required)

The estRows of IndexRangeScan_23 is 100000 as many as table row count.

4. What is your TiDB version? (Required)

master

@pcqz pcqz added the type/bug The issue is confirmed as a bug. label Jun 19, 2023
@ti-chi-bot ti-chi-bot bot added may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 may-affects-7.1 labels Jun 20, 2023
@pcqz
Copy link
Author

pcqz commented Jun 20, 2023

The above display problem may be introduce by #38188 in v6.4.0. However, there is still an actual inaccurate estimation issue in current LTS versions, which may cause higher estimated cost for index join. For example:

-- v6.1.6
mysql> explain format='verbose' select /*+ inl_join(t1) */ count(*) from sbtest1 t1 join sbtest2 t2 where t1.k=t2.k and t2.k < 100;
+-------------------------------+---------+-----------+-----------+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| id                            | estRows | estCost   | task      | access object          | operator info                                                                                                                                   |
+-------------------------------+---------+-----------+-----------+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| HashAgg_8                     | 1.00    | 311548.87 | root      |                        | funcs:count(1)->Column#9                                                                                                                        |
| └─IndexJoin_25                | 99.00   | 311456.47 | root      |                        | inner join, inner:IndexReader_24, outer key:sbtest1.sbtest2.k, inner key:sbtest1.sbtest1.k, equal cond:eq(sbtest1.sbtest2.k, sbtest1.sbtest1.k) |
|   ├─IndexReader_34(Build)     | 99.51   | 343.81    | root      |                        | index:IndexRangeScan_33                                                                                                                         |
|   │ └─IndexRangeScan_33       | 99.51   | 4328.70   | cop[tikv] | table:t2, index:k_2(k) | range:[-inf,100), keep order:false                                                                                                              |
|   └─IndexReader_24(Probe)     | 0.99    | 3117.13   | root      |                        | index:Selection_23                                                                                                                              |
|     └─Selection_23            | 0.99    | 0.00      | cop[tikv] |                        | lt(sbtest1.sbtest1.k, 100)                                                                                                                      |
|       └─IndexRangeScan_22     | 1004.92 | 0.00      | cop[tikv] | table:t1, index:k_1(k) | range: decided by [eq(sbtest1.sbtest1.k, sbtest1.sbtest2.k)], keep order:false                                                                  |
+-------------------------------+---------+-----------+-----------+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
7 rows in set, 1 warning (0.00 sec)

mysql> explain format='verbose' select count(*) from sbtest1 t1 join sbtest2 t2 where t1.k=t2.k and t2.k < 100;
+-------------------------------+---------+---------+-----------+------------------------+---------------------------------------------------------------------+
| id                            | estRows | estCost | task      | access object          | operator info                                                       |
+-------------------------------+---------+---------+-----------+------------------------+---------------------------------------------------------------------+
| HashAgg_9                     | 1.00    | 1075.27 | root      |                        | funcs:count(1)->Column#9                                            |
| └─MergeJoin_11                | 99.00   | 982.87  | root      |                        | inner join, left key:sbtest1.sbtest1.k, right key:sbtest1.sbtest2.k |
|   ├─IndexReader_39(Build)     | 99.51   | 343.81  | root      |                        | index:IndexRangeScan_38                                             |
|   │ └─IndexRangeScan_38       | 99.51   | 4328.70 | cop[tikv] | table:t2, index:k_2(k) | range:[-inf,100), keep order:true                                   |
|   └─IndexReader_37(Probe)     | 99.00   | 342.06  | root      |                        | index:IndexRangeScan_36                                             |
|     └─IndexRangeScan_36       | 99.00   | 4306.50 | cop[tikv] | table:t1, index:k_1(k) | range:[-inf,100), keep order:true                                   |
+-------------------------------+---------+---------+-----------+------------------------+---------------------------------------------------------------------+
6 rows in set, 2 warnings (0.01 sec)

-- v7.1.0(tidb_cost_model_version=2)
mysql> explain format='cost_trace' select /*+ inl_join(t1) */ count(*) from sbtest1 t1 join sbtest2 t2 where t1.k=t2.k and t2.k < 100;
+-------------------------------+-----------+-----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| id                            | estRows   | estCost   | costFormula                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | task      | access object          | operator info                                                                                                                                   |
+-------------------------------+-----------+-----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| HashAgg_9                     | 1.00      | 103271.32 | (cpu(10*3*tidb_cpu_factor(49.9))) + ((cpu(10*3*tidb_cpu_factor(49.9))) + (((scan(97.75725593667545*logrowsize(16)*tikv_scan_factor(40.7))) + (net(97.75725593667545*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00) + (cpu(97.75725593667545*filters(0)*tidb_cpu_factor(49.9))) + (cpu(97.75725593667545*10*tidb_cpu_factor(49.9))) + ((() + ((((((cpu(1022.9419703103915*filters(1)*tikv_cpu_factor(49.9))) + (scan(1022.9419703103915*logrowsize(16)*tikv_scan_factor(40.7)))) + (net(0.9142178523231156*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)*97.76)/6.00) + (cpu(89.37142857142857*filters(0)*tidb_cpu_factor(49.9))) + ((hashkey(89.37142857142857*0*tidb_cpu_factor(49.9))) + (hashmem(89.37142857142857*16*tidb_mem_factor(0.2))) + (hashbuild(89.37142857142857*tidb_cpu_factor(49.9)))))/5.00)) + (((agg(89.37142857142857*aggs(1)*tidb_cpu_factor(49.9))) + (group(89.37142857142857*cols(0)*tidb_cpu_factor(49.9))) + ((hashkey(1*0*tidb_cpu_factor(49.9))) + (hashmem(1*8*tidb_mem_factor(0.2))) + (hashbuild(1*tidb_cpu_factor(49.9)))) + ((hashkey(89.37142857142857*0*tidb_cpu_factor(49.9))) + (hashprobe(89.37142857142857*tidb_cpu_factor(49.9)))))/5.00) | root      |                        | funcs:count(1)->Column#9                                                                                                                        |
| └─IndexJoin_26                | 89.37     | 99980.16  | (cpu(10*3*tidb_cpu_factor(49.9))) + (((scan(97.75725593667545*logrowsize(16)*tikv_scan_factor(40.7))) + (net(97.75725593667545*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00) + (cpu(97.75725593667545*filters(0)*tidb_cpu_factor(49.9))) + (cpu(97.75725593667545*10*tidb_cpu_factor(49.9))) + ((() + ((((((cpu(1022.9419703103915*filters(1)*tikv_cpu_factor(49.9))) + (scan(1022.9419703103915*logrowsize(16)*tikv_scan_factor(40.7)))) + (net(0.9142178523231156*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00)*97.76)/6.00) + (cpu(89.37142857142857*filters(0)*tidb_cpu_factor(49.9))) + ((hashkey(89.37142857142857*0*tidb_cpu_factor(49.9))) + (hashmem(89.37142857142857*16*tidb_mem_factor(0.2))) + (hashbuild(89.37142857142857*tidb_cpu_factor(49.9)))))/5.00)                                                                                                                                                                                                                                                                                       | root      |                        | inner join, inner:IndexReader_25, outer key:sbtest1.sbtest2.k, inner key:sbtest1.sbtest1.k, equal cond:eq(sbtest1.sbtest2.k, sbtest1.sbtest1.k) |
|   ├─IndexReader_35(Build)     | 97.76     | 1473.92   | ((scan(97.75725593667545*logrowsize(16)*tikv_scan_factor(40.7))) + (net(97.75725593667545*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | root      |                        | index:IndexRangeScan_34                                                                                                                         |
|   │ └─IndexRangeScan_34       | 97.76     | 15914.88  | scan(97.75725593667545*logrowsize(16)*tikv_scan_factor(40.7))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | cop[tikv] | table:t2, index:k_2(k) | range:[-inf,100), keep order:false                                                                                                              |
|   └─IndexReader_25(Probe)     | 89.37     | 14509.18  | (((cpu(1022.9419703103915*filters(1)*tikv_cpu_factor(49.9))) + (scan(1022.9419703103915*logrowsize(16)*tikv_scan_factor(40.7)))) + (net(0.9142178523231156*rowsize(16)*tidb_kv_net_factor(3.96))))/15.00                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | root      |                        | index:Selection_24                                                                                                                              |
|     └─Selection_24            | 89.37     | 217579.76 | (cpu(1022.9419703103915*filters(1)*tikv_cpu_factor(49.9))) + (scan(1022.9419703103915*logrowsize(16)*tikv_scan_factor(40.7)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | cop[tikv] |                        | lt(sbtest1.sbtest1.k, 100)                                                                                                                      |
|       └─IndexRangeScan_23     | 100000.00 | 166534.95 | scan(1022.9419703103915*logrowsize(16)*tikv_scan_factor(40.7))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | cop[tikv] | table:t1, index:k_1(k) | range: decided by [eq(sbtest1.sbtest1.k, sbtest1.sbtest2.k)], keep order:false                                                                  |
+-------------------------------+-----------+-----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (0.00 sec)

@pcqz
Copy link
Author

pcqz commented Jun 28, 2023

Another case

mysql> desc  select /*+ inl_join(t2) */ * from sbtest1 t1, sbtest2 t2 where t1.k=t2.k and t1.id<10;
+----------------------------------+---------+-----------+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| id                               | estRows | task      | access object          | operator info                                                                                                                                   |
+----------------------------------+---------+-----------+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_12                     | 51.41   | root      |                        | inner join, inner:IndexLookUp_11, outer key:sbtest1.sbtest1.k, inner key:sbtest1.sbtest2.k, equal cond:eq(sbtest1.sbtest1.k, sbtest1.sbtest2.k) |
| ├─TableReader_21(Build)          | 8.93    | root      |                        | data:TableRangeScan_20                                                                                                                          |
| │ └─TableRangeScan_20            | 8.93    | cop[tikv] | table:t1               | range:[0,10), keep order:false                                                                                                                  |
| └─IndexLookUp_11(Probe)          | 51.41   | root      |                        |                                                                                                                                                 |
|   ├─IndexRangeScan_9(Build)      | 51.41   | cop[tikv] | table:t2, index:k_2(k) | range: decided by [eq(sbtest1.sbtest2.k, sbtest1.sbtest1.k)], keep order:false                                                                  |
|   └─TableRowIDScan_10(Probe)     | 51.41   | cop[tikv] | table:t2               | keep order:false                                                                                                                                |
+----------------------------------+---------+-----------+------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

mysql> desc  select /*+ inl_join(t2) */ * from sbtest1 t1, sbtest2 t2 where t1.k=t2.k and t2.c='test' and t1.id<10;
+------------------------------------+-----------+-----------+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                 | estRows   | task      | access object          | operator info                                                                                                                                                |
+------------------------------------+-----------+-----------+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_9                       | 5.83      | root      |                        | sbtest1.sbtest1.id, sbtest1.sbtest1.k, sbtest1.sbtest1.c, sbtest1.sbtest1.pad, sbtest1.sbtest2.id, sbtest1.sbtest2.k, sbtest1.sbtest2.c, sbtest1.sbtest2.pad |
| └─IndexJoin_28                     | 5.83      | root      |                        | inner join, inner:IndexLookUp_27, outer key:sbtest1.sbtest1.k, inner key:sbtest1.sbtest2.k, equal cond:eq(sbtest1.sbtest1.k, sbtest1.sbtest2.k)              |
|   ├─TableReader_38(Build)          | 8.93      | root      |                        | data:TableRangeScan_37                                                                                                                                       |
|   │ └─TableRangeScan_37            | 8.93      | cop[tikv] | table:t1               | range:[0,10), keep order:false                                                                                                                               |
|   └─IndexLookUp_27(Probe)          | 5.83      | root      |                        |                                                                                                                                                              |
|     ├─IndexRangeScan_24(Build)     | 574844.79 | cop[tikv] | table:t2, index:k_2(k) | range: decided by [eq(sbtest1.sbtest2.k, sbtest1.sbtest1.k)], keep order:false                                                                               |
|     └─Selection_26(Probe)          | 5.83      | cop[tikv] |                        | eq(sbtest1.sbtest2.c, "test")                                                                                                                                |
|       └─TableRowIDScan_25          | 574844.79 | cop[tikv] | table:t2               | keep order:false                                                                                                                                             |
+------------------------------------+-----------+-----------+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.01 sec)

@pcqz
Copy link
Author

pcqz commented Jun 28, 2023

Fixed by #44865

@pcqz pcqz closed this as completed Jun 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 may-affects-7.1 severity/major sig/planner SIG: Planner type/bug The issue is confirmed as a bug.
Projects
None yet
Development

No branches or pull requests

2 participants