-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YSQL] Update YB Bitmap Scan cost model to match Yugabyte details #20573
Labels
2024.1.1_blocker
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Comments
timothy-e
added
area/ysql
Yugabyte SQL (YSQL)
status/awaiting-triage
Issue awaiting triage
labels
Jan 11, 2024
yugabyte-ci
added
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
and removed
status/awaiting-triage
Issue awaiting triage
labels
Jan 11, 2024
timothy-e
added a commit
that referenced
this issue
Mar 12, 2024
Summary: == Bitmap Scans == Bitmap Scans use multiple indexes to answer a query, with only one scan of the main table. Each index produces a “bitmap” indicating which rows of the main table are interesting. Multiple bitmaps can be combined with AND or OR operators to create a final bitmap which is used to collect rows from the main table. In this diff, initial support for Yugabyte bitmap scans is added. Yugabyte bitmap scans: * **have two nodes:** `Bitmap Index Scan` shared with PG, and `YB Bitmap Table Scan`. * **support exact results only**: When PG approaches work_mem, it switches to a lossy bitmap that identifies specific pages of the heap relation to access. When YB approaches work_mem, it simply switches to a full sequential scan. * **support arbitary nesting of logical operands and index conditions**. Assuming the same indexes and support for conditions, any arrangement of ANDs, ORs, and index scans that Postgres supports are also supported by Yugabyte because the code is reused. Note that Yugabyte is currently extremely unlikely to use an `AND` condition because the planner doesn't account for the cost of each bitmap index access being half the cost of a normal index access. (Bitmap index scans only need to access the ybctids, but normal index scans access ybctids + use them to query the main table.) * **allow conditions with `AND` to be used as a filter**. This currently happens for all `AND` conditions, see the above bullet point. But with a hack to the planner, `Bitmap And` and filters are both valid options. * **are controlled by `enable_bitmapscan`** * **are controlled by pg hint plan's `/*+ BitmapScan(tab) */` syntax ** * ** only print the recheck condition when recheck is required**. This is contrary to PG behavior, but since fetching unnecessary rows has a higher cost in YB, it’s better to clearly communicate that. * **show distributed storage stats when `EXPLAIN (ANALYZE, DIST)` is run**. `Storage Index Read Requests` are shown under `Bitmap Index Scans` on a secondary index, `Storage Table Read Requests` on`Bitmap Index Scans` on a primary index or `YB Bitmap Table Scans`. * **do not use unbounded memory**. Similar to Postgres, each bitmap is bounded by `work_mem` (although PG follows this bound loosely, i.e. if it can't lossify the bitmap enough it will exceed it). In the current draft, if Yugabyte exceeds work_mem, we stop collecting ybctids and run a full table scan. There is some output in the explain plan to indicate this occurred. * **output average ybctid size when `EXPLAIN (ANALYZE, DEBUG)` is run**. This can be used to get an estimate of how much memory a bitmap scan might take up. Yugabyte bitmap scans do **not** currently: * **have an accurate cost model:** [[ #20573 | #20573 ]] * **spill to disk when exceeding work_mem:** [[ #20576 | #20576 ]] * **parallelise bitmap table scans**: [[ #20575 | #20575 ]] * **support GIN indexes**: [[ #20574 | #20574 ]] * **skip fetching ybctids from YB Bitmap Table Scan**: [[ #21036 | #21036 ]] * **parallelise bitmap index scans**: [[ #21037 | #21037 ]] See the design document here: https://docs.google.com/document/d/1pRUsXZDywzrZTQpTfXZNTZckh-8tTtkfOcurUoybTr0/edit?usp=sharing Jira: DB-2547 Test Plan: ```lang=sh ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans' ``` Stress test on a 10 million row table: ```lang=sql CREATE TABLE large (a INT, b INT); CREATE INDEX ON large (a ASC); CREATE INDEX ON large (b ASC); INSERT INTO large SELECT i, i FROM generate_series(1, 10000000) i; ``` Bitmap Scan: ``` EXPLAIN (ANALYZE, DIST) /*+ BitmapScan(large) */ SELECT * FROM large WHERE a > 0 OR b > 0; NOTICE: exceeded work_mem, switching to full table scan QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- YB Bitmap Table Scan on large (cost=6.91..11.21 rows=10 width=8) (actual time=233.920..10990.324 rows=10000000 loops=1) Storage Table Read Requests: 9767 Storage Table Read Execution Time: 6288.534 ms Storage Table Rows Scanned: 10000000 Exceeded work_mem: true -> BitmapOr (cost=6.91..6.91 rows=20 width=0) (actual time=231.783..231.783 rows=0 loops=1) -> Bitmap Index Scan on large_a_idx (cost=0.00..3.45 rows=10 width=0) (actual time=231.771..231.771 rows=0 loops=1) Index Cond: (a > 0) Storage Index Read Requests: 171 Storage Index Read Execution Time: 199.258 ms Storage Index Rows Scanned: 175104 -> Bitmap Index Scan on large_b_idx (cost=0.00..3.45 rows=10 width=0) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (b > 0) Planning Time: 10.735 ms Execution Time: 15493.938 ms Storage Read Requests: 9938 Storage Read Execution Time: 6487.792 ms Storage Rows Scanned: 10175104 Storage Write Requests: 0 Catalog Read Requests: 8 Catalog Read Execution Time: 6.969 ms Catalog Write Requests: 0 Storage Flush Requests: 0 Storage Execution Time: 6494.761 ms Peak Memory Usage: 100 kB (25 rows) ``` Sequential Scan: ``` EXPLAIN (ANALYZE, DIST) SELECT * FROM large WHERE a > 0 OR b > 0; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Seq Scan on large (cost=0.00..105.00 rows=1000 width=8) (actual time=2.721..9896.584 rows=10000000 loops=1) Storage Filter: ((a > 0) OR (b > 0)) Storage Table Read Requests: 4885 Storage Table Read Execution Time: 7578.189 ms Storage Table Rows Scanned: 10000000 Planning Time: 0.082 ms Execution Time: 14737.837 ms Storage Read Requests: 4885 Storage Read Execution Time: 7578.189 ms Storage Rows Scanned: 10000000 Storage Write Requests: 0 Catalog Read Requests: 0 Catalog Write Requests: 0 Storage Flush Requests: 0 Storage Execution Time: 7578.189 ms Peak Memory Usage: 24 kB (16 rows) ``` Bitmap scan with high work_mem: ``` SET work_mem TO '1GB'; EXPLAIN (ANALYZE, DIST) /*+ BitmapScan(large) */ SELECT * FROM large WHERE a > 0 OR b > 0; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ YB Bitmap Table Scan on large (cost=6.91..11.21 rows=10 width=8) (actual time=76779.066..162561.046 rows=10000000 loops=1) Storage Table Read Requests: 9766 Storage Table Read Execution Time: 78226.718 ms Storage Table Rows Scanned: 10000000 -> BitmapOr (cost=6.91..6.91 rows=20 width=0) (actual time=73603.559..73603.559 rows=10000000 loops=1) -> Bitmap Index Scan on large_a_idx (cost=0.00..3.45 rows=10 width=0) (actual time=36774.895..36774.895 rows=10000000 loops=1) Index Cond: (a > 0) Storage Index Read Requests: 9766 Storage Index Read Execution Time: 11136.148 ms Storage Index Rows Scanned: 10000000 -> Bitmap Index Scan on large_b_idx (cost=0.00..3.45 rows=10 width=0) (actual time=36828.655..36828.655 rows=10000000 loops=1) Index Cond: (b > 0) Storage Index Read Requests: 9766 Storage Index Read Execution Time: 11419.616 ms Storage Index Rows Scanned: 10000000 Planning Time: 0.153 ms Execution Time: 169818.045 ms Storage Read Requests: 29298 Storage Read Execution Time: 100782.482 ms Storage Rows Scanned: 30000000 Storage Write Requests: 0 Catalog Read Requests: 0 Catalog Write Requests: 0 Storage Flush Requests: 0 Storage Execution Time: 100782.482 ms Peak Memory Usage: 1217840 kB (26 rows) ``` Reviewers: tnayak, amartsinchyk Reviewed By: tnayak Subscribers: smishra, ybase, yql Differential Revision: https://phorge.dev.yugabyte.com/D32015
1 task
timothy-e
added a commit
that referenced
this issue
Mar 15, 2024
Summary: To avoid any regressions with CBO on, disable bitmap scan until we have better defaults for costing Bitmap Scans with CBO. Tracked by #20573 Jira: DB-10362 Test Plan: Jenkins: urgent, test regex: .*TestPgRegress.* ``` ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressContribPgTrgm' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressGin' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressThirdPartyExtensionsPgHintPlan' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPartitions' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPgSelect' ``` Reviewers: smishra Reviewed By: smishra Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D33161
timothy-e
added a commit
that referenced
this issue
Mar 15, 2024
Summary: Original commit: b038851 / D33161 To avoid any regressions with CBO on, disable bitmap scan until we have better defaults for costing Bitmap Scans with CBO. Tracked by #20573 Jira: DB-10362 Test Plan: Jenkins: urgent, test regex: .*TestPgRegress.* ``` ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressContribPgTrgm' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressGin' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressThirdPartyExtensionsPgHintPlan' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPartitions' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPgSelect' ``` Reviewers: smishra Reviewed By: smishra Subscribers: yql Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D33187
asrinivasanyb
pushed a commit
to asrinivasanyb/yugabyte-db
that referenced
this issue
Mar 18, 2024
Summary: == Bitmap Scans == Bitmap Scans use multiple indexes to answer a query, with only one scan of the main table. Each index produces a “bitmap” indicating which rows of the main table are interesting. Multiple bitmaps can be combined with AND or OR operators to create a final bitmap which is used to collect rows from the main table. In this diff, initial support for Yugabyte bitmap scans is added. Yugabyte bitmap scans: * **have two nodes:** `Bitmap Index Scan` shared with PG, and `YB Bitmap Table Scan`. * **support exact results only**: When PG approaches work_mem, it switches to a lossy bitmap that identifies specific pages of the heap relation to access. When YB approaches work_mem, it simply switches to a full sequential scan. * **support arbitary nesting of logical operands and index conditions**. Assuming the same indexes and support for conditions, any arrangement of ANDs, ORs, and index scans that Postgres supports are also supported by Yugabyte because the code is reused. Note that Yugabyte is currently extremely unlikely to use an `AND` condition because the planner doesn't account for the cost of each bitmap index access being half the cost of a normal index access. (Bitmap index scans only need to access the ybctids, but normal index scans access ybctids + use them to query the main table.) * **allow conditions with `AND` to be used as a filter**. This currently happens for all `AND` conditions, see the above bullet point. But with a hack to the planner, `Bitmap And` and filters are both valid options. * **are controlled by `enable_bitmapscan`** * **are controlled by pg hint plan's `/*+ BitmapScan(tab) */` syntax ** * ** only print the recheck condition when recheck is required**. This is contrary to PG behavior, but since fetching unnecessary rows has a higher cost in YB, it’s better to clearly communicate that. * **show distributed storage stats when `EXPLAIN (ANALYZE, DIST)` is run**. `Storage Index Read Requests` are shown under `Bitmap Index Scans` on a secondary index, `Storage Table Read Requests` on`Bitmap Index Scans` on a primary index or `YB Bitmap Table Scans`. * **do not use unbounded memory**. Similar to Postgres, each bitmap is bounded by `work_mem` (although PG follows this bound loosely, i.e. if it can't lossify the bitmap enough it will exceed it). In the current draft, if Yugabyte exceeds work_mem, we stop collecting ybctids and run a full table scan. There is some output in the explain plan to indicate this occurred. * **output average ybctid size when `EXPLAIN (ANALYZE, DEBUG)` is run**. This can be used to get an estimate of how much memory a bitmap scan might take up. Yugabyte bitmap scans do **not** currently: * **have an accurate cost model:** [[ yugabyte#20573 | yugabyte#20573 ]] * **spill to disk when exceeding work_mem:** [[ yugabyte#20576 | yugabyte#20576 ]] * **parallelise bitmap table scans**: [[ yugabyte#20575 | yugabyte#20575 ]] * **support GIN indexes**: [[ yugabyte#20574 | yugabyte#20574 ]] * **skip fetching ybctids from YB Bitmap Table Scan**: [[ yugabyte#21036 | yugabyte#21036 ]] * **parallelise bitmap index scans**: [[ yugabyte#21037 | yugabyte#21037 ]] See the design document here: https://docs.google.com/document/d/1pRUsXZDywzrZTQpTfXZNTZckh-8tTtkfOcurUoybTr0/edit?usp=sharing Jira: DB-2547 Test Plan: ```lang=sh ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans' ``` Stress test on a 10 million row table: ```lang=sql CREATE TABLE large (a INT, b INT); CREATE INDEX ON large (a ASC); CREATE INDEX ON large (b ASC); INSERT INTO large SELECT i, i FROM generate_series(1, 10000000) i; ``` Bitmap Scan: ``` EXPLAIN (ANALYZE, DIST) /*+ BitmapScan(large) */ SELECT * FROM large WHERE a > 0 OR b > 0; NOTICE: exceeded work_mem, switching to full table scan QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------- YB Bitmap Table Scan on large (cost=6.91..11.21 rows=10 width=8) (actual time=233.920..10990.324 rows=10000000 loops=1) Storage Table Read Requests: 9767 Storage Table Read Execution Time: 6288.534 ms Storage Table Rows Scanned: 10000000 Exceeded work_mem: true -> BitmapOr (cost=6.91..6.91 rows=20 width=0) (actual time=231.783..231.783 rows=0 loops=1) -> Bitmap Index Scan on large_a_idx (cost=0.00..3.45 rows=10 width=0) (actual time=231.771..231.771 rows=0 loops=1) Index Cond: (a > 0) Storage Index Read Requests: 171 Storage Index Read Execution Time: 199.258 ms Storage Index Rows Scanned: 175104 -> Bitmap Index Scan on large_b_idx (cost=0.00..3.45 rows=10 width=0) (actual time=0.001..0.001 rows=0 loops=1) Index Cond: (b > 0) Planning Time: 10.735 ms Execution Time: 15493.938 ms Storage Read Requests: 9938 Storage Read Execution Time: 6487.792 ms Storage Rows Scanned: 10175104 Storage Write Requests: 0 Catalog Read Requests: 8 Catalog Read Execution Time: 6.969 ms Catalog Write Requests: 0 Storage Flush Requests: 0 Storage Execution Time: 6494.761 ms Peak Memory Usage: 100 kB (25 rows) ``` Sequential Scan: ``` EXPLAIN (ANALYZE, DIST) SELECT * FROM large WHERE a > 0 OR b > 0; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Seq Scan on large (cost=0.00..105.00 rows=1000 width=8) (actual time=2.721..9896.584 rows=10000000 loops=1) Storage Filter: ((a > 0) OR (b > 0)) Storage Table Read Requests: 4885 Storage Table Read Execution Time: 7578.189 ms Storage Table Rows Scanned: 10000000 Planning Time: 0.082 ms Execution Time: 14737.837 ms Storage Read Requests: 4885 Storage Read Execution Time: 7578.189 ms Storage Rows Scanned: 10000000 Storage Write Requests: 0 Catalog Read Requests: 0 Catalog Write Requests: 0 Storage Flush Requests: 0 Storage Execution Time: 7578.189 ms Peak Memory Usage: 24 kB (16 rows) ``` Bitmap scan with high work_mem: ``` SET work_mem TO '1GB'; EXPLAIN (ANALYZE, DIST) /*+ BitmapScan(large) */ SELECT * FROM large WHERE a > 0 OR b > 0; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------ YB Bitmap Table Scan on large (cost=6.91..11.21 rows=10 width=8) (actual time=76779.066..162561.046 rows=10000000 loops=1) Storage Table Read Requests: 9766 Storage Table Read Execution Time: 78226.718 ms Storage Table Rows Scanned: 10000000 -> BitmapOr (cost=6.91..6.91 rows=20 width=0) (actual time=73603.559..73603.559 rows=10000000 loops=1) -> Bitmap Index Scan on large_a_idx (cost=0.00..3.45 rows=10 width=0) (actual time=36774.895..36774.895 rows=10000000 loops=1) Index Cond: (a > 0) Storage Index Read Requests: 9766 Storage Index Read Execution Time: 11136.148 ms Storage Index Rows Scanned: 10000000 -> Bitmap Index Scan on large_b_idx (cost=0.00..3.45 rows=10 width=0) (actual time=36828.655..36828.655 rows=10000000 loops=1) Index Cond: (b > 0) Storage Index Read Requests: 9766 Storage Index Read Execution Time: 11419.616 ms Storage Index Rows Scanned: 10000000 Planning Time: 0.153 ms Execution Time: 169818.045 ms Storage Read Requests: 29298 Storage Read Execution Time: 100782.482 ms Storage Rows Scanned: 30000000 Storage Write Requests: 0 Catalog Read Requests: 0 Catalog Write Requests: 0 Storage Flush Requests: 0 Storage Execution Time: 100782.482 ms Peak Memory Usage: 1217840 kB (26 rows) ``` Reviewers: tnayak, amartsinchyk Reviewed By: tnayak Subscribers: smishra, ybase, yql Differential Revision: https://phorge.dev.yugabyte.com/D32015
asrinivasanyb
pushed a commit
to asrinivasanyb/yugabyte-db
that referenced
this issue
Mar 18, 2024
Summary: To avoid any regressions with CBO on, disable bitmap scan until we have better defaults for costing Bitmap Scans with CBO. Tracked by yugabyte#20573 Jira: DB-10362 Test Plan: Jenkins: urgent, test regex: .*TestPgRegress.* ``` ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressContribPgTrgm' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressGin' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressThirdPartyExtensionsPgHintPlan' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPartitions' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressPgSelect' ``` Reviewers: smishra Reviewed By: smishra Subscribers: yql Differential Revision: https://phorge.dev.yugabyte.com/D33161
timothy-e
added a commit
that referenced
this issue
May 13, 2024
Summary: Implemented CBO for Bitmap Scans, using the same approach as PG: store a partially calculated index cost and use that to compute bitmap index access. If Bitmap Index Scans or Bitmap Ors are predicted to exceed work_mem, they'll have `disable_cost / 2` added to their cost. This discourages the planner from using them, but still costs them lower than other actually disabled scan types. Without this change, `/*+ BitmapScan(test) */ SELECT * FROM t1000000 ...` will use a sequential scan because it's cheaper, even though the sequential scan has been disabled by the hint. To better determine if a Bitmap Or will exceed work_mem, more work has been done to determine an estimate of how many rows a Bitmap Or will return. Also, a slight refactor was done: I put estimated seeks, nexts, width into a YbPlanInfo struct to reduce code duplication of printing / assigning those three values. Jira: DB-9574 Test Plan: Jenkins: compile only Latest: [[ https://jenkins.dev.yugabyte.com/job/optimizer/job/generate-report/231/artifact/taqo/report/20240501-185328/index_yb_vs_yb_cost-validation_.html | TAQO YB CBO with Bitmap Scans enabled / disabled ]] * Very few regressions in the default case - highest regression was 1.26 on query e57fd1caa18f59c4e40138d377ec2a01. I checked each regression above 1.1, and they were all run-to-run variances using the same execution plan. [[ https://jenkins.dev.yugabyte.com/job/optimizer/job/generate-report/224/artifact/taqo/report/20240418-134354/index_yb_vs_yb_cost-validation_bitmap_scan_yb_vs_pg.html | TAQO Run vs PG ]] [[ https://jenkins.dev.yugabyte.com/job/optimizer/job/generate-report/225/artifact/taqo/report/20240419-131907/index_yb_vs_yb_cost-validation_bitmap_scan_ybstatsonly_vs_pg.html | TAQO Run PG vs YB with only stats ]] ``` ./yb_build.sh --java-test 'org.yb.pgsql.TestPgBaseScansCostModel' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgCostModelSeekNextEstimation' ``` Reviewers: gkukreja, mtakahara, amartsinchyk, tnayak Reviewed By: gkukreja Subscribers: gkukreja, yql Differential Revision: https://phorge.dev.yugabyte.com/D33861
svarnau
pushed a commit
that referenced
this issue
May 25, 2024
Summary: Implemented CBO for Bitmap Scans, using the same approach as PG: store a partially calculated index cost and use that to compute bitmap index access. If Bitmap Index Scans or Bitmap Ors are predicted to exceed work_mem, they'll have `disable_cost / 2` added to their cost. This discourages the planner from using them, but still costs them lower than other actually disabled scan types. Without this change, `/*+ BitmapScan(test) */ SELECT * FROM t1000000 ...` will use a sequential scan because it's cheaper, even though the sequential scan has been disabled by the hint. To better determine if a Bitmap Or will exceed work_mem, more work has been done to determine an estimate of how many rows a Bitmap Or will return. Also, a slight refactor was done: I put estimated seeks, nexts, width into a YbPlanInfo struct to reduce code duplication of printing / assigning those three values. Jira: DB-9574 Test Plan: Jenkins: compile only Latest: [[ https://jenkins.dev.yugabyte.com/job/optimizer/job/generate-report/231/artifact/taqo/report/20240501-185328/index_yb_vs_yb_cost-validation_.html | TAQO YB CBO with Bitmap Scans enabled / disabled ]] * Very few regressions in the default case - highest regression was 1.26 on query e57fd1caa18f59c4e40138d377ec2a01. I checked each regression above 1.1, and they were all run-to-run variances using the same execution plan. [[ https://jenkins.dev.yugabyte.com/job/optimizer/job/generate-report/224/artifact/taqo/report/20240418-134354/index_yb_vs_yb_cost-validation_bitmap_scan_yb_vs_pg.html | TAQO Run vs PG ]] [[ https://jenkins.dev.yugabyte.com/job/optimizer/job/generate-report/225/artifact/taqo/report/20240419-131907/index_yb_vs_yb_cost-validation_bitmap_scan_ybstatsonly_vs_pg.html | TAQO Run PG vs YB with only stats ]] ``` ./yb_build.sh --java-test 'org.yb.pgsql.TestPgBaseScansCostModel' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgCostModelSeekNextEstimation' ``` Reviewers: gkukreja, mtakahara, amartsinchyk, tnayak Reviewed By: gkukreja Subscribers: gkukreja, yql Differential Revision: https://phorge.dev.yugabyte.com/D33861
svarnau
pushed a commit
that referenced
this issue
May 29, 2024
…of Bitmap Scan CBO Summary: Original commits: * D33861 / eac0c26 * D35044 / ca5b8e6 * D35045 / 49a8e47 Changes: * There are some differences in actual number of seqs and nexts in the table scan. Fix these by just adding `_IgnoreActualResults` to the tests in `testSeekNextEstimationBitmapScan` == Summary from D33861 == Implemented CBO for Bitmap Scans, using the same approach as PG: store a partially calculated index cost and use that to compute bitmap index access. If Bitmap Index Scans or Bitmap Ors are predicted to exceed work_mem, they'll have `disable_cost / 2` added to their cost. This discourages the planner from using them, but still costs them lower than other actually disabled scan types. Without this change, `/*+ BitmapScan(test) */ SELECT * FROM t1000000 ...` will use a sequential scan because it's cheaper, even though the sequential scan has been disabled by the hint. To better determine if a Bitmap Or will exceed work_mem, more work has been done to determine an estimate of how many rows a Bitmap Or will return. Also, a slight refactor was done: I put estimated seeks, nexts, width into a YbPlanInfo struct to reduce code duplication of printing / assigning those three values. Jira: DB-9574 == Summary from D35044 == [#22387] YSQL: Fix Bitmap Scan CBO tests 2338431 / D34646 introduced a flag `yb_enable_bitmapscan = false`, so YB bitmap scans are disabled by default, even when `enable_bitmapscan` is true. At the same time, eac0c26 / D33861 introduced CBO for bitmap scans. These tests passed before `yb_enable_bitmapscan` was added, but once it was added, they started failing because they stopped using bitmap scans. Fix the tests by setting `yb_enable_bitmapscan = true`. Jira: DB-11284 == Summary from D35045 == [##22388] YSQL: Change name of bitmap_exceeded_work_mem_cost Because `bitmap_exceeded_work_mem_cost` doesn't contain the phrase `disable_cost`, it's easy to miss when someone try to find all the places that inflates the cost values, e.g. investigate into cost overflow, precision losses with complex plans, etc. Change the name to `bitmap_exceeded_work_mem_disable_cost`. Jira: DB-11285 Test Plan: Latest: [[ https://jenkins.dev.yugabyte.com/job/optimizer/job/generate-report/231/artifact/taqo/report/20240501-185328/index_yb_vs_yb_cost-validation_.html | TAQO YB CBO with Bitmap Scans enabled / disabled ]] * Very few regressions in the default case - highest regression was 1.26 on query e57fd1caa18f59c4e40138d377ec2a01. I checked each regression above 1.1, and they were all run-to-run variances using the same execution plan. [[ https://jenkins.dev.yugabyte.com/job/optimizer/job/generate-report/224/artifact/taqo/report/20240418-134354/index_yb_vs_yb_cost-validation_bitmap_scan_yb_vs_pg.html | TAQO Run vs PG ]] [[ https://jenkins.dev.yugabyte.com/job/optimizer/job/generate-report/225/artifact/taqo/report/20240419-131907/index_yb_vs_yb_cost-validation_bitmap_scan_ybstatsonly_vs_pg.html | TAQO Run PG vs YB with only stats ]] ``` ./yb_build.sh --java-test 'org.yb.pgsql.TestPgBaseScansCostModel' ./yb_build.sh --java-test 'org.yb.pgsql.TestPgCostModelSeekNextEstimation' ``` Reviewers: gkukreja, mtakahara, amartsinchyk, tnayak Reviewed By: mtakahara Subscribers: yql, gkukreja Tags: #jenkins-ready Differential Revision: https://phorge.dev.yugabyte.com/D35136
1 task
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
2024.1.1_blocker
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Jira Link: DB-9574
Description
In the initial bitmap scan implementation, the YB Bitmap Table Scan cost model is the same as the Postgres Bitmap Heap Scan cost. Update this to match the changes made for Yugabyte.
This includes accounting for remote pushdown - there are cases where BitmapAnd is not necessary.
Issue Type
kind/enhancement
Warning: Please confirm that this issue does not contain any sensitive information
The text was updated successfully, but these errors were encountered: