Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Cost YB Bitmap Table Scan remote filters #23521

Closed
1 task done
timothy-e opened this issue Aug 15, 2024 · 0 comments
Closed
1 task done

[YSQL] Cost YB Bitmap Table Scan remote filters #23521

timothy-e opened this issue Aug 15, 2024 · 0 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@timothy-e
Copy link
Contributor

timothy-e commented Aug 15, 2024

Jira Link: DB-12438

Description

These two scans are costed the same (table analyzed, CBO on)

yugabyte# /*+ BitmapScan(t) */ EXPLAIN SELECT * FROM t WHERE a < 1000 AND b < 1000;
                                QUERY PLAN
---------------------------------------------------------------------------
 YB Bitmap Table Scan on t  (cost=4797.39..17201.29 rows=100 width=8)
   Storage Filter: (b < 1000)
   ->  Bitmap Index Scan on t_a_idx  (cost=0.00..4617.36 rows=999 width=0)
         Index Cond: (a < 1000)
(4 rows)

yugabyte# SET yb_enable_expression_pushdown = false;
SET
yugabyte# /*+ BitmapScan(t) */ EXPLAIN SELECT * FROM t WHERE a < 1000 AND b < 1000;
                                QUERY PLAN
---------------------------------------------------------------------------
 YB Bitmap Table Scan on t  (cost=4797.39..17201.29 rows=100 width=8)
   Filter: (b < 1000)
   ->  Bitmap Index Scan on t_a_idx  (cost=0.00..4617.36 rows=999 width=0)
         Index Cond: (a < 1000)
(4 rows)

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@timothy-e timothy-e added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Aug 15, 2024
@timothy-e timothy-e self-assigned this Aug 15, 2024
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Aug 15, 2024
timothy-e added a commit that referenced this issue Aug 28, 2024
Summary:
Prior to this diff, Bitmap Scan CBO did not handle remote filter costs. Evaluating a remote filter was given no cost, so it was effectively "free". Therefore AND conditions were seen as better to use remote filters for, rather than Bitmap And, because Bitmap And nodes add cost but remote filters added no cost.

To fix this, I
1. determine which quals will be evaluated by the index (as an index cond or a remote filter) in `yb_get_bitmap_index_quals`. This means that the previous change to `create_bitmap_subplan` that collected pushdown expressions can be reverted. The new function has a simpler interface, and allows for better comparing the given qual against the quals evaluated by the plan tree - now some redundant filters can be skipped. (more on this below)
2. determine which quals remain to be addressed by the Bitmap Table Scan node.
3. Account for the cost of evaluating those filters in the same manner that yb_cost_seqscan does, using `cost_qual_eval`. This allows for Bitmap And plans to be a reasonable alternative to remote filter plans.

**Moving the discourage modifier:** I also changed the DISCOURAGE_MODIFIER to affect the startup and runtime costs of Bitmap Table Scan instead of the index scans. Now we no longer incentivise Bitmap Scan plans with more work done in the Bitmap Table Scan node.

**Removing redundant tests: ** Removed some unstable tests from the yb_bitmap_scan_flags regress tests. They failed with the addition of this change, but actually don't provide anything meaningful. They answer the question: "what will the planner choose when both options are disabled?" which depends on exact details of how each plan is costed. That does not belong in a regress test.

**Removing redundant remote filters**: the planner identifies which nodes are resposible for evaluating the given conditions. The Bitmap Table scan node should evaluate anything that's not already implied by the conditions on the Bitmap Index Scans. However, with the addition of a storage index filters on Bitmap Index Scans, this logic was slightly faulty. Two lists of clauses were created, one for index conditions and one for index remote filters. Consider the conditons: `a < 10 OR (b < 10 AND remote_fllter(b))`. This might create a plan like:
```
Bitmap Table Scan
   Storage Filter: xxx
   -> Bitmap Or
        -> Bitmap Index Scan
           Index Cond: a < 10
        -> Bitmap Index Scan
           Index Cond: b < 10
           Storage Index Filter: remote_flter(b))
```
The list of index conditions was: `OR(a < 10, b < 10)`. The list of index filters was: `remote_filter(b)`. Neither of these indiviudally implied the original condition, so the planner decided it was necessary to apply the remote filters.

With this change, a single unified list of index conditions and storage index filters is returned. In this case, the list is `OR(a < 10, AND(b < 10, remote_filter(b))`. Since the original condition is implied by (actually equivalent to) this expression, the planner can now easily determine that the original conditions are already met, so there is no need to add a storage filter.
Jira: DB-12438

Test Plan:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgCostModelSeekNextEstimation'
```

TAQO report: {F279017}

Bitmap scan is often the most optimal plan for these queries, but less frequently the default plan with this change because of the movement of the discourage factor and the addition of the remote filter cost. While these are regressions, the feature is not yet in EA and further CBO changes are incoming, including [[ #23565 | #23565 ]] to remove the discourage factor entirely. This is a helpful step in the right direction.

Reviewers: gkukreja, mtakahara

Reviewed By: mtakahara

Subscribers: dsherstobitov, yql

Differential Revision: https://phorge.dev.yugabyte.com/D36484
jasonyb pushed a commit that referenced this issue Aug 29, 2024
Summary:
 bfd17b5 [PLAT-14601]Support restore to Point in time part 2 - Backup/Restore/Restore preflight related changes - Add JSON Property
 91d000a [doc] Thirdparty to integrations (#23598)
 Excluded: 6456777 [#23626] allow loading old dumps that do not have index pg_class OIDs
 c8b8be3 [DB-12586] yugabyted: Update schema migration UI (#23217)
 Excluded: 4ea354b [#23521] YSQL: Cost YB Bitmap Table Scan remote filters
 b80999d [PLAT-14974][PLAT-15045] Added prometheus user as part of yugabyte group
 63f1d65 [#23669] YSQL: Add more logs to debug an assertion failure
 1ad4795 [docs] [TA] Added TA-23476: YCQL currenttimestamp() precision (#23642)
 41ae6b4 [#23653] docdb: Adjust waits for MasterPathHandlersItest.TestUndeletedParentTablet in TSAN
 78b0ae4 [DB-12587] yugabyted: Update data migration UI (#23291)
 d234b3a [PLAT-15046] Create log directory with correct permissions to allow users to export logs without using sudo
 8713c18 [doc][yba] Clarify pre-req for cloud provider image upgrades (#23285)
 9be5c91 [#23448] YSQL: fix failing test PgAutoAnalyzeTest.CheckTableMutationsCount
 Excluded: 9d54710 [#22147] YSQL, QueryDiagnostics: Pgss support for query diagnostics
 417092a [#23373] DocDB: Add max_disk_throughput_mbps gflag to control disk full rejection
 e3a1a36 [PLAT-15035] Add support to sync gflags secret mount location to actual gflag file used by services
 23a6a4c [PLAT-14525][PLAT-14953] Add local provider tests for switchover, failover, change replica, and restart
 6026029 [PLAT-15100][Master]Observed two Scheduled Backup Policies tabs in Backup page
 2cf648b [#23581] CDCSDK: Support dynamic table addition with table removal
 b14851d [#23702] xClusterDDLRepl: Add extra logging
 8a0d6ff [#23645] docdb: Reorder heartbeat handling logic to fix regression.
 2b30b5e [Docs] Changes for Experimental AI (#23714)

Test Plan: Jenkins: rebase: pg15-cherrypicks

Reviewers: jason, tfoucher

Differential Revision: https://phorge.dev.yugabyte.com/D37645
timothy-e added a commit that referenced this issue Aug 30, 2024
Summary:
Original commit: 4ea354b / D36484
Prior to this diff, Bitmap Scan CBO did not handle remote filter costs. Evaluating a remote filter was given no cost, so it was effectively "free". Therefore AND conditions were seen as better to use remote filters for, rather than Bitmap And, because Bitmap And nodes add cost but remote filters added no cost.

To fix this, I
1. determine which quals will be evaluated by the index (as an index cond or a remote filter) in `yb_get_bitmap_index_quals`. This means that the previous change to `create_bitmap_subplan` that collected pushdown expressions can be reverted. The new function has a simpler interface, and allows for better comparing the given qual against the quals evaluated by the plan tree - now some redundant filters can be skipped. (more on this below)
2. determine which quals remain to be addressed by the Bitmap Table Scan node.
3. Account for the cost of evaluating those filters in the same manner that yb_cost_seqscan does, using `cost_qual_eval`. This allows for Bitmap And plans to be a reasonable alternative to remote filter plans.

**Moving the discourage modifier:** I also changed the DISCOURAGE_MODIFIER to affect the startup and runtime costs of Bitmap Table Scan instead of the index scans. Now we no longer incentivise Bitmap Scan plans with more work done in the Bitmap Table Scan node.

**Removing redundant tests: ** Removed some unstable tests from the yb_bitmap_scan_flags regress tests. They failed with the addition of this change, but actually don't provide anything meaningful. They answer the question: "what will the planner choose when both options are disabled?" which depends on exact details of how each plan is costed. That does not belong in a regress test.

**Removing redundant remote filters**: the planner identifies which nodes are resposible for evaluating the given conditions. The Bitmap Table scan node should evaluate anything that's not already implied by the conditions on the Bitmap Index Scans. However, with the addition of a storage index filters on Bitmap Index Scans, this logic was slightly faulty. Two lists of clauses were created, one for index conditions and one for index remote filters. Consider the conditons: `a < 10 OR (b < 10 AND remote_fllter(b))`. This might create a plan like:
```
Bitmap Table Scan
   Storage Filter: xxx
   -> Bitmap Or
        -> Bitmap Index Scan
           Index Cond: a < 10
        -> Bitmap Index Scan
           Index Cond: b < 10
           Storage Index Filter: remote_flter(b))
```
The list of index conditions was: `OR(a < 10, b < 10)`. The list of index filters was: `remote_filter(b)`. Neither of these indiviudally implied the original condition, so the planner decided it was necessary to apply the remote filters.

With this change, a single unified list of index conditions and storage index filters is returned. In this case, the list is `OR(a < 10, AND(b < 10, remote_filter(b))`. Since the original condition is implied by (actually equivalent to) this expression, the planner can now easily determine that the original conditions are already met, so there is no need to add a storage filter.
Jira: DB-12438

Test Plan:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgCostModelSeekNextEstimation'
```

TAQO report: {F279017}

Bitmap scan is often the most optimal plan for these queries, but less frequently the default plan with this change because of the movement of the discourage factor and the addition of the remote filter cost. While these are regressions, the feature is not yet in EA and further CBO changes are incoming, including [[ #23565 | #23565 ]] to remove the discourage factor entirely. This is a helpful step in the right direction.

Reviewers: gkukreja, mtakahara

Reviewed By: mtakahara

Subscribers: yql, dsherstobitov

Differential Revision: https://phorge.dev.yugabyte.com/D37623
timothy-e added a commit that referenced this issue Sep 11, 2024
…emote filters

Summary:
Original commit: 4ea354b / D36484

Merge conflicts:
* createplan.c:
   * `create_bitmap_subplan` declaration:
      * D36484 removed some of the parameters that were added by YB
      * Upstream realigned declarations
      * resolution: remove the parameters but follow new indentation practice
  * `create_indexscan_plan` declaration:
      * D36484 moved it to `planmain.h`
      * Upstream realigned declarations
      * resolution: move it to `planmain.h`
  * `create_bitmap_subplan`:
      * D36484 removed references to `indexpushdownquals`
      * pg15's cd1df46 moved one of the references lower in the file
      * resolution: remove from the new location
* `costsize.c`
     * `get_bitmap_index_quals`
         * D36484 copied the approach of `create_bitmap_subplan` to get index quals.
         * `ipath->indexquals` was removed in pg15.
         * Resolution: Rewrote the BitmapIndexScan case, referencing pg15's create_bitmap_subplan
* planmain.h:
  * declarations
     * D36484 moved `create_indexscan_plan` from `createplan.h` and added `yb_get_bitmap_index_quals`
     * Upstream changed indentation of neighbouring declarations
     * because these new declarations were not spaced out from their neighbours, the changes caused conflicts. Include the new declarations.
* yb_pg_select.out
   * D36484 mistakenly removed logic to account for quals implied by partial indexes, resulting in redundant storage filters being added. This is tracked by #23859. In this current diff, the BitmapIndexScan case of get_bitmap_index_quals needed to be rewritten for resolving costsize.c conflicts, and the rewritten code includes handling of partial indexes. The redundant storage filters are only introduced on master, not in the pg15 branch.
   * yb-pg15's 2750e71 removed yb ordering
   * Solution: use the pg15 test file, because the redundant filters are not added by this code.

-----

Prior to this diff, Bitmap Scan CBO did not handle remote filter costs. Evaluating a remote filter was given no cost, so it was effectively "free". Therefore AND conditions were seen as better to use remote filters for, rather than Bitmap And, because Bitmap And nodes add cost but remote filters added no cost.

To fix this, I
1. determine which quals will be evaluated by the index (as an index cond or a remote filter) in `yb_get_bitmap_index_quals`. This means that the previous change to `create_bitmap_subplan` that collected pushdown expressions can be reverted. The new function has a simpler interface, and allows for better comparing the given qual against the quals evaluated by the plan tree - now some redundant filters can be skipped. (more on this below)
2. determine which quals remain to be addressed by the Bitmap Table Scan node.
3. Account for the cost of evaluating those filters in the same manner that yb_cost_seqscan does, using `cost_qual_eval`. This allows for Bitmap And plans to be a reasonable alternative to remote filter plans.

**Moving the discourage modifier:** I also changed the DISCOURAGE_MODIFIER to affect the startup and runtime costs of Bitmap Table Scan instead of the index scans. Now we no longer incentivise Bitmap Scan plans with more work done in the Bitmap Table Scan node.

**Removing redundant tests: ** Removed some unstable tests from the yb_bitmap_scan_flags regress tests. They failed with the addition of this change, but actually don't provide anything meaningful. They answer the question: "what will the planner choose when both options are disabled?" which depends on exact details of how each plan is costed. That does not belong in a regress test.

**Removing redundant remote filters**: the planner identifies which nodes are resposible for evaluating the given conditions. The Bitmap Table scan node should evaluate anything that's not already implied by the conditions on the Bitmap Index Scans. However, with the addition of a storage index filters on Bitmap Index Scans, this logic was slightly faulty. Two lists of clauses were created, one for index conditions and one for index remote filters. Consider the conditons: `a < 10 OR (b < 10 AND remote_fllter(b))`. This might create a plan like:
```
Bitmap Table Scan
   Storage Filter: xxx
   -> Bitmap Or
        -> Bitmap Index Scan
           Index Cond: a < 10
        -> Bitmap Index Scan
           Index Cond: b < 10
           Storage Index Filter: remote_flter(b))
```
The list of index conditions was: `OR(a < 10, b < 10)`. The list of index filters was: `remote_filter(b)`. Neither of these indiviudally implied the original condition, so the planner decided it was necessary to apply the remote filters.

With this change, a single unified list of index conditions and storage index filters is returned. In this case, the list is `OR(a < 10, AND(b < 10, remote_filter(b))`. Since the original condition is implied by (actually equivalent to) this expression, the planner can now easily determine that the original conditions are already met, so there is no need to add a storage filter.
Jira: DB-12438

Test Plan:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgCostModelSeekNextEstimation'
```

TAQO report: {F279017}

Bitmap scan is often the most optimal plan for these queries, but less frequently the default plan with this change because of the movement of the discourage factor and the addition of the remote filter cost. While these are regressions, the feature is not yet in EA and further CBO changes are incoming, including [[ #23565 | #23565 ]] to remove the discourage factor entirely. This is a helpful step in the right direction.

Reviewers: jason, tfoucher, mtakahara

Reviewed By: jason, mtakahara

Subscribers: mtakahara, yql, dsherstobitov

Differential Revision: https://phorge.dev.yugabyte.com/D37658
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

2 participants