Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Pushdown aggregates for YB Bitmap Table Scans #21198

Closed
1 task done
timothy-e opened this issue Feb 26, 2024 · 0 comments
Closed
1 task done

[YSQL] Pushdown aggregates for YB Bitmap Table Scans #21198

timothy-e opened this issue Feb 26, 2024 · 0 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@timothy-e
Copy link
Contributor

timothy-e commented Feb 26, 2024

Jira Link: DB-10128

Description

Currently bitmap scans can avoid fetching the rows for a SELECT COUNT(*) query if no recheck is required, because it can just count the number of ybctids.

If recheck is required or we have a different aggregate (e.g. min, max, sum, avg), we need to fetch all the rows. Pushing down the aggregate would reduce the number of RPCs done by the Table Scan node to 1.

Make sure to also handle table expression pushdown.

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@timothy-e timothy-e added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Feb 26, 2024
@timothy-e timothy-e self-assigned this Feb 26, 2024
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue labels Feb 26, 2024
@sushantrmishra sushantrmishra removed the status/awaiting-triage Issue awaiting triage label Feb 27, 2024
timothy-e added a commit that referenced this issue Jul 3, 2024
Summary:
Following the same pattern as 55eba85 / D27427, support aggregate pushdown for YB Bitmap Table Scans.

Aggregates cannot be pushed down at the YB Bitmap Index Scan level, because we need to be sure that we only accumulate each row one, but a row might exist in each Bitmap Index Scan.

Normal index scans can't pushdown the aggregate if there are recheck conditions, because those conditions are implicitly rechecked locally. However, if the condition of a Bitmap Index Scan needs to be rechecked, it is rechecked at the Bitmap Table Scan level. Usually this can happen with a remote filter. The only times we can't push down an aggregate is if we have local filters of any type:
* basic local quals are accounted for by the existing aggregate pushdown framework.
* recheck local quals are only an issue if recheck is required.
* local quals used for when we exceed work_mem are only an issue if we exceed work_mem. However, since its impossible to know if we will or not until the bitmap scans complete, we just avoid aggregate pushdown if there are any local quals of this variety.

Determining if recheck is required can be done at the Init phase. For aggregate workflows, this works, because aggregates do not occur inside loops, so we are not concerned about recheck. For non-aggregate workflows, we need to recompute if recheck is required for each rescan. Some tests were added to validate this.

To access the `recheck_required` field at init time, we need a `YbGetBitmapScanRecheckRequired` that traverses the Bitmap tree. We might as well use the same approach to determine if recheck is required by any of the rescans that occurred. This allows `recheck` to be removed from `YbTIDBitmap`.
Jira: DB-10128

Test Plan:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressAggregates'
```

Tested with randgen using a slightly modified grammar to increase the number of aggregates produced.

Reviewers: amartsinchyk, jason

Reviewed By: jason

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D35787
jasonyb pushed a commit that referenced this issue Jul 3, 2024
Summary:
 Excluded: eb77547 [#21886] DocDB: Add read-time option of ysql_dump to --help output
 5b193c8 [#22807] docdb: Add gflag to control of RWCLock debug logging
 Excluded: 471c7c4 [#22967] YSQL: Import 'Repair more failures with SubPlans in multi-row VALUES lists.'
 Excluded: 8142fdc [#23013] XClusterDDLRepl: Link producer table to consumer table by table id
 7dc7da5 [DOC-403] Added PG CVE table (#22913)
 cf17238 [PLAT-14557][xCluster] DrConfigGetResp.class must be passed as the DR GET API response to swagger
 81a5b3c [PLAT-14535][PLAT-14149] Azure provider validation fixes
 15f302d [#22619] DocDB: Fix test issue with PgGetLockStatusTestRF3.TestLocksOfSingleShardWaiters
 b7c3da8 [DEVOPS-3144] build: Remove release_package_docker_test sanity test from jenkins build
 3106dc0 [PLAT-14165] Fixed unit tests and ran swagger Gen
 ab09fc1 [#23035] DocDB: Disable disk full checks on ASAN
 Excluded: f86020d [#23044] xCluster: Move xCluster members from CatalogManager to XClusterManager
 8d929e9 [PLAT-14173] UserIntentOverrides into v2 UniverseSpec
 19c6049 [PLAT-14536] Missing XmlElement dependency
 e2e9f93 [PLAT-14473][PLAT-14554]Add gflags from groups at runtime
 Excluded: b8fedf6 [#21198] YSQL: Pushdown Aggregates on YB Bitmap Table Scans

Test Plan: Jenkins: rebase: pg15-cherrypicks

Reviewers: jason, tfoucher

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D36374
timothy-e added a commit that referenced this issue Jul 4, 2024
…e Scans

Summary:
Original commit: b8fedf6 / D35787
Following the same pattern as 55eba85 / D27427, support aggregate pushdown for YB Bitmap Table Scans.

Aggregates cannot be pushed down at the YB Bitmap Index Scan level, because we need to be sure that we only accumulate each row one, but a row might exist in each Bitmap Index Scan.

Normal index scans can't pushdown the aggregate if there are recheck conditions, because those conditions are implicitly rechecked locally. However, if the condition of a Bitmap Index Scan needs to be rechecked, it is rechecked at the Bitmap Table Scan level. Usually this can happen with a remote filter. The only times we can't push down an aggregate is if we have local filters of any type:
* basic local quals are accounted for by the existing aggregate pushdown framework.
* recheck local quals are only an issue if recheck is required.
* local quals used for when we exceed work_mem are only an issue if we exceed work_mem. However, since its impossible to know if we will or not until the bitmap scans complete, we just avoid aggregate pushdown if there are any local quals of this variety.

Determining if recheck is required can be done at the Init phase. For aggregate workflows, this works, because aggregates do not occur inside loops, so we are not concerned about recheck. For non-aggregate workflows, we need to recompute if recheck is required for each rescan. Some tests were added to validate this.

To access the `recheck_required` field at init time, we need a `YbGetBitmapScanRecheckRequired` that traverses the Bitmap tree. We might as well use the same approach to determine if recheck is required by any of the rescans that occurred. This allows `recheck` to be removed from `YbTIDBitmap`.
Jira: DB-10128

Test Plan:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressAggregates'
```

Tested with randgen using a slightly modified grammar to increase the number of aggregates produced.

Reviewers: amartsinchyk, jason

Reviewed By: jason

Subscribers: yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D36372
timothy-e added a commit that referenced this issue Jul 11, 2024
…itmap Table Scans

Summary:
Differences:
- indexam.c:
   - conflict because yb_amgetbitmap had its third arguemnt removed by D35787 but it's parent object was renamed to `rd_indam`  from `rd_amroutine` in pg15.
- nodeYbBitmapTablescan.c:
   - rewrote `if (node->aggrefs)` to match the similar section from `nodeYbSeqScan.c`, because pg15 changed the signature of `CreateTemplateTupleDesc` and `ExecInitScanTupleSlot`
- `execnodes.h`:
   - conflict because pg15 changed the type of the last field and I added a new last field to the `YbBitmapIndexScanState` struct. Trivial to resolve

Original commit: b8fedf6 / D35787
Following the same pattern as 55eba85 / D27427, support aggregate pushdown for YB Bitmap Table Scans.

Aggregates cannot be pushed down at the YB Bitmap Index Scan level, because we need to be sure that we only accumulate each row one, but a row might exist in each Bitmap Index Scan.

Normal index scans can't pushdown the aggregate if there are recheck conditions, because those conditions are implicitly rechecked locally. However, if the condition of a Bitmap Index Scan needs to be rechecked, it is rechecked at the Bitmap Table Scan level. Usually this can happen with a remote filter. The only times we can't push down an aggregate is if we have local filters of any type:
* basic local quals are accounted for by the existing aggregate pushdown framework.
* recheck local quals are only an issue if recheck is required.
* local quals used for when we exceed work_mem are only an issue if we exceed work_mem. However, since its impossible to know if we will or not until the bitmap scans complete, we just avoid aggregate pushdown if there are any local quals of this variety.

Determining if recheck is required can be done at the Init phase. For aggregate workflows, this works, because aggregates do not occur inside loops, so we are not concerned about recheck. For non-aggregate workflows, we need to recompute if recheck is required for each rescan. Some tests were added to validate this.

To access the `recheck_required` field at init time, we need a `YbGetBitmapScanRecheckRequired` that traverses the Bitmap tree. We might as well use the same approach to determine if recheck is required by any of the rescans that occurred. This allows `recheck` to be removed from `YbTIDBitmap`.
Jira: DB-10128

Test Plan:
```
pg15_tests/test_yb_bitmap_scans.sh
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressAggregates'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressThirdPartyExtensionsPgHintPlan'
```

Reviewers: jason, tfoucher

Reviewed By: jason

Subscribers: yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D36384
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

3 participants