[YSQL] Pushdown aggregates for YB Bitmap Table Scans #21198
Labels
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Comments
timothy-e
added
area/ysql
Yugabyte SQL (YSQL)
status/awaiting-triage
Issue awaiting triage
labels
Feb 26, 2024
yugabyte-ci
added
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
labels
Feb 26, 2024
timothy-e
added a commit
that referenced
this issue
Jul 3, 2024
Summary: Following the same pattern as 55eba85 / D27427, support aggregate pushdown for YB Bitmap Table Scans.

Aggregates cannot be pushed down at the YB Bitmap Index Scan level, because we need to be sure that we accumulate each row only once, but a row might exist in each Bitmap Index Scan.

Normal index scans can't push down the aggregate if there are recheck conditions, because those conditions are implicitly rechecked locally. However, if the condition of a Bitmap Index Scan needs to be rechecked, it is rechecked at the Bitmap Table Scan level. Usually this happens with a remote filter. The only time we can't push down an aggregate is when we have local filters of any type:
* basic local quals are accounted for by the existing aggregate pushdown framework.
* recheck local quals are only an issue if recheck is required.
* local quals used when we exceed work_mem are only an issue if we exceed work_mem. However, since it's impossible to know whether we will until the bitmap scans complete, we simply avoid aggregate pushdown if there are any local quals of this variety.

Determining whether recheck is required can be done at the Init phase. For aggregate workflows this works, because aggregates do not occur inside loops, so we are not concerned about rescans. For non-aggregate workflows, we need to recompute whether recheck is required on each rescan. Some tests were added to validate this.

To access the `recheck_required` field at init time, we need a `YbGetBitmapScanRecheckRequired` function that traverses the bitmap tree. We might as well use the same approach to determine whether recheck is required by any of the rescans that occurred. This allows `recheck` to be removed from `YbTIDBitmap`.

Jira: DB-10128

Test Plan:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressAggregates'
```
Tested with randgen using a slightly modified grammar to increase the number of aggregates produced.

Reviewers: amartsinchyk, jason
Reviewed By: jason
Subscribers: yql
Differential Revision: https://phorge.dev.yugabyte.com/D35787
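The eligibility rule spelled out above can be summarized in a short sketch. This is an illustrative Python model, not the actual C executor code; the function name and argument shapes are invented for clarity:

```python
# Illustrative model (not YugabyteDB's actual C implementation) of the
# pushdown-eligibility rule from the commit message: aggregate pushdown on a
# YB Bitmap Table Scan is avoided whenever any local quals are present.

def can_pushdown_aggregate(basic_local_quals,
                           recheck_local_quals,
                           work_mem_local_quals):
    """Return True if the aggregate may be pushed down to the storage layer.

    Any kind of local qual blocks pushdown: basic local quals are already
    handled by the existing aggregate pushdown framework, recheck quals only
    matter when recheck is required, and work_mem quals only matter on
    overflow -- but since overflow is unknowable until the bitmap scans
    complete, their mere presence conservatively blocks pushdown.
    """
    return not (basic_local_quals
                or recheck_local_quals
                or work_mem_local_quals)
```

The key design choice modeled here is the conservative handling of work_mem quals: because overflow cannot be predicted before the bitmap scans finish, their presence alone disables pushdown.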
jasonyb
pushed a commit
that referenced
this issue
Jul 3, 2024
Summary:

Excluded: eb77547 [#21886] DocDB: Add read-time option of ysql_dump to --help output
5b193c8 [#22807] docdb: Add gflag to control of RWCLock debug logging
Excluded: 471c7c4 [#22967] YSQL: Import 'Repair more failures with SubPlans in multi-row VALUES lists.'
Excluded: 8142fdc [#23013] XClusterDDLRepl: Link producer table to consumer table by table id
7dc7da5 [DOC-403] Added PG CVE table (#22913)
cf17238 [PLAT-14557][xCluster] DrConfigGetResp.class must be passed as the DR GET API response to swagger
81a5b3c [PLAT-14535][PLAT-14149] Azure provider validation fixes
15f302d [#22619] DocDB: Fix test issue with PgGetLockStatusTestRF3.TestLocksOfSingleShardWaiters
b7c3da8 [DEVOPS-3144] build: Remove release_package_docker_test sanity test from jenkins build
3106dc0 [PLAT-14165] Fixed unit tests and ran swagger gen
ab09fc1 [#23035] DocDB: Disable disk full checks on ASAN
Excluded: f86020d [#23044] xCluster: Move xCluster members from CatalogManager to XClusterManager
8d929e9 [PLAT-14173] UserIntentOverrides into v2 UniverseSpec
19c6049 [PLAT-14536] Missing XmlElement dependency
e2e9f93 [PLAT-14473][PLAT-14554] Add gflags from groups at runtime
Excluded: b8fedf6 [#21198] YSQL: Pushdown Aggregates on YB Bitmap Table Scans

Test Plan: Jenkins: rebase: pg15-cherrypicks

Reviewers: jason, tfoucher
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D36374
timothy-e
added a commit
that referenced
this issue
Jul 4, 2024
…e Scans

Summary: Original commit: b8fedf6 / D35787.

Jira: DB-10128

Test Plan:
```
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressYbBitmapScans'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressAggregates'
```
Tested with randgen using a slightly modified grammar to increase the number of aggregates produced.

Reviewers: amartsinchyk, jason
Reviewed By: jason
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D36372
timothy-e
added a commit
that referenced
this issue
Jul 11, 2024
…itmap Table Scans

Summary: Differences:
- indexam.c: conflict because yb_amgetbitmap had its third argument removed by D35787, but its parent object was renamed from `rd_amroutine` to `rd_indam` in pg15.
- nodeYbBitmapTablescan.c: rewrote `if (node->aggrefs)` to match the similar section from `nodeYbSeqScan.c`, because pg15 changed the signatures of `CreateTemplateTupleDesc` and `ExecInitScanTupleSlot`.
- `execnodes.h`: conflict because pg15 changed the type of the last field and I added a new last field to the `YbBitmapIndexScanState` struct. Trivial to resolve.

Original commit: b8fedf6 / D35787.

Jira: DB-10128

Test Plan:
```
pg15_tests/test_yb_bitmap_scans.sh
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressAggregates'
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressThirdPartyExtensionsPgHintPlan'
```

Reviewers: jason, tfoucher
Reviewed By: jason
Subscribers: yql
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D36384
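The `YbGetBitmapScanRecheckRequired` traversal referenced in the original commit (b8fedf6 / D35787) walks the bitmap plan tree to see whether any leaf index scan forces a recheck. The following is a hypothetical Python sketch of that idea; the node classes and the conservative `any()` combination are assumptions for illustration, not the real C data structures:

```python
# Hypothetical model of a bitmap scan tree: BitmapAnd/BitmapOr inner nodes
# over Bitmap Index Scan leaves. Sketches how a function like
# YbGetBitmapScanRecheckRequired might walk the tree; the real implementation
# is C code in YugabyteDB's executor.

class BitmapIndexScan:
    """Leaf node: one index scan contributing ybctids to the bitmap."""
    def __init__(self, recheck_required):
        self.recheck_required = recheck_required

class BitmapOp:
    """Inner node standing in for BitmapAnd / BitmapOr plan nodes."""
    def __init__(self, children):
        self.children = children

def recheck_required(node):
    """Conservatively report recheck if any leaf in the tree requires one."""
    if isinstance(node, BitmapIndexScan):
        return node.recheck_required
    return any(recheck_required(child) for child in node.children)
```

Because the answer depends only on static per-leaf flags, the same traversal can run once at Init time for aggregate plans and again on each rescan for non-aggregate plans, which is what allows the `recheck` field to be dropped from `YbTIDBitmap`.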
Jira Link: DB-10128
Description
Currently bitmap scans can avoid fetching the rows for a `SELECT COUNT(*)` query if no recheck is required, because they can just count the number of ybctids. If recheck is required, or if we have a different aggregate (e.g. MIN, MAX, SUM, AVG), we need to fetch all the rows. Pushing down the aggregate would reduce the number of RPCs done by the Table Scan node to 1.
Make sure to also handle table expression pushdown.
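The cost difference described above can be shown with a toy model. This is not real YugabyteDB code; the function and its cases are invented to illustrate how many row-level fetches each scenario in the description implies:

```python
# Toy model contrasting the three cases in the description: COUNT(*) without
# recheck counts ybctids only, other aggregates without pushdown must fetch
# every matching row, and pushdown lets the storage layer return a single
# pre-aggregated result. Simplified: recheck interactions are not modeled.

def rows_fetched(num_matching, aggregate, recheck, pushdown):
    if aggregate == "count" and not recheck and not pushdown:
        return 0              # count ybctids only; no row data fetched
    if pushdown and not recheck:
        return 1              # one pre-aggregated response from storage
    return num_matching       # must fetch all matching rows
```

For a table with 1000 matching rows, `SUM` without pushdown fetches all 1000 rows, while with pushdown it receives a single aggregated response, which is the reduction the enhancement targets.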
Issue Type
kind/enhancement