Don't simplify uncorrelated EXISTS sublink in some cases #823
Merged
Conversation
Wait for #818 to be merged first.
my-ship-it previously approved these changes on Dec 30, 2024.
Greenplum, unlike upstream, tries to pull up a correlated EXISTS sublink that contains an aggregate, for example:

```
postgres=# create table t(a int, b int);
postgres=# create table s(a int, b int);
postgres=# explain (costs off) select * from t where exists (select sum(s.a) from s where s.a = t.a group by s.a);
                QUERY PLAN
------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)
   ->  Hash Join
         Hash Cond: (t.a = s.a)
         ->  Seq Scan on t
         ->  Hash
               ->  HashAggregate
                     Group Key: s.a
                     ->  Seq Scan on s
 Optimizer: Postgres query optimizer
(9 rows)
```

To support this, Greenplum changed the behavior of `convert_EXISTS_sublink_to_join()` and `simplify_EXISTS_query()`: `simplify_EXISTS_query()` now also simplifies an EXISTS sublink that contains an aggregate node, resetting its `targetList` and related clauses to empty:

```
query->targetList = NIL;
query->distinctClause = NIL;
query->sortClause = NIL;
query->hasDistinctOn = false;
```

However, when executing an uncorrelated EXISTS sublink that also contains an aggregate node, we must not reset the `targetList`; otherwise we cannot build a plan for an aggregate node that has no aggregate column, as reported in #11849. This patch fixes the issue by **not** simplifying the uncorrelated EXISTS sublink in that situation.
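For context, a minimal sketch of the problematic uncorrelated case, reusing the table names from the example above (the exact query from #11849 is not reproduced verbatim here):

```sql
-- Hedged sketch of the uncorrelated case this patch leaves unsimplified:
-- the sublink contains an aggregate but no reference to the outer query.
-- If simplify_EXISTS_query() reset its targetList here, the planner would be
-- left with an Aggregate node that has no aggregate column (see #11849).
explain (costs off)
select * from t
where exists (select sum(s.a) from s group by s.a);
```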
Commit 0f4af19b71f2f31acd28c39d0d0a865cbf1d3bab added this test, but the `analyze` statement exposed an unrelated issue and caused CI to fail. We should explicitly analyze tables to reduce runtime in tests anyway.
Previously, in commit 32b8646, we added the "M" flag to support a utility-mode connection to a segment that is configured as a mirror in the FTS configuration but is actually running as a primary. We now also add the "Mq" flag to support quitting from such a connection. We will make use of it in the next commit.
Keep aligned with upstream.
This removes the FIXMEs for the compute_xid_horizon_for_tuples() API, which was introduced in 558a916. This API is used to compute which snapshots to conflict with when replaying WAL records for page-level index vacuums. Page-level index vacuums don't even arise for AO/CO tables today due to the absence of HOT, so it is perfectly fine to leave this unimplemented. Details:

(1) The amgettuple() routine for index AMs can optionally kill off tuples if they are found dead in the underlying table scan. This is done with IndexScanDesc->kill_prior_tuple. For examples, see btgettuple(), hashgettuple() and gistgettuple().

(2) The index AMs typically have a killitems function (like _bt_killitems()), which among other things sets a garbage flag on the index page (like BTP_HAS_GARBAGE).

(3) Afterwards, if an index page has a garbage flag, it can undergo page-level vacuum, even inside a scan (see _bt_vacuum_one_page()). This is where table_compute_xid_horizon_for_tuples() is typically called and the latestRemovedXid is recorded in the WAL (e.g. XLOG_BTREE_DELETE).

(4) During replay on a hot standby, the latestRemovedXid is extracted from the record and then used to ResolveRecoveryConflictWithSnapshot().

Since AO/CO tables don't have HOT chains, in index_fetch_heap() all_dead is always set to false and thus IndexScanDesc->kill_prior_tuple is always false. Thus, we never even perform index page-level vacuum for indexes on AO/CO tables, let alone compute the xid horizon.
This commit confirms that the tuple_lock API can be safely left unimplemented for AO/CO tables. See below for details:

(1) UPSERT: ExecOnConflictUpdate() calls this, but upsert is not supported for AO/CO tables.

(2) DELETE and UPDATE triggers: GetTupleForTrigger() calls this, but these trigger types are not supported for AO/CO tables.

(3) Logical replication: RelationFindReplTupleByIndex() and RelationFindReplTupleSeq() call this, but we don't support logical replication yet for GPDB.

(4) For DELETEs/UPDATEs, this API is invoked when a state of TM_Updated is returned from table_tuple_delete() or table_tuple_update() respectively. However, that is impossible for AO/CO tables, as an AO/CO tuple cannot be deleted/updated while another transaction is updating it (see CdbTryOpenTable()).

(5) Row-level locking (SELECT FOR ..): ExecLockRows() calls this, but a plan containing the LockRows plan node is never generated for AO/CO tables. In fact, we lock at the table level instead; see 6ebce73 for details. addRangeTableEntry() sets pstate->p_canOptSelectLockingClause = false for AO/CO tables, which prevents such plans. Also, ORCA falls back to the planner when presented with a query carrying a locking clause. A hedged illustration of point (5) follows.
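```sql
-- Hedged illustration of point (5): a locking clause on an AO/CO table is not
-- expected to produce a LockRows plan node; locking happens at the table level
-- instead. The appendonly reloption syntax is an assumption and may differ
-- across Greenplum / Cloudberry versions (e.g. USING ao_row).
create table ao_t(a int, b int) with (appendonly = true);
explain (costs off)
select * from ao_t where a = 1 for update;
```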
…657) clang-12 has introduced -Wcompound-token-split-by-macro, that is causing a large amount of warnings when building PL/Perl because of its interactions with upstream Perl. This commit adds one -Wno to CFLAGS at ./configure time if the flag is supported by the compiler to silence all those warnings. Upstream perl has fixed this issue, but it is going to take some time before this is spread across the buildfarm, and we have noticed that some animals would be useful with an extra -Werror to help with the detection of incorrect placeholders (see b0cf544), dangomushi being one.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/YYr3qYa/R3Gw+Sbg@paquier.xyz
Backpatch-through: 10
(cherry picked from commit 9ff47ea414c4e6ace347fc4e59866e38b9bbceaf)
Co-authored-by: Michael Paquier <michael@paquier.xyz>
We don't support speculative insert (INSERT ... ON CONFLICT) for AO/CO tables yet. This commit does two things:

(1) Confirms that it's okay to keep the speculative insert APIs unimplemented, by removing the FIXMEs and adding ereport().

(2) Instead of relying on the ERROR at the table AM layer, it introduces a check at the parse analysis stage to perform the ban. This is important because we can't always rely on the table AM layer. For instance, with DO NOTHING, the table AM routine is invoked only if there is no conflict. If there is a conflict, the routine doesn't get invoked and the customer would see:

```
INSERT INTO spec_insert VALUES(1, 3) ON CONFLICT(i) DO NOTHING;
INSERT 0 0
```

This would leave the false impression that it is supported, so we ban it at the parse analysis stage.
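For illustration only, a hedged sketch of the behavior after the parse-analysis check (table name and storage options are assumptions, and the exact error text is not quoted from the patch):

```sql
-- Hedged sketch: with the parse-analysis ban in place, ON CONFLICT on an
-- AO/CO table should be rejected up front, whether or not a conflict occurs.
-- The appendonly reloption syntax is an assumption here.
create table spec_insert(i int, j int) with (appendonly = true);
insert into spec_insert values (1, 3) on conflict do nothing;  -- expected to fail at parse analysis
```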
`\!rm` to test if they are located in the right directory.
…(#14556) We used to have the same logic in create_grouping_paths() and cdb_create_multistage_grouping_paths() to determine whether the input is hashable / sortable:

```
can_sort = grouping_is_sortable(parse->groupClause);
can_hash = (parse->groupClause != NIL &&
            agg_costs->numPureOrderedAggs == 0 &&
            grouping_is_hashable(parse->groupClause));
```

The logic in create_grouping_paths() was changed by commit b563594 because, if we have grouping sets, we might be able to sort / hash some but not all of them:

```
can_sort = (gd && gd->rollups != NIL) || grouping_is_sortable(parse->groupClause);
can_hash = (parse->groupClause != NIL &&
            parse->groupingSets == NIL &&
            agg_costs->numOrderedAggs == 0 &&
            (gd ? gd->any_hashable : grouping_is_hashable(parse->groupClause)));
```

It's not necessary to support multistage grouping sets (unhashable_col, unsortable_col); the ordinary grouping path can cover such corner cases. So just remove this merge FIXME. A hedged example of such a corner case follows.
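```sql
-- Hedged sketch: grouping sets over a column assumed sortable-but-not-hashable
-- (bit) and one assumed hashable-but-not-sortable (xid), borrowing the column
-- types used by upstream's grouping-sets regression tests; their properties
-- are an assumption here, not something this commit states. Such mixes are
-- served by the ordinary grouping path rather than multistage grouping.
create table gstest(id int, v int, unhashable_col bit(4), unsortable_col xid);
explain (costs off)
select count(*)
from gstest
group by grouping sets ((unhashable_col), (unsortable_col));
```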
test_consume_xids(n int) is supposed to consume n xids, but it actually consumes n+1. The reason is that we initialize 'xid' to be the next xid, but while advancing we rewrite it to be the xid that we just allocated (the "current xid"). Fix it by making its meaning consistent.
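A hedged way to sanity-check the fixed behavior (test_consume_xids() is a test helper, so this assumes the extension providing it is installed):

```sql
-- Each txid_current() call consumes one xid of its own, which the expected
-- delta accounts for; this also assumes no concurrent xid consumers.
select txid_current();          -- say this returns X
select test_consume_xids(5);    -- should consume exactly 5 xids: X+1 .. X+5
select txid_current();          -- expect X+6 (5 consumed above, plus this call)
```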
GetCurrentDistributedSnapshotWithLocalMapping() has been dead since b3f300b.
avamingli changed the title from "[Draft] Don't simplify uncorrelated EXISTS sublink in some cases" to "Don't simplify uncorrelated EXISTS sublink in some cases" on Dec 30, 2024.
my-ship-it approved these changes on Dec 31, 2024.
jiaqizho approved these changes on Dec 31, 2024.
Test Plan
make installcheck
make -C src/test installcheck-cbdb-parallel