
CherryPick Resolve a FIXME in merge_leaf_stats() etc #827

Merged: 41 commits, Jan 2, 2025

Conversation

yjhjstz

@yjhjstz yjhjstz commented Dec 31, 2024

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


chrishajas and others added 30 commits December 31, 2024 12:52
This commit includes a couple of improvements when dumping minirepros and gpsd:

1. Dump the number of segments. This is needed to replicate the plan,
and we often have to ask users/customers for it separately. Instead, just
retrieve it during the minirepro.
2. Output a `set optimizer=off` for when the minirepro is being
restored. The short insert statements are much faster (~10X) using the
planner than Orca, especially in a debug build. Output this by
default so I don't have to modify this manually.
The QE notice message might be truncated because MPPnoticeReceiver()
limits the message length sent to the client to about 800 bytes. The
client should not see an incomplete notice message.

This commit changes that behaviour: it allows QE messages of any length
(including the QE identifier) to be sent to the client.
…. (#14108)

In GPDB, expression subquery is promoted to join, e.g.
select * from T where a > (select 10*avg(x) from R where T.b=R.y);
in postgres, the plan is:

                          QUERY PLAN
-----------------------------------------------------------------
 Seq Scan on t  (cost=0.00..80364.30 rows=753 width=8)
   Filter: ((a)::numeric > (SubPlan 1))
   SubPlan 1
     ->  Aggregate  (cost=35.53..35.54 rows=1 width=32)
           ->  Seq Scan on r  (cost=0.00..35.50 rows=10 width=4)
                 Filter: (t.b = y)

while in GPDB, the subquery is promoted to join:

                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)
   ->  Hash Join  (cost=477.00..969.11 rows=9567 width=8)
         Hash Cond: (t.b = "Expr_SUBQUERY".csq_c0)
         Join Filter: ((t.a)::numeric > "Expr_SUBQUERY".csq_c1)
         ->  Seq Scan on t  (cost=0.00..321.00 rows=28700 width=8)
         ->  Hash  (cost=472.83..472.83 rows=333 width=36)
               ->  Subquery Scan on "Expr_SUBQUERY"
                     ->  HashAggregate
                           Group Key: r.y
                           ->  Seq Scan on r
 Optimizer: Postgres query optimizer

But if T.b and R.y have different types, it reports an error like:
ERROR:  operator 37 is not a valid ordering operator (pathkeys.c:579)

The root cause is that IsCorrelatedEqualityOpExpr may generate an incorrect
sort operator and equality operator for the SortGroupClause when the types of
the left and right operands of the OpExpr differ.

Since get_ordering_op_properties() requires the types of the left and right
operands to be consistent, and SortGroupClause->tleSortGroupRef references
the target entry created from innerExpr, we teach IsCorrelatedEqualityOpExpr()
to construct the appropriate eqOp and sortOp according to the type of innerExpr.
Previously, when using minirepro on a query that accessed materialized
views, the objects that the materialized view used/accessed were not
dumped with the minirepro. This is because queries themselves treat
materialized views as tables, not views. However, when enumerating the
dependent objects for dumping the relevant DDL, we need to treat
materialized views as regular views.

This commit sets a flag when dumping from gp_dump_query_oids so that
materialized views are treated as regular views when parsing the query.
This commit adds calls to `RelationOpenSmgr()` in `open_all_datumstreamread_segfiles()` and `openFetchSegmentFile()` to ensure that the storage manager (Smgr) relation is opened before accessing Append-Optimized (AO) segment files.

Without explicitly opening the Smgr relation, accessing `rel->rd_smgr->smgr_ao` could lead to undefined behavior or errors when the Smgr is not initialized. This change ensures stability and correctness when working with AO segment files.
For GPDB5 pg_tablespace.spclocation is used
For GPDB6+ pg_catalog.pg_tablespace_location(oid) is used

Authored-by: Brent Doil <bdoil@vmware.com>
We don't drop partitioned tables with indexes for the pg_upgrade regression test on GPDB7.
Remove the commented out commands and FIXME.

Authored-by: Brent Doil <bdoil@vmware.com>
The same value is already assigned to the variable at the moment
of declaration (a few lines earlier).
Co-authored-by: CharlieTT <chaotian@vmware.com>

Fix a crash of PG_TRY caused by a return statement in the PG_TRY block. When we
return a value inside PG_TRY, the setjmp/longjmp context is left inconsistent,
so we cannot longjmp to the right address once an ereport(ERROR) occurs later,
and a crash follows.
On PG12, a SKIP_LOCKED option was added to VACUUM and ANALYZE to indicate
whether to skip tables when a lock cannot be acquired at the time.
However, the relevant tests don't cover all the scenarios on GPDB. This PR
adds more tests to cover them.
In GP's convert_ANY_sublink_to_join() there is currently no simplification of
the original subquery, so sometimes the sublink cannot be pulled up,
which produces a bad query plan. GP5 had a subquery-simplification
step that helped sublink pull-up, but we lost it from 6X through master,
so the pull-up stops working, since only simple subqueries are allowed to
be pulled up. This commit brings back cdbsubselect_drop_distinct and
cdbsubselect_drop_orderby, so we get the simplified subquery again and can
perform the sublink pull-up optimization on it, producing a better plan.
Commit faa7dc0713 fixed the fallback but forgot to remove the
FIXME.
The case tests some code paths during parallel hash join, but
Greenplum does not support Postgres' parallel plans. Change the FIXME
to a NOTE.
Historically in GPDB6, when executing GRANT/REVOKE on a partition
root, we will recurse into its child tables. We have decided to
keep this behavior for 7X, but to also support the ONLY keyword
so that users can choose to grant/revoke just the partition root,
without recursion. The syntax is e.g.:

GRANT SELECT ON ONLY foo_par TO role;

Discussion: https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/ImJrvQEECwA
Now the SQL does not fall back to the planner.
The name of UpdateSerializableCommandId is a bit wrong, now that SERIALIZABLE
and REPEATABLE READ are not the same. For comparison, the if-check above was
changed from checking IsXactIsoLevelSerializable to IsolationUsesXactSnapshot().

This commit renames it and resolves that GPDB_91_MERGE_FIXME.

	commit 5eb15c9
	Author: Joe Conway <mail@joeconway.com>
	Date:   Sat Sep 11 18:38:58 2010 +0000

	    SERIALIZABLE transactions are actually implemented beneath the covers with
	    transaction snapshots, i.e. a snapshot registered at the beginning of
	    a transaction. Change variable naming and comments to reflect this reality
	    in preparation for a future, truly serializable mode, e.g.
	    Serializable Snapshot Isolation (SSI).

	    For the moment transaction snapshots are still used to implement
	    SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
	    Grittner and Dan Ports with review and some minor wording changes by me.
Historically in GPDB6, when executing ALTER TABLE SET TABLESPACE
on a partition root, we will recurse into its child tables. This
behavior differs from the upstream where it is not recursed. We
have decided to keep this behavior for 7X, but to also support
the ONLY keyword so that users can choose to alter just the
partition root. Similar to other ALTER TABLE ONLY statements,
the syntax is e.g.:

ALTER TABLE ONLY foopart SET TABLESPACE myts

Discussion: https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/ImJrvQEECwA
Gpload regression case apache#50 would fail
if the OS user that installed the gpdb cluster is not gpadmin.
Since it specified a hardcoded user gpadmin,
we use PGUSER instead to fix it.

Co-authored-by: wuchengwen <wcw190496@alibaba-inc.com>
Previous commit ebb691a03 fixed ANY sublinks containing distinct-on
syntax on partial exprs in the target list of the subselect. That fix
simply avoided dropping the distinct or order-by clause, which may leave
some sublinks unable to be pulled up and cause bad performance; below is an
example of such a regression:

  select a from t where a in (select b from t1 group by b)

For the above SQL, if we do not drop the group-by clause, the planner
will later consider the subselect not simple and fail to pull it up
to a join.

This commit fixes the regression by:

1. reverting the logic of ebb691a03
2. fixing issue #12656 the correct way: if
   the length of the distinct-on clause is not
   the same as the length of the subselect's
   targetlist, we do not drop the distinct clause.

Authored-by: Jian Guo <gjian@vmware.com>

Co-authored-by: Zhenghua Lyu <kainwen@gmail.com>
Added code to print thread id for passed or failed status to easily
correlate the message result with the data directory. Some minor
formatting along the way.
FIXMEs were related to adding support for checking dynamic access
methods. Given we are far from supporting all built-in access
methods, dynamic support is not on the radar at all. Plus, adding a
new access method is not a common act. So we can cross the
bridge of converting RelationTypeData to a dynamic array instead of
the static array when we get there. Hence, for now, resolve the
FIXME by manually updating the static array to reflect all the current
access methods.

Note: with this change, "ao" now gets split into ao_row and
ao_column. This is nice, as we can run more granular checking if desired.

Also, converted GPDB_94_MERGE_FIXME to GPDB_FEATURE_NOT_SUPPORTED in the
Makefile, since we have yet to test/validate/enable this tool for various
index access methods; that should be done separately.
Previously, gp_gettmid() returned PgStartTime as the start time of gpdb ("tmid") and saved it into instrument shared memory, but the value is wrong after converting to `TimestampTz` (which is int64). Since the value is saved in shared memory, and GPCC has been using a 32-bit integer for tmid, adapting to int64 would require a lot of changes in both GPDB's and GPCC's code. Therefore, in this commit, gp_gettmid() converts PgStartTime to a correct int32 value instead.

Co-authored-by: Violet Cheng <vcheng@vmware.com>
The effort was duplicated with set_append_path_locus(), which is called
from create_append_path(). Let's just let set_append_path_locus() deal
with it.

Reviewed-by: Haotian Chen <chaotian@vmware.com>
Reviewed-by: Zhenghua Lyu <kainwen@gmail.com>
External tables with dropped columns failed to upgrade with:
```
pg_restore: creating EXTERNAL TABLE ext_upgrade_failure_test
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 237; 1259 18142 EXTERNAL TABLE ext_upgrade_failure_test gpadmin
pg_restore: [archiver (db)] could not execute query: ERROR:  syntax error at or near ","
LINE 14:     "........pg.dropped.1........" ,
                                            ^
```

Fix pg_dump to output the ALTER TABLE DROP COLUMN DDL for external
tables with dropped columns.
This commit makes the DML state nomenclature more canonical, cleans up
struct declarations etc.
The backend local hash table mechanism works well enough for storing
DML/DML-like command-bound state. Lift the FIXME that questions its
applicability.

Discussion: https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/g_RG4r_9W3s/m/VJNafJibEQAJ
Grouping them with the other uao_dml tests led to flakes. Move them to a
new group.

Heed the words of wisdom:

"These tests use gp_select_invisible and VACUUM, and will get confused
if there are concurrent transactions holding back the global xmin."
Fixes https://github.com/greenplum-db/gpdb/issues/13819 and also aligns with upstream.

We don't need to set up the tablespace when calling `make` or `make
install`; it should be set up only for the right target, like `check` or
`check-tests`.
zhjwpku and others added 7 commits December 31, 2024 12:54
The `SET_VAR` function is identical in gpcreateseg.sh and
gpinitsystem, since they both source the file gp_bash_functions.sh.
Move `SET_VAR` into gp_bash_functions.sh.
Test case "dispatch" started failing on PLANNER CI after commit a7ca9e27.
Using GitHub issue #14428 for tracking. It seems somehow triggered by
the test added in bfv_partition in the aforementioned commit.
…. (#14418)

As #14417 reported, there will be assertion failures in some cases when
executing ModifyTable.

This is because we opened all the range table relations in NoLock mode,
but sometimes with no lock held by us.

We fixed this by using GpPolicyFetch instead of opening relations since we
only need to judge if the range table is a partitioned table. This fix also
resolved a GPDB_91_MERGE FIXME left around range table locking code.
…d unchanged. (#13499)

CLUSTER operation on append-optimized tables does not compute freeze limit
(frozenXid) because AO tables do not have relfrozenxid.  The toast tables
need to keep existing relfrozenxid value unchanged in this case.
Earlier, gpcheckcat used to error out when a root partition has a different distribution policy than a child partition. But commit a45be43 introduced a new statement, `ALTER TABLE EXPAND PARTITION PREPARE`, which modifies the distribution policy of the leaf partitions to be randomly distributed. Based on this, a policy is good if the following conditions are satisfied:

- The numsegments value is equal for all root, middle, and leaf level partitions.
- If a root is randomly distributed, then all middle and leaf level partitions must also be randomly distributed.
- If a root is hash distributed, then the middle level should be the same as the root, and leaf level partitions can be hashed on the same key or randomly distributed.

This commit modifies the existing SQL query so that whenever a parent is hash distributed, the child can be randomly distributed if and only if it is a leaf level partition.

Also added a few behave test cases for the same.
For a partitioned table to have a correct distribution policy, the numsegments value for all root, middle and leaf level partitions must be equal. This commit adds this check for gpcheckcat. Readable external partitions are excluded from this check.

Sample log output for numsegments check failure:
20221103:11:36:30:098659 gpcheckcat:jnihal-a01:jnihal-[INFO]:-[FAIL] partition numsegments check
20221103:11:36:30:098659 gpcheckcat:jnihal-a01:jnihal-[ERROR]:-partition numsegments check found 1 issue(s)
[ERROR]: child partition(s) have different numsegments value from the root partition. Check the gpcheckcat log for details.
20221103:11:36:30:098659 gpcheckcat:jnihal-a01:jnihal-[ERROR]:-The following tables have different numsegments value:
20221103:11:36:30:098659 gpcheckcat:jnihal-a01:jnihal-[ERROR]:---------
20221103:11:36:30:098659 gpcheckcat:jnihal-a01:jnihal-[ERROR]:-  table | affected child | parent numsegments value | child numsegments value
20221103:11:36:30:098659 gpcheckcat:jnihal-a01:jnihal-[ERROR]:-  ------+----------------+--------------------------+------------------------
20221103:11:36:30:098659 gpcheckcat:jnihal-a01:jnihal-[ERROR]:-  public.sales_1_prt_10 | public.sales_1_prt_10_2_prt_asia | 3 | 1
The pg_ctl promote wait logic checks for IN_PRODUCTION state in the
promoted coordinator's control file to determine if the standby
coordinator has successfully been promoted or not. Unfortunately, that
IN_PRODUCTION state does not guarantee that the promoted coordinator
is ready to accept database connections yet. Make gpactivatestandby do
a retry loop on its forced CHECKPOINT logic to guarantee that the
CHECKPOINT goes through and immediately reflects the new timeline
id. With this patch, intermittent gpactivatestandby Behave test
flakes seen frequently in the Concourse pipeline are resolved.

The logic in this patch is also actually somewhat of a revert to
previous logic as we used to do the very same retry loop. The retry
loop was removed because we wrongly assumed pg_ctl promote wait logic
would wait until database connections were ready to be accepted. The
try/catch would actually catch the error but it was being ignored
(which was intentional from the previous retry logic but wrong in the
non-retry logic).

GPDB commit references:
https://github.com/greenplum-db/gpdb/commit/1fe901d4bb9b141fd6931977eb93515982ac6edd
https://github.com/greenplum-db/gpdb/commit/5868160a73ba9cfcebdb61820635301bfb8cc0c1
@yjhjstz yjhjstz added the cherry-pick (cherry-pick upstream commits) label Dec 31, 2024
yjhjstz and others added 3 commits December 31, 2024 15:59
After we reload the config using pg_reload_conf, the ic-proxy process calls
ic_proxy_server_on_signal to process it. That function compares the new addrs
with the old addrs; both the old and new lists must be sorted by dbid, and the
caller is responsible for ensuring this. If an addr is removed, it is
disconnected; if some addrs are added, we establish the connections. However,
the old addrs may not be sorted by dbid: even though the GUC
gp_interconnect_proxy_addresses is sorted, uv_getaddrinfo is asynchronous and
cannot guarantee that the addrs parsed from gp_interconnect_proxy_addresses
are added to ic_proxy_addrs sequentially. So some addr may be mis-disconnected,
leading to `select gp_segment_id, pg_reload_conf() from gp_dist_random('gp_id')`
failing with "interconnect error: connection closed prematurely".

Before reloading the config file, we should sort ic_proxy_addrs by dbid to
avoid mis-disconnecting addrs.
hash_search(,,HASH_REMOVE,) returns a dangling pointer that shouldn't be
dereferenced.
A FIXME was left questioning whether to use AccessShareLock or NoLock
when opening child tables to merge their stats. Since we have already
acquired ShareUpdateExclusiveLock on the parent table by the time we
are in merge_leaf_stats(), we don't need an extra lock to guard
against a concurrent DROP of either the parent or the child (which
requires AccessExclusiveLock on the parent).
Concurrent UPDATEs are possible, but because we are not updating the table
ourselves, NoLock is sufficient here.
@my-ship-it (Contributor) previously approved these changes Dec 31, 2024

LGTM

@yjhjstz yjhjstz merged commit 4bd2d57 into apache:main Jan 2, 2025
22 checks passed
Labels
cherry-pick (cherry-pick upstream commits)