-
Notifications
You must be signed in to change notification settings - Fork 695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PG17 compatibility (#7653): Fix test diffs in columnar schedule #7768
Conversation
8bcee73
to
15b3bf0
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## release-13.0 #7768 +/- ##
===============================================
Coverage ? 89.65%
===============================================
Files ? 274
Lines ? 59583
Branches ? 7436
===============================================
Hits ? 53418
Misses ? 4033
Partials ? 2132 |
c396ce6
to
a12026b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, good fix on columnar_chunk_filtering
test.
I am a bit reluctant on the alternative output of columnar_paths
, but I also don't have another idea, because that particular test is part of a long comparison between custom scan, index scan etc, therefore removing that test altogether would break the meaning of the test. So, I am approving, and I leave it up to you.
Reminder to change the base branch to release-13.0
and drop the "enable configure" commit
Add an alternative file for columnar_paths. The diff is because of PG17 commit: postgres/postgres@f7816ae which enabled the planner to use statistics to estimate the cardinality of CTE scans. Add a helper function for columnnar_chunk_filtering. The diff is due to: postgres/postgres@fd0398fc which changed how subquery outputs appear in expressions.
15b3bf0
to
1d11d3d
Compare
…s. (#7785) PG17 added support for identity columns in partitioned tables: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=699586315 A consequence is that a table with an identity column cannot be attached as a partition. But Citus on Postgres 17 will generate identity column for the partitions if the parent table has one (or more) identity columns when propagating distributed table DDL to worker nodes, as happens in the `generated_identity` regress test in #7768: ``` CREATE TABLE partitioned_table ( a bigint CONSTRAINT myconname GENERATED BY DEFAULT AS IDENTITY (START WITH 10 INCREMENT BY 10), b bigint GENERATED ALWAYS AS IDENTITY (START WITH 10 INCREMENT BY 10), c int ) PARTITION BY RANGE (c); CREATE TABLE partitioned_table_1_50 PARTITION OF partitioned_table FOR VALUES FROM (1) TO (50); CREATE TABLE partitioned_table_50_500 PARTITION OF partitioned_table FOR VALUES FROM (50) TO (1000); SELECT create_distributed_table('partitioned_table', 'a'); - create_distributed_table ---------------------------------------------------------------------- - -(1 row) - +ERROR: table "partitioned_table_1_50" being attached contains an identity column "a" +DETAIL: The new partition may not contain an identity column. ``` It is the Citus-generated ATTACH PARTITION statement that errors out, because the Citus-generated CREATE TABLE for the partitions included identity column definitions. The fix is straightforward - when propagating the CREATE TABLE ddl for a partition of a table with an identity column, don't include the identity column(s), they will be inherited on attaching the partition. In Citus on Postgres 16 (or less) partitions do not inherit identity; the partitions in the example would not have any identity columns so it was not an issue previously.
This is the final commit that adds PG17 compatibility with Citus's current capabilities. You can use Citus community, release-13.0 branch, with PG17.1. --------- Specifically, this commit: - Enables PG17 in the configure script. - Adds PG17 tests to CI using test images that have 17.1 - Fixes an upgrade test: see below for details In `citus_prepare_upgrade()`, don't drop any_value when upgrading from PG16+, because PG16+ has its own any_value function. Attempting to do so results in the error seen in [pg16-pg17 upgrade](https://github.com/citusdata/citus/actions/runs/11768444117/job/32778340003?pr=7661): ``` ERROR: cannot drop function any_value(anyelement) because it is required by the database system CONTEXT: SQL statement "DROP AGGREGATE IF EXISTS pg_catalog.any_value(anyelement)" ``` When 16 becomes the minimum supported Postgres version, the drop statements can be removed. --------- Several PG17 Compatibility commits have been merged before this final one. All these subtasks are done #7653 See the list below: Compilation PR: #7699 Ruleutils PR: #7725 Sister PR for tests: citusdata/the-process#159 Helpful smaller PRs: - #7714 - #7726 - #7731 - #7732 - #7733 - #7738 - #7745 - #7747 - #7748 - #7749 - #7752 - #7755 - #7757 - #7759 - #7760 - #7761 - #7762 - #7765 - #7766 - #7768 - #7769 - #7771 - #7774 - #7776 - #7780 - #7781 - #7785 - #7788 - #7793 - #7796 --------- Co-authored-by: Colm <colmmchugh@microsoft.com>
This PR fixes diffs in
columnnar_chunk_filtering
andcolumnar_paths
tests.In
columnnar_chunk_filtering
an expression(NOT (SubPlan 1))
changed to(NOT (ANY (a = (SubPlan 1).col1)))
. This is due to aPG17 commit that improved how scalar subqueries (InitPlans) and ANY subqueries (SubPlans) are EXPLAINed in expressions. The fix uses a helper function which converts the PG17 format to the pre-PG17 format. It is done this way because pre-PG17 EXPLAIN does not provide enough context to convert to the PG17 format. The helper function can (and should) be retired when 17 becomes the minimum supported PG.In
columnar_paths
, a merge join changed to a hash join. This is due to this PG17 commit, which improved the PG optimizer's ability to estimate the size of a CTE scan. The impacted query involves a CTE scan with a point predicate(a=123)
and before the change the CTE size was estimated to be 5000, but with the change it is correctly (given the data in the table) estimated to be 1, making hash join a more attractive join method. The fix is to have an alternative goldfile for pre-PG17. I tried, but was unable, to force a specific kind of join method using the GUCs (enable_nestloop
,enable_hashjoin
,enable_mergejoin
), but it was not possible to obtain a consistent plan across all supported PG versions (in some cases the join inputs switched sides).