Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PG17 compatibility (#7653): Fix test diffs in columnar schedule #7768

Merged
merged 1 commit into from
Dec 3, 2024

Conversation

colm-mchugh
Copy link
Contributor

@colm-mchugh colm-mchugh commented Nov 25, 2024

This PR fixes diffs in columnnar_chunk_filtering and columnar_paths tests.

In columnnar_chunk_filtering an expression (NOT (SubPlan 1)) changed to (NOT (ANY (a = (SubPlan 1).col1))). This is due to aPG17 commit that improved how scalar subqueries (InitPlans) and ANY subqueries (SubPlans) are EXPLAINed in expressions. The fix uses a helper function which converts the PG17 format to the pre-PG17 format. It is done this way because pre-PG17 EXPLAIN does not provide enough context to convert to the PG17 format. The helper function can (and should) be retired when 17 becomes the minimum supported PG.

In columnar_paths, a merge join changed to a hash join. This is due to this PG17 commit, which improved the PG optimizer's ability to estimate the size of a CTE scan. The impacted query involves a CTE scan with a point predicate (a=123) and before the change the CTE size was estimated to be 5000, but with the change it is correctly (given the data in the table) estimated to be 1, making hash join a more attractive join method. The fix is to have an alternative goldfile for pre-PG17. I tried, but was unable, to force a specific kind of join method using the GUCs (enable_nestloop, enable_hashjoin, enable_mergejoin), but it was not possible to obtain a consistent plan across all supported PG versions (in some cases the join inputs switched sides).

@colm-mchugh colm-mchugh self-assigned this Nov 25, 2024
@colm-mchugh colm-mchugh force-pushed the cmchugh/pg17-columnar-fixes branch from 8bcee73 to 15b3bf0 Compare November 25, 2024 16:54
Copy link

codecov bot commented Nov 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (release-13.0@5f479d5). Learn more about missing BASE report.

Additional details and impacted files
@@               Coverage Diff               @@
##             release-13.0    #7768   +/-   ##
===============================================
  Coverage                ?   89.65%           
===============================================
  Files                   ?      274           
  Lines                   ?    59583           
  Branches                ?     7436           
===============================================
  Hits                    ?    53418           
  Misses                  ?     4033           
  Partials                ?     2132           

@colm-mchugh colm-mchugh changed the title PG17 regress sanity (#7653): Fix test diffs in columnar schedule PG17 compatibility (#7653): Fix test diffs in columnar schedule Nov 26, 2024
@naisila naisila force-pushed the naisila/pg17_support branch from c396ce6 to a12026b Compare December 2, 2024 14:32
Copy link
Member

@naisila naisila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, good fix on columnar_chunk_filtering test.

I am a bit reluctant on the alternative output of columnar_paths, but I also don't have another idea, because that particular test is part of a long comparison between custom scan, index scan etc, therefore removing that test altogether would break the meaning of the test. So, I am approving, and I leave it up to you.

Reminder to change the base branch to release-13.0 and drop the "enable configure" commit

@colm-mchugh colm-mchugh changed the base branch from naisila/pg17_support to release-13.0 December 3, 2024 08:47
Add an alternative file for columnar_paths. The diff is because of PG17 commit:
postgres/postgres@f7816ae
which enabled the planner to use statistics to estimate the cardinality
of CTE scans.

Add a helper function for columnnar_chunk_filtering. The diff is due to:
postgres/postgres@fd0398fc
which changed how subquery outputs appear in expressions.
@colm-mchugh colm-mchugh force-pushed the cmchugh/pg17-columnar-fixes branch from 15b3bf0 to 1d11d3d Compare December 3, 2024 08:50
@colm-mchugh colm-mchugh merged commit 698699d into release-13.0 Dec 3, 2024
120 of 121 checks passed
@colm-mchugh colm-mchugh deleted the cmchugh/pg17-columnar-fixes branch December 3, 2024 09:14
colm-mchugh added a commit that referenced this pull request Dec 18, 2024
…s. (#7785)

PG17 added support for identity columns in partitioned tables:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=699586315
A consequence is that a table with an identity column cannot be attached
as a partition. But Citus on Postgres 17 will generate identity column
for the partitions if the parent table has one (or more) identity
columns when propagating distributed table DDL to worker nodes, as
happens in the `generated_identity` regress test in #7768:
```
 CREATE TABLE partitioned_table (
     a bigint CONSTRAINT myconname GENERATED BY DEFAULT AS IDENTITY (START WITH 10 INCREMENT BY 10),
     b bigint GENERATED ALWAYS AS IDENTITY (START WITH 10 INCREMENT BY 10),
     c int
 )
 PARTITION BY RANGE (c);
 CREATE TABLE partitioned_table_1_50 PARTITION OF partitioned_table FOR VALUES FROM (1) TO (50);
 CREATE TABLE partitioned_table_50_500 PARTITION OF partitioned_table FOR VALUES FROM (50) TO (1000);
 SELECT create_distributed_table('partitioned_table', 'a');
- create_distributed_table
----------------------------------------------------------------------
-
-(1 row)
-
+ERROR:  table "partitioned_table_1_50" being attached contains an identity column "a"
+DETAIL:  The new partition may not contain an identity column.
```
It is the Citus-generated ATTACH PARTITION statement that errors out,
because the Citus-generated CREATE TABLE for the partitions included
identity column definitions. The fix is straightforward - when
propagating the CREATE TABLE ddl for a partition of a table with an
identity column, don't include the identity column(s), they will be
inherited on attaching the partition. In Citus on Postgres 16 (or less)
partitions do not inherit identity; the partitions in the example would
not have any identity columns so it was not an issue previously.
naisila added a commit that referenced this pull request Dec 24, 2024
This is the final commit that adds
PG17 compatibility with Citus's current capabilities.

You can use Citus community, release-13.0 branch, with PG17.1.

---------

Specifically, this commit:

- Enables PG17 in the configure script.

- Adds PG17 tests to CI using test images that have 17.1

- Fixes an upgrade test: see below for details
In `citus_prepare_upgrade()`, don't drop any_value when upgrading from
PG16+, because PG16+ has its own any_value function. Attempting to do so
results in the error seen in [pg16-pg17
upgrade](https://github.com/citusdata/citus/actions/runs/11768444117/job/32778340003?pr=7661):
```
ERROR:  cannot drop function any_value(anyelement) because it is required by the database system
CONTEXT:  SQL statement "DROP AGGREGATE IF EXISTS pg_catalog.any_value(anyelement)"
```
When 16 becomes the minimum supported Postgres version, the drop
statements can be removed.

---------

Several PG17 Compatibility commits have been merged before this final one.
All these subtasks are done #7653

See the list below:

Compilation PR: #7699
Ruleutils PR: #7725
Sister PR for tests: citusdata/the-process#159

Helpful smaller PRs:
- #7714
- #7726
- #7731
- #7732
- #7733
- #7738
- #7745
- #7747
- #7748
- #7749
- #7752
- #7755
- #7757
- #7759
- #7760
- #7761
- #7762
- #7765
- #7766
- #7768
- #7769
- #7771
- #7774
- #7776
- #7780
- #7781
- #7785
- #7788
- #7793
- #7796

---------

Co-authored-by: Colm <colmmchugh@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants