-
Notifications
You must be signed in to change notification settings - Fork 686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PG17 compatibility: fix plan diffs in multi_explain #7780
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## release-13.0 #7780 +/- ##
===============================================
Coverage ? 89.65%
===============================================
Files ? 274
Lines ? 59592
Branches ? 7436
===============================================
Hits ? 53427
Misses ? 4035
Partials ? 2130 |
1e55f40
to
1c39f95
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice fix, given that we mostly care about the complex subquery being pushed down in these queries.
Yet again avoiding the alternative test output, so great!
Reminder to rebase to release-13.0
, and drop the enable configure commit before merging.
c0baef0
to
7f7af62
Compare
This is the final commit that adds PG17 compatibility with Citus's current capabilities. You can use Citus community, release-13.0 branch, with PG17.1. --------- Specifically, this commit: - Enables PG17 in the configure script. - Adds PG17 tests to CI using test images that have 17.1 - Fixes an upgrade test: see below for details In `citus_prepare_upgrade()`, don't drop any_value when upgrading from PG16+, because PG16+ has its own any_value function. Attempting to do so results in the error seen in [pg16-pg17 upgrade](https://github.com/citusdata/citus/actions/runs/11768444117/job/32778340003?pr=7661): ``` ERROR: cannot drop function any_value(anyelement) because it is required by the database system CONTEXT: SQL statement "DROP AGGREGATE IF EXISTS pg_catalog.any_value(anyelement)" ``` When 16 becomes the minimum supported Postgres version, the drop statements can be removed. --------- Several PG17 Compatibility commits have been merged before this final one. All these subtasks are done #7653 See the list below: Compilation PR: #7699 Ruleutils PR: #7725 Sister PR for tests: citusdata/the-process#159 Helpful smaller PRs: - #7714 - #7726 - #7731 - #7732 - #7733 - #7738 - #7745 - #7747 - #7748 - #7749 - #7752 - #7755 - #7757 - #7759 - #7760 - #7761 - #7762 - #7765 - #7766 - #7768 - #7769 - #7771 - #7774 - #7776 - #7780 - #7781 - #7785 - #7788 - #7793 - #7796 --------- Co-authored-by: Colm <colmmchugh@microsoft.com>
Regress test
multi_explain
has two queries that have a different query plan with PG17. Here is part of the plan diff for the query labelled Union and left join subquery pushdown inmulti_explain.sql
(for the complete diff, search formulti_explain
here):The change is the same in both queries; a hash left join with subquery_1 on the outer and subquery_2 on the inner side of the join is now a nested loop left join with subquery_1 on the outer and subquery_2 on the inner; additionally, the chosen method of uniquifying the UNION in subquery_1 has changed from hashed grouping to sort followed by unique, as shown in the diff above.
The PG17 commit that caused this plan change is likely Fix MergeAppend to more accurately compute the number of rows that need to be sorted because it impacts the estimated rows counts of UNION paths. Comparing a costed plan of the query between PG16 and PG17 I noticed that with PG16 the rows estimate for the UNION in subquery_1 is 4, whereas with PG17 the rows estimate is 2. A lower rows estimate in the outer side of the join may result in nested loop looking cheaper than hash join for the left outer join, hence the plan change in the two queries where there is a UNION on the outer side of a left outer join.
The proposed fix achieves a consistent plan across all supported postgres versions by temporarily disabling nested loop join and sort for the two impacted queries; the postgres optimizer selects hash join for the outer left join and hashed aggregation for the UNION operation. I investigated tweaking the queries, but was not able to arrive at a consistent plan, and I believe the SQL operator (e.g. join, group by, union) implementations are orthogonal to the intent of the test, so this should be a satisfactory solution, particularly as it avoids introducing a second alternative output file for
multi_explain
.