Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cherry-pick commit Support intermediate aggs in Orca plans (#13707) #741

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

leborchuk
Copy link
Contributor

@leborchuk leborchuk commented Nov 29, 2024

Here I cherry-picked commit Jun 28, 2022 Support intermediate aggs in Orca plans (#13707) greenplum-db/gpdb-archive@5bf9fd5

This commit conflicts with 0a2bc5d Move per-agg and per-trans duplicate finding to the planner. - refactoring made by hlinnaka@

The interesting thing is that the commit is from November 24th, 2020 (which is more than a year older than the CBDB fork date) and is absent in the original gpdb repository. Also, I didn't find a PR where it was committed. It seems for me like a CBDB feature.

So I decided to create separate PR only for that commit

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

⚠️ To skip CI: Add [skip ci] to your PR title. Only use when necessary! ⚠️


Orca in GPDB6 has support for intermediate aggs, which isn't used in postgres.
This is useful when we have a DQA and a regular "ride-along" agg.
However, we need to differentiate when we should run the
combine/final/trans functions when this ride-along agg is present.

This commit re-adds support for intermediate aggs. The logic here is the
same as 6X, however, instead of explicitly using the aggstage, we use
the aggsplit, which is determined from the aggstage. The logic is
defined in `AGGSPLIT_INTERNMEDIATE`.

The changes in nodeAgg.c are to allow the aggref and aggstate to differ
for an aggregate. This is necessary and expected in the case of an
intermediate agg, as the loop will iterate over each aggstate->aggs, but
the aggsplit can now be different between the aggref and the aggstate.
Thus the aggsplit references are also changed to use aggref instead of
aggstate.
@my-ship-it my-ship-it requested a review from jiaqizho December 2, 2024 02:34
@jiaqizho
Copy link
Contributor

jiaqizho commented Dec 2, 2024

need waitting the ic-good-opt-on enable... current CI have not cover the ORCA in ICW tests..

@my-ship-it
Copy link
Contributor

need waitting the ic-good-opt-on enable... current CI have not cover the ORCA in ICW tests..

@edespino Ed, appreciate your help on it.

@edespino
Copy link
Contributor

edespino commented Dec 2, 2024

need waitting the ic-good-opt-on enable... current CI have not cover the ORCA in ICW tests..

@edespino Ed, appreciate your help on it.

I have a branch with the changes. My repos's workflow is running: https://github.com/edespino/cloudberry/actions/runs/12115945979

@edespino
Copy link
Contributor

edespino commented Dec 2, 2024

@my-ship-it & @jiaqizho

My branch has a single commit based on this PR branch that is necessary to enable the ic-good-opt-on test configuration. Please note, it may have diffs that may or may not be related to this PR.

My branch: https://github.com/edespino/cloudberry/tree/refs/heads/CherryPickOrcaFeb2023_3-ic-good-opt-on
My workflow: https://github.com/edespino/cloudberry/actions/runs/12115945979

@avamingli avamingli added the cherry-pick cherry-pick upstream commts label Dec 4, 2024
@leborchuk
Copy link
Contributor Author

It seems to me that the ic-good-opt-off tests are flaky.

@edespino Ed, appreciate you help. ic-good-opt-on tests from Ed branch are green.

@my-ship-it
Copy link
Contributor

It seems to me that the ic-good-opt-off tests are flaky.

@edespino Ed, appreciate you help. ic-good-opt-on tests from Ed branch are green.

@leborchuk @jiaqizho We need to investigate what flaky and fix them, thanks!

@edespino
Copy link
Contributor

edespino commented Dec 9, 2024

It seems to me that the ic-good-opt-off tests are flaky.

@edespino Ed, appreciate you help. ic-good-opt-on tests from Ed branch are green.

Thanks for flagging this. Since these flaky test failures involve the orca query optimizer behavior, we should first have an engineer familiar with the orca query optimizer review the test issues. Once they can determine whether this is a product issue or an environmental/infrastructure problem, I can assist with any infrastructure-related fixes needed. Please loop me back in after that assessment.

@leborchuk
Copy link
Contributor Author

leborchuk commented Dec 16, 2024

I see here only failed

test uao_compaction/stats         ... FAILED      333 ms (diff  123 ms)
test uao_compaction/index_stats   ... FAILED      575 ms (diff  139 ms)
test uao_compaction/index         ... ok          161 ms (diff   59 ms)
test uao_compaction/drop_column   ... FAILED      165 ms (diff  122 ms)

tests in Apache Cloudberry Build / ic-good-opt-off (pull_request) workflow run

I'm sure they will succeed if I run them again. (I cannot do it myself; I do not have the rights to launch the tests one more time.)

I opened issue to fix it - #789

@jiaqizho
Copy link
Contributor

rebase pls. ic-good-opt-on already added in CI

@jiaqizho
Copy link
Contributor

CREATE SEQUENCE complex_seq CACHE 1;
CREATE TABLE complex_ttbl (id INT4 DEFAULT NEXTVAL('complex_seq'), c COMPLEX) DISTRIBUTED BY (c);

INSERT INTO complex_ttbl(c) VALUES (COMPLEX(NEXTVAL('complex_seq'), NEXTVAL('complex_seq')));
INSERT INTO complex_ttbl(c) VALUES (COMPLEX(NEXTVAL('complex_seq'), NEXTVAL('complex_seq')));
INSERT INTO complex_ttbl(c) VALUES (COMPLEX(NEXTVAL('complex_seq'), NEXTVAL('complex_seq')));
INSERT INTO complex_ttbl(c) VALUES (COMPLEX(NEXTVAL('complex_seq'), NEXTVAL('complex_seq')));
INSERT INTO complex_ttbl(c) VALUES (COMPLEX(NEXTVAL('complex_seq'), NEXTVAL('complex_seq')));
INSERT INTO complex_ttbl(c) VALUES (COMPLEX(NEXTVAL('complex_seq'), NEXTVAL('complex_seq')));

INSERT INTO complex_ttbl(c) VALUES (COMPLEX(0, 0));
INSERT INTO complex_ttbl(c) VALUES (COMPLEX(0, -0));
INSERT INTO complex_ttbl(c) VALUES (COMPLEX(-0, 0));
INSERT INTO complex_ttbl(c) VALUES (COMPLEX(-0, -0));
SELECT COUNT(c) = 4 AS tr, COUNT(DISTINCT gp_segment_id) = 1 AS tr 
	FROM complex_ttbl 
	WHERE re(c) = 0 AND im(c) = 0;

hi @leborchuk, check this case pls, i think executor missing some implements..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick cherry-pick upstream commts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants