Use FDW to query multiple servers as shards #320
Conversation
Hiiii, @oppenheimer01 welcome!🎊 Thanks for taking the effort to make our project better! 🙌 Keep making such awesome contributions!
Hi, @oppenheimer01 thanks for your contribution!
The ic-singlenode-test reports some errors.
Got it.
This commit mainly meets the needs of users who want to query multiple clusters as external shards. FDW treats each data source as a whole without knowing its internal structure, which keeps requesters and data sources properly decoupled and maintains generality.
Add a new catalog table, pg_foreign_table_seg, to enable multiple shards in a foreign table. The foreign table is treated as a sharded table with strewn locus, and each QE scanning the foreign table gets a shard from pg_foreign_table_seg.
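As a rough sketch of the idea (the column names below are assumptions for illustration, not the exact catalog definition from this PR), the new catalog maps each foreign table to its shards, one row per shard, recording which foreign server serves it:

```sql
-- Hypothetical view of pg_foreign_table_seg; column names are made up.
-- ftsrelid    : OID of the sharded foreign table
-- ftssegindex : ordinal of the shard within the foreign table
-- ftsserver   : OID of the foreign server holding this shard
SELECT ftsrelid::regclass AS foreign_table,
       ftssegindex        AS shard_no,
       ftsserver          AS server_oid
FROM   pg_foreign_table_seg
ORDER  BY ftsrelid, ftssegindex;
```

Each QE assigned to scan the foreign table then picks up one of these rows and connects only to that shard's server.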
Because the size of the computing cluster and the number of shards of the foreign table may differ, a flexible gang is used to generate the same number of scan nodes as there are foreign table shards.
Because the data bandwidth between data centers is limited, we need to reduce FDW data transfer as much as possible. Pushing execution nodes down to the remote end wherever possible reduces data transmission.
If all tables of a subtree are distributed across the same collection of foreign servers, the subtree can be pushed down. But in MPP FDW we must also know that a table only joins shards on the same foreign server, so a new system attribute, gp_foreign_server, was added to foreign tables. If the user adds "t1.gp_foreign_server = t2.gp_foreign_server" to the join condition, the join can be pushed down.
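For example (the table names are made up for illustration; only the gp_foreign_server attribute comes from this PR), a join between two sharded foreign tables becomes eligible for pushdown once the user asserts co-location via the new system attribute:

```sql
-- Without the co-location qualifier, each shard of t1 might need to be
-- joined against every shard of t2, so the join must run locally.
-- With it, the planner knows matching rows live on the same foreign
-- server, and the whole join can be shipped to each remote server.
SELECT t1.id, t1.val, t2.val
FROM   orders_fdw   t1
JOIN   lineitem_fdw t2
  ON   t1.id = t2.order_id
 AND   t1.gp_foreign_server = t2.gp_foreign_server;
```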
We can only push down the first stage of two-stage aggregation. Multi-stage aggregation uses intermediate (transition) types. Some of these are external types that can be output externally, such as those of count, min, max, and sum; for these, the intermediate and final types are identical. Others are more complex internal types: for avg, the intermediate type differs from the final type and must be converted with a final function. Since the local node in FDW acts as a standard client exchanging data with the remote server, these internal types cannot be transmitted, so aggregate functions such as avg are not pushed down for now.
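A hedged illustration of the distinction (the table and column are hypothetical, and the rewrite is a workaround a user could apply, not necessarily what this PR's planner does):

```sql
-- count/sum have transition states that are plain bigint/numeric values,
-- so each remote server can compute a partial result and the local node
-- simply combines the partials.
SELECT count(*), sum(amount) FROM orders_fdw;      -- first stage pushable

-- avg's transition state is an internal type that cannot cross the
-- client protocol, so it is not pushed down. Assuming amount is numeric,
-- an equivalent query built from shippable partials is:
SELECT sum(amount) / count(amount) AS avg_amount FROM orders_fdw;
```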
fix #ISSUE_Number
Change logs
Describe your change clearly, including what problem is being solved or what feature is being added.
If it breaks backward or forward compatibility, please clarify.
Why are the changes needed?
Describe why the changes are necessary.
Does this PR introduce any user-facing change?
If yes, please clarify the previous behavior and the change this PR proposes.
How was this patch tested?
Please detail how the changes were tested, including manual tests and any relevant unit or integration tests.
Contributor's Checklist
Here are some reminders and a checklist to go through before/when submitting your pull request; please check them:
make installcheck
make -C src/test installcheck-cbdb-parallel
Ping the cloudberrydb/dev team for review and approval when your PR is ready🥳