Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(c/driver/postgresql): Implement Foreign Key information for GetObjects #757

Merged
merged 2 commits into from
Jun 12, 2023

Conversation

WillAyd
Copy link
Contributor

@WillAyd WillAyd commented Jun 9, 2023

No description provided.

@@ -402,6 +402,193 @@ TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsPrimaryKey) {
ASSERT_TRUE(seen_primary_key) << "could not find primary key for adbc_pkey_test";
}

TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsForeignKey) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially tried to combine this with the primary key test, but eventually moved away doing that as this was easier to express with its own set of tables. Still a good deal of shared elements that could be shared between the tests though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could possibly factor out a helper that just drops the given table + runs the given DDL so you can ASSERT_NO_FATAL_FAILURE(CreateTable("adbc_fkey_test", "CREATE TABLE ..."));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nested loop is...messy, I have had some ideas about how to simplify this sort of thing (basically, I think Arrow could use an IDL like Protobuf, where you can take a predefined schema and generate higher-level accessors/iterators/validators)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also thought about whether some complicated tests could belong in Python instead (with the caveats around that development loop)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I don't like the loops. The IDL would be nice. Also been toying with some util helpers to abstract it in another branch, just hasn't fully materialized yet

if ((fk_catalog_name_str == "postgres") &&
(fk_schema_name_str == "public") &&
(fk_table_name_str == "adbc_fkey_test_base")) {
// TODO: the current implementation makes it so the length of
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be overthinking this but we have for instance 5 constraints that added right now in the test suite, of which only 2 are foreign keys. The fact that the adbc specification treats the column names of the source table and the usages as separate lists I think means we would have to ensure they are always the same length to know which source column maps to which foreign key table column. Is that the intent?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if it's a foreign key it should be 1:1 so we know what column is being referenced by each column in the foreign key constraint.

@WillAyd
Copy link
Contributor Author

WillAyd commented Jun 9, 2023

I don't think the R failure is related to this PR?

@@ -402,6 +402,193 @@ TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsPrimaryKey) {
ASSERT_TRUE(seen_primary_key) << "could not find primary key for adbc_pkey_test";
}

TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsForeignKey) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could possibly factor out a helper that just drops the given table + runs the given DDL so you can ASSERT_NO_FATAL_FAILURE(CreateTable("adbc_fkey_test", "CREATE TABLE ..."));

@@ -402,6 +402,193 @@ TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsPrimaryKey) {
ASSERT_TRUE(seen_primary_key) << "could not find primary key for adbc_pkey_test";
}

TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsForeignKey) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nested loop is...messy, I have had some ideas about how to simplify this sort of thing (basically, I think Arrow could use an IDL like Protobuf, where you can take a predefined schema and generate higher-level accessors/iterators/validators)

@@ -402,6 +402,193 @@ TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsPrimaryKey) {
ASSERT_TRUE(seen_primary_key) << "could not find primary key for adbc_pkey_test";
}

TEST_F(PostgresConnectionTest, GetObjectsGetAllFindsForeignKey) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also thought about whether some complicated tests could belong in Python instead (with the caveats around that development loop)

@lidavidm
Copy link
Member

I think the R failure is because the upstream nanoarrow package must've gotten out of sync

@lidavidm
Copy link
Member

I kicked the job but if it doesn't pass I don't think we have to consider it a blocker (Dewey has been working on the R stuff anyways)

@lidavidm
Copy link
Member

Alright, it passes. Do you want to add any changes here, or defer those for later PRs?

@WillAyd
Copy link
Contributor Author

WillAyd commented Jun 12, 2023

I think defer any changes to subsequent PRs; not pushing those off forever just think it would be too much to solve alongside implementing this

@lidavidm lidavidm merged commit eabc5b7 into apache:main Jun 12, 2023
@lidavidm lidavidm added this to the ADBC Libraries 0.5.0 milestone Jun 12, 2023
@WillAyd WillAyd deleted the get-cols-foreign-key branch June 12, 2023 16:15
@lidavidm
Copy link
Member

Frankly given #320 and #621 I'm wondering if we shouldn't retire GetObjects in favor of non-nested alternatives. The original design perhaps favored DuckDB and its ability to query nested data too much.

@lidavidm
Copy link
Member

(...though it would be very unfortunate to have duplicate APIs for this.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants