Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(format): add info codes for supported capabilities #1649

Merged
merged 12 commits into from
Apr 12, 2024

Conversation

joellubi
Copy link
Member

@joellubi joellubi commented Mar 20, 2024

Add the following info_codes to the C, Go, and Java ABDC definitions:

  • ADBC_INFO_VENDOR_SQL
  • ADBC_INFO_VENDOR_SUBSTRAIT
  • ADBC_INFO_VENDOR_SUBSTRAIT_MIN_VERSION
  • ADBC_INFO_VENDOR_SUBSTRAIT_MAX_VERSION

Also:

  • Add support for new info_codes to flightsql and snowflake drivers
  • Refactor driverbase to simplify driverinfo and make it more extensible

@github-actions github-actions bot added this to the ADBC Libraries 0.11.0 milestone Mar 20, 2024
@lidavidm
Copy link
Member

since it's technically a format change we would need to go through a vote

@lidavidm
Copy link
Member

cancellation is always best-effort anyways

@joellubi
Copy link
Member Author

Yeah I was going to post this on the mailing list initially but figured it would be easier to review/discuss with a link to the specific changes.

Does this seem like a reasonable change to open for discussion there?

@lidavidm
Copy link
Member

Yes, there's a number of things people want (https://github.com/apache/arrow-adbc/milestone/8) that need people to drive the changes

@joellubi
Copy link
Member Author

Ok got it. I added an issue #1650 that can be added to that milestone. I'm currently working on an implementation for this that I will push to this PR and open for discussion in the mailing list.

@lidavidm
Copy link
Member

Thanks!

@joellubi
Copy link
Member Author

@lidavidm Do you have any thoughts on what the name prefix for these should be? The current precedent is either ADBC_INFO_VENDOR_ or ADBC_INFO_DRIVER_. I don't think either of those really apply here, so perhaps just ADBC_INFO_ (e.g. ADBC_INFO_READ_ONLY). Should we put that in its own group starting at 200?

@lidavidm
Copy link
Member

I think vendor applies?

@joellubi
Copy link
Member Author

Works for me. I thought that in some cases SQL/Substrait may be supported by the vendor but not (yet) the driver implementation, but maybe just returning error unimplemented is better than trying to communicate via driver_info.

I'll add the java definitions and open it up to the mailing list to discuss.

@joellubi joellubi force-pushed the supported-format-info-codes branch from cd08afe to 197a470 Compare April 1, 2024 19:31
Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late comment, I was giving this a final look

adbc.h Outdated
/// \brief Indicates whether the driver connection is read only (type: bool).
///
/// \see AdbcConnectionGetInfo
#define ADBC_INFO_VENDOR_READ_ONLY 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Should this be an actual option (GetOption/SetOption), since it in principle is mutable in other APIs? e.g. JDBC Connection#setReadOnly. Flight SQL of course doesn't let you set this, but it may apply to other things in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I actually don't need this particular option, it just seemed useful while I was adding the others. So perhaps it's a better idea to leave it out (and maybe add it to options.h). I can update this tomorrow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed this change. It turns out that adbc.h already has ADBC_CONNECTION_OPTION_READ_ONLY defined, so I didn't need to add anything to options.h.

Thanks for the review, it would've been a mistake to make this an info code.

Copy link

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pushing this along @joellubi

/// \see AdbcConnectionGetInfo
#define ADBC_INFO_VENDOR_SUBSTRAIT 4
/// \brief The minimum supported Substrait version, or null if
/// Substrait is not supported (type: utf8).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would using a type other than utf8 for a version make sense?

The ADBC version appears to have a numeric encoding:

/// \since ADBC API revision 1.1.0
#define ADBC_VERSION_1_1_0 1001000

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a numeric encoding would have some benefits, but in this case utf8 is most consistent with existing encodings of substrait version.

For context the version field of SubstraitPlan is a string in FlightSql.proto. In substrait itself Version is a message that can either use numeric major.minor.patch versions or simply be a git_hash, so string may be required.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving a string makes sense to me then 👍

@@ -459,6 +459,24 @@ const struct AdbcError* AdbcErrorFromArrayStream(struct ArrowArrayStream* stream
///
/// \see AdbcConnectionGetInfo
#define ADBC_INFO_VENDOR_ARROW_VERSION 2
/// \brief Indicates whether SQL queries are supported (type: bool).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also make a new value for ADBC_VERSION ? 1.2.0 for example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some discussion about this in #1650. What do you think about version 1.1.1?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean for the version returned in GetInfo, or also making a new constant that would be recognized by the driver manager and such?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I'm looking through the code now for references to the constant ADBC_VERSION_1_1_0, it does seem like a non-trivial amount of work to update the logic everywhere to account for ADBC_VERSION_1_1_1. Especially for such a minor and backward-compatible change, it doesn't seem worth it.

Perhaps we could just do a comment / documentation update here to denote the minor version bump? Also, a refactor of ADBC version handling might be helpful going forward if it would simplify the logic for various feature gates that would allow us to bump versions more seamlessly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We basically treat the version as an ABI version. Adding constants doesn't change the API, so there's no need to bump.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or at least, that's my justification for not having to deal with that mess here ^_^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think considering the ADBC version to be an ABI version makes sense. If a driver implementation were to adopt this "version" it would not require any changes in behavior, as there's no actual requirement to send or recognize the new constants.

I think it may actually be more confusing to describe this as a new version, since there are no discernible differences in behavior between 1.1.0 and 1.1.1.

Any objections to moving forward with this PR as it currently is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with that.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me too -- sorry for not responding sooner. Thank you @joellubi and @lidavidm

@joellubi joellubi marked this pull request as ready for review April 11, 2024 17:44
@joellubi joellubi requested a review from alamb April 11, 2024 17:45
@joellubi joellubi merged commit aa04aad into apache:main Apr 12, 2024
68 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants