Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Change SchemaProvider::table to return Result<Option<..> rather than Option<..> #9307

Merged
merged 1 commit into from
Feb 27, 2024

Conversation

crepererum
Copy link
Contributor

Which issue does this PR close?

Closes #9305.

Rationale for this change

SchemaProvider::table is async and clearly allows IO:

https://github.com/apache/arrow-datafusion/blob/6fad5ed7a37c50b9c200f214c3e13b0e1f0cecbc/datafusion/core/src/catalog/schema.rs#L114-L116

However this method cannot fail. Casting errors to None is semantically wrong.

What changes are included in this PR?

Change return type from Option<Arc<dyn TableProvider>> to Result<Option<Arc<dyn TableProvider>>, DataFusionError>.

Are these changes tested?

It compiles.

Are there any user-facing changes?

Breaking: Interface change.

@github-actions github-actions bot added the core Core DataFusion crate label Feb 21, 2024
@crepererum crepererum force-pushed the crepererum/issue9305 branch 3 times, most recently from fce8dab to 6cc8fc2 Compare February 21, 2024 16:29
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm thanks @crepererum
please address the minor.

Basically I'm thinking to create a type alias for Result<Option> as such nested generics are hard to read

@crepererum
Copy link
Contributor Author

Basically I'm thinking to create a type alias for Result as such nested generics are hard to read

I agree on the readability aspects, but I also see the following counter points:

  1. The name of that alias is likely not shorter
  2. Naming a "result of an option" is hard, esp. since it should make sure it's not an "optional result". To my knowledge there is no ecosystem-wide wording for it
  3. Any alias is an indirection that the reader has to map mentally, while w/o it you can instantly see the type

@comphead
Copy link
Contributor

Basically I'm thinking to create a type alias for Result as such nested generics are hard to read

I agree on the readability aspects, but I also see the following counter points:

  1. The name of that alias is likely not shorter
  2. Naming a "result of an option" is hard, esp. since it should make sure it's not an "optional result". To my knowledge there is no ecosystem-wide wording for it
  3. Any alias is an indirection that the reader has to map mentally, while w/o it you can instantly see the type

That is also true. Maybe if we can come up with short, concise and meaningful type name

@comphead comphead added the api change Changes the API exposed to users of the crate label Feb 26, 2024
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @crepererum since it is an api change I'll wait for second reviewer.
@alamb @Jefffrey may help with the second review?

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR @crepererum and @comphead for the review.

I think returning a Result<Option<..>> is somewhat common in the DataFusion APIs so this PR makes the APIs more consistent as well as fixing the issues identified in the PR description

For example in SchemaProvider register_table and deregister_table both return Result<Option<..>>:

    fn register_table(
        &self,
        name: String,
        table: Arc<dyn TableProvider>,
    ) -> Result<Option<Arc<dyn TableProvider>>> {
...

    fn deregister_table(&self, name: &str) -> Result<Option<Arc<dyn TableProvider>>> {
...

I agree that Result<Option<Arc<dyn TableProvider>>> is somewhat 🤮 to read, but as @crepererum and @comphead have pointed out there isn't an obviously better alternative

datafusion-cli/src/catalog.rs Outdated Show resolved Hide resolved
@alamb alamb changed the title refactor: SchemaProvider::table can fail refactor: Change SchemaProvider::table to return Result<Option<..> rather than Option<..> Feb 26, 2024
@crepererum crepererum merged commit 8f3d1ef into apache:main Feb 27, 2024
23 checks passed
@crepererum crepererum deleted the crepererum/issue9305 branch February 27, 2024 10:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SchemaProvider::table should return Result
3 participants