Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: add ExecutionPlan::file_scan_config to avoid downcasting #7175

Merged

Conversation

not-my-profile
Copy link
Contributor

I want to feature-gate AvroExec in a followup commit, so that you cannot get bitten by AvroExec::execute returning an NotImplemented error if the "avro" feature isn't enabled. (Since idiomatic Rust code should work if it compiles.)

As a preparation for that this commit gets rid of a plan_any.downcast_ref::<AvroExec>() call that couldn't be easily put behind a cfg(feature = "avro") without complicating the control flow.

Are there any user-facing changes?

Yes this PR adds the file_scan_config method to the ExecutionPlan trait.

@github-actions github-actions bot added the core Core DataFusion crate label Aug 2, 2023
Copy link
Contributor

@Dandandan Dandandan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a nice improvement to me

I want to feature-gate AvroExec in a followup commit, so
that you cannot get bitten by AvroExec::execute returning
an NotImplemented error if the "avro" feature isn't enabled.
(Since idiomatic Rust code should work if it compiles.)

As a preparation for that this commit gets rid of a
`plan_any.downcast_ref::<AvroExec>()` call that couldn't be easily put
behind a `cfg(feature = "avro")` without complicating the control flow.
@not-my-profile not-my-profile force-pushed the refactor-ExecutionPlan.file_scan_config branch from 6686b6c to 34d4e6c Compare August 2, 2023 11:46
@not-my-profile
Copy link
Contributor Author

not-my-profile commented Aug 2, 2023

I just slightly improved the doc comment for the new trait method:

-   /// Returns the [`FileScanConfig`] in case this is a physical execution plan or `None` otherwise.
+   /// Returns the [`FileScanConfig`] in case this is a data source scanning execution plan or `None` otherwise.


collector.push(file_groups);
Ok(VisitRecursion::Skip)
if let Some(file_scan_config) = plan.file_scan_config() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that certainly looks nicer

@alamb alamb merged commit 6a2d4a3 into apache:main Aug 3, 2023
21 checks passed
@alamb
Copy link
Contributor

alamb commented Aug 18, 2023

🤔 it seems like this has now introduced another dependency between datafusion and the physical plans (I am trying to cut out the physical plans into their own crate in #1754 )

I will futz with this and see if I can get an API that achieves both

@not-my-profile
Copy link
Contributor Author

not-my-profile commented Aug 18, 2023

Thanks for bringing this up here. I disagree that this PR introduced any new dependency ... it just made the existing dependency explicit. Since I think you should of course be able to implement your own data source scanning execution plan (which previously you could not because of the hard-coded downcasting in the physical plan).

So I think the solution here is to move the new file_scan_config method to a new DataSource trait (not so sure about the name) that execution plans have to implement in order to use them with the physical plan. What do you think? (take what I'm saying with a grain of salt, I'm not that familiar with datafusion).

@alamb
Copy link
Contributor

alamb commented Aug 21, 2023

So I think the solution here is to move the new file_scan_config method to a new DataSource trait (not so sure about the name) that execution plans have to implement in order to use them with the physical plan. What do you think? (take what I'm saying with a grain of salt, I'm not that familiar with datafusion).

I think that is a great idea, and I will create a PR shortly for discussion

@alamb
Copy link
Contributor

alamb commented Aug 21, 2023

Thanks - I filed #7357 to track the work / continue the conversation so it wasn't lost on this merged PR. Let's continue the conversation there

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants