@xudong963 (Member) commented Mar 25, 2025

Rationale for this change

When I upgraded to DataFusion 46 (df46), it was annoying to write a series of downcast_ref calls like the following, and it is also easy to call the wrong method, for example mixing up file_source() and data_source():

if let Some(scan_config) = self.data_source().as_any().downcast_ref::<FileScanConfig>() {
    if let Some(parquet_source) = scan_config
        .file_source()
        .as_any()
        .downcast_ref::<ParquetSource>()
    {...}
}

What changes are included in this PR?

Add a downcast_to_source method to DataSourceExec that performs both downcasts in a single call.
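
For illustration, here is a minimal, self-contained sketch of the pattern. It does not use the real DataFusion types: every struct and trait below is a simplified, hypothetical stand-in that only mirrors the names used in this PR, and the `batch_size` field exists purely for the demo. The helper uses the name `downcast_to_file_source` from the upgrading-doc text quoted later in this conversation; its exact signature and return shape here are assumptions, not a copy of the merged code.

```rust
use std::any::Any;
use std::sync::Arc;

// Hypothetical stand-ins for the DataFusion traits; not the real definitions.
trait DataSource {
    fn as_any(&self) -> &dyn Any;
}

trait FileSource {
    fn as_any(&self) -> &dyn Any;
}

// Stand-in for a Parquet file source. `batch_size` is a made-up field.
struct ParquetSource {
    batch_size: usize,
}

impl FileSource for ParquetSource {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Stand-in for some other file format.
struct CsvSource;

impl FileSource for CsvSource {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Stand-in for the file-scanning data source.
struct FileScanConfig {
    file_source: Arc<dyn FileSource>,
}

impl FileScanConfig {
    fn file_source(&self) -> &Arc<dyn FileSource> {
        &self.file_source
    }
}

impl DataSource for FileScanConfig {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Stand-in for a data source that does not scan files.
struct MemorySource;

impl DataSource for MemorySource {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

struct DataSourceExec {
    data_source: Arc<dyn DataSource>,
}

impl DataSourceExec {
    fn data_source(&self) -> &Arc<dyn DataSource> {
        &self.data_source
    }

    // Assumed shape of the helper: returns `None` if the data source is not a
    // `FileScanConfig`, or if its file source is not of type `T`.
    fn downcast_to_file_source<T: 'static>(&self) -> Option<(&FileScanConfig, &T)> {
        let scan_config = self
            .data_source()
            .as_any()
            .downcast_ref::<FileScanConfig>()?;
        let source = scan_config.file_source().as_any().downcast_ref::<T>()?;
        Some((scan_config, source))
    }
}

// The nested form from the rationale above.
fn parquet_batch_size_nested(exec: &DataSourceExec) -> Option<usize> {
    if let Some(scan_config) = exec.data_source().as_any().downcast_ref::<FileScanConfig>() {
        if let Some(parquet_source) = scan_config
            .file_source()
            .as_any()
            .downcast_ref::<ParquetSource>()
        {
            return Some(parquet_source.batch_size);
        }
    }
    None
}

// The same lookup through the helper.
fn parquet_batch_size(exec: &DataSourceExec) -> Option<usize> {
    exec.downcast_to_file_source::<ParquetSource>()
        .map(|(_scan_config, parquet_source)| parquet_source.batch_size)
}
```

With the helper, call sites no longer need to remember which of data_source() and file_source() comes first; both failure cases collapse into a single `None`.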

Are these changes tested?

Yes, I replaced the existing code with the new method.

Are there any user-facing changes?

It will be useful for users upgrading to DataFusion 46.

@github-actions github-actions bot added the core (Core DataFusion crate), substrait (Changes to the substrait crate), proto (Related to proto crate), and datasource (Changes to the datasource crate) labels Mar 25, 2025
@github-actions github-actions bot added the documentation (Improvements or additions to documentation) label Mar 26, 2025
@xudong963
Member Author

I added the change to the upgrading doc to let users find it easily: 4682fcd

Contributor

@alamb alamb left a comment

Thanks @xudong963 -- I agree this is a nice improvement. I don't think we should mention this function in the 46 upgrade guide, given that the function isn't available until 47.

)
}

/// Downcast the `DataSourceExec` to a specific file source
Contributor

Suggested change
- /// Downcast the `DataSourceExec` to a specific file source
+ /// Downcast the `DataSourceExec`'s `data_source` to a specific file source
+ ///
+ /// Returns `None` if
+ /// 1. the datasource is not scanning files (`FileScanConfig`)
+ /// 2. The [`FileScanConfig::file_source`] is not of type <T>

# */
```

There's also a more convenient helper method `downcast_to_file_source` on `DataSourceExec`
Contributor

This code will not be available until DataFusion 47, so we probably need to put this under a new heading for the 47 upgrade guide.

Member Author

Yes, I totally forgot that 🤦‍♂️

@github-actions github-actions bot removed the documentation (Improvements or additions to documentation) label Mar 27, 2025
@xudong963
Member Author

xudong963 commented Mar 27, 2025

@mertak-synnada @alamb Thanks for your review!

@xudong963 xudong963 merged commit fdb4e84 into apache:main Mar 27, 2025
27 checks passed
qstommyshu pushed a commit to qstommyshu/datafusion that referenced this pull request Mar 28, 2025
* Add downcast_to_source method for DataSourceExec

* rename

* fix conflicts

* fix clippy

* add the change to upgrading doc

* prettier

* remove

* address comments
@alamb alamb mentioned this pull request Apr 14, 2025
9 tasks
nirnayroy pushed a commit to nirnayroy/datafusion that referenced this pull request May 2, 2025
Labels

core (Core DataFusion crate), datasource (Changes to the datasource crate), proto (Related to proto crate), substrait (Changes to the substrait crate)
