-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support scan empty projection #7920
Conversation
very strange, |
I think that is because avro is not a default feature You probably need to run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me. Thank you.
cc @Dandandan
datafusion/core/tests/custom_sources_cases/provider_filter_pushdown.rs
Outdated
Show resolved
Hide resolved
@@ -149,10 +149,6 @@ impl OptimizerRule for PushDownProjection { | |||
{ | |||
let mut used_columns: HashSet<Column> = HashSet::new(); | |||
if projection_is_empty { | |||
let field = find_small_field(scan.projected_schema.fields()).ok_or( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can remove the [LogicalPlan::Values
] case as well (https://github.com/apache/arrow-datafusion/pull/7920/files#diff-96a0620f2e490af5e30f17cfb0b6a0070fd99656dd4957966693894c816abe0bL166-L175)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we can remove it.already fix, thanks.
FYI @ch-sc |
datafusion/core/tests/custom_sources_cases/provider_filter_pushdown.rs
Outdated
Show resolved
Hide resolved
@@ -347,13 +348,14 @@ fn build_batch( | |||
}) | |||
.collect::<Result<Vec<_>>>()?; | |||
|
|||
RecordBatch::try_new( | |||
RecordBatch::try_new_with_options( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we have more operators with empty inputs that use RecordBatch::try_new
that should move to use RecordBatch::try_new_with_options
(but are not covered in tests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have thought about this too, but I didn't come up with any more tests😥.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine to merge as is and see if we get any regressions (hopefully not), those should be easy to fix in a similar manner.
@@ -695,7 +695,7 @@ logical_plan | |||
Projection: __scalar_sq_1.COUNT(*) AS b | |||
--SubqueryAlias: __scalar_sq_1 | |||
----Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1)) AS COUNT(*)]] | |||
------TableScan: t1 projection=[t1_id] | |||
------TableScan: t1 projection=[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
…hdown.rs Co-authored-by: Daniël Heres <danielheres@gmail.com>
Thank you @haohuaijin ! |
## Rationale Close #1461 ## Detailed Changes Bump datafusion to https://github.com/CeresDB/arrow-datafusion/commits/e21b03154, which is version 33. Some important breaking changes: - apache/datafusion#7920 - apache/datafusion#9109 ## Test Plan CI --------- Co-authored-by: jiacai2050 <dev@liujiacai.net>
Which issue does this PR close?
Closes #3214
Rationale for this change
When I want to do the query like
I encounter
Arrow error: External error: Arrow error: Invalid argument error: must either specify a row count or at least one column
. I found that the error was occurring because, after reading aRecordBatch
, we will to reproject theRecordBatch
use the partition column, and it did not support an empty projection. see below.https://github.com/apache/arrow-datafusion/blob/ba50a8b178eece7e79b100d0b73bdc9d6d3ec6d5/datafusion/core/src/datasource/physical_plan/file_scan_config.rs#L342
What changes are included in this PR?
file_scan_config.rs
andcross_join.rs
to support emptyRecordBatch
push_down_projection
to support scan table with no projectionAre these changes tested?
Are there any user-facing changes?