Add SchemaAdapterExec #2292
Comments
Note that the schema adapter is here: https://github.com/apache/arrow-datafusion/blob/9815ac6ecc2aee7fbbafa09c704ca81b0225221e/datafusion/core/src/physical_plan/file_format/mod.rs#L205 IOx has most of the necessary code for this logic here: https://github.com/influxdata/influxdb_iox/blob/5488c257d1bbb9a9b2f6882444b9e88098e53fdc/query/src/provider/adapter.rs#L45-L80
I have some concerns about this. The problem is that this sort of assumes that we actually know at planning time what the schema for each individual file is in a |
Thank you for bringing this up. To phrase it differently and check my understanding:
I agree that there doesn't appear to be a way around this without the file operator handling the schema adaption. I will close this and update the other tickets accordingly. Thank you 👍
Yeah, exactly
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Part of #2079, related to #2170
Currently schema adaption is handled within each of the file format specific operators. As described in #2079 this has a number of drawbacks.
Describe the solution you'd like
I would like a SchemaAdapterExec that can be created with a provided Schema and a child ExecutionPlan. It would then adapt the schema of the batches returned by this inner ExecutionPlan to match the provided Schema, creating null columns as necessary. This can likely reuse the existing SchemaAdapter.
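To make the intended per-batch behaviour concrete, here is a minimal, std-only sketch of the adaptation step. It is not the DataFusion or IOx implementation and it does not use Arrow types. `Batch` and `adapt_batch` are hypothetical names introduced only for illustration, and every column is assumed to be a nullable 64-bit integer to keep the example short.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a record batch:
/// named columns of nullable i64 values, all of length `num_rows`.
#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub columns: Vec<(String, Vec<Option<i64>>)>,
    pub num_rows: usize,
}

/// Returns a batch whose columns follow `target_fields` in order.
/// Columns present in `batch` are copied; columns missing from it
/// are created as all-null columns of the same row count.
/// Columns in `batch` that are not in `target_fields` are dropped.
pub fn adapt_batch(target_fields: &[&str], batch: &Batch) -> Batch {
    // Index the incoming columns by name for lookup.
    let by_name: HashMap<&str, &Vec<Option<i64>>> = batch
        .columns
        .iter()
        .map(|(name, values)| (name.as_str(), values))
        .collect();

    let columns = target_fields
        .iter()
        .map(|field| {
            let values = match by_name.get(*field) {
                Some(values) => values.to_vec(),
                // Missing in this batch: fill with nulls.
                None => vec![None; batch.num_rows],
            };
            (field.to_string(), values)
        })
        .collect();

    Batch {
        columns,
        num_rows: batch.num_rows,
    }
}
```

In this sketch, a batch that only contains column b, adapted to the target fields a and b, comes back with an all-null a followed by the original b. A real operator would apply the same mapping to each batch produced by the child plan.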
FYI @matthewmturner @thinkharderdev