Document Schema metadata expectations #12736
Comments
I believe @wiedld plans to work on this
take
I will take a shot at documenting this
#13305 (comment) has some additional context
I'm not sure if this is related or not but I encountered an error during optimization:
I think the join operator returns the metadata from either its first or its second input, so an optimization that swaps the join order changes the join operator's output schema, and this causes the optimizer to bail.
That definitely sounds like a bug
Is your feature request related to a problem or challenge?
There is an (implicit) assumption that metadata attached to Schema is preserved during certain operations in DataFusion.
However, this expectation is clearly not well tested or documented (e.g. see #12733)
Describe the solution you'd like
I would like the assumptions documented
Describe alternatives you've considered
I suggest adding documentation in https://docs.rs/datafusion/latest/datafusion/logical_expr/enum.LogicalPlan.html that explains the high level assumptions
Then add a note / link to that section from the optimizers:
https://docs.rs/datafusion/latest/datafusion/optimizer/trait.AnalyzerRule.html
https://docs.rs/datafusion/latest/datafusion/optimizer/trait.OptimizerRule.html
https://docs.rs/datafusion/latest/datafusion/physical_optimizer/trait.PhysicalOptimizerRule.html
My understanding of the high level assumptions is:
Examples
PROJECT(a, b+c) --> field metadata on a should be preserved, no field metadata on b+c
SUM(a) .. GROUP BY b --> field metadata on b is preserved, not on a
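To make the two examples above concrete, here is a minimal sketch of the rule they imply: an output column that is a plain column reference keeps the input field's metadata, while a computed expression gets none. The `Field`, `Expr`, and `output_schema` names below are hypothetical stand-ins written for illustration. They are not DataFusion's or Arrow's actual types, and the sketch only models the expectation described in this issue, not how DataFusion implements it.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for illustration only; NOT DataFusion/Arrow types.
type Metadata = BTreeMap<String, String>;

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    metadata: Metadata,
}

enum Expr {
    Column(String),
    Add(Box<Expr>, Box<Expr>),
    Sum(Box<Expr>),
}

// Name of the output column produced by an expression.
fn display_name(expr: &Expr) -> String {
    match expr {
        Expr::Column(name) => name.clone(),
        Expr::Add(l, r) => format!("{}+{}", display_name(l), display_name(r)),
        Expr::Sum(inner) => format!("SUM({})", display_name(inner)),
    }
}

// Assumed rule: a bare column reference carries its input field's metadata;
// any computed expression (b+c, SUM(a)) starts with empty metadata.
fn output_field(expr: &Expr, input: &[Field]) -> Field {
    let metadata = match expr {
        Expr::Column(name) => input
            .iter()
            .find(|f| &f.name == name)
            .map(|f| f.metadata.clone())
            .unwrap_or_default(),
        Expr::Add(..) | Expr::Sum(..) => Metadata::new(),
    };
    Field { name: display_name(expr), metadata }
}

// Output fields for a list of expressions (projection list, or group
// expressions followed by aggregate expressions).
fn output_schema(exprs: &[Expr], input: &[Field]) -> Vec<Field> {
    exprs.iter().map(|e| output_field(e, input)).collect()
}

// Helper that builds a field with a single "tag" metadata entry.
fn field(name: &str, tag: &str) -> Field {
    let mut metadata = Metadata::new();
    metadata.insert("tag".to_string(), tag.to_string());
    Field { name: name.to_string(), metadata }
}

fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn main() {
    let input = vec![field("a", "meta-a"), field("b", "meta-b"), field("c", "meta-c")];

    // PROJECT(a, b+c)
    let project = [col("a"), Expr::Add(Box::new(col("b")), Box::new(col("c")))];
    for f in output_schema(&project, &input) {
        println!("{} -> {:?}", f.name, f.metadata);
    }

    // SUM(a) .. GROUP BY b: group column first, then the aggregate.
    let aggregate = [col("b"), Expr::Sum(Box::new(col("a")))];
    for f in output_schema(&aggregate, &input) {
        println!("{} -> {:?}", f.name, f.metadata);
    }
}
```

In this model the metadata decision depends only on the expression's shape, so a rewrite that keeps the same output expressions cannot change the output fields' metadata. That is the kind of invariant the requested documentation would spell out for optimizer rules.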
Additional context
No response