feat: add coerce_arguments flag to UDTFs to allow skipping automatic …#20376
Open
evangelisilva wants to merge 3 commits into apache:main
…coercion

This allows UDTFs to handle complex arguments (like identifiers) that would otherwise fail planning when coerced against an empty schema.

Fixes apache#20293
UDTF Argument Coercion Suppression
Which issue does this PR close?
Closes #20293.
Rationale for this change
Currently, the arguments of User-Defined Table Functions (UDTFs) in DataFusion are automatically coerced and simplified before being passed to the function creator. This process runs against an empty schema (DFSchema::empty()).

If a UDTF uses arguments that contain identifiers (e.g., scan_with(index=['a', 'b'])), the simplifier fails with a Schema error: No field named index, because it attempts to resolve index as a column reference. This prevents UDTFs from implementing custom argument parsing logic that relies on identifiers or complex expressions.

What changes are included in this PR?
… false, the raw Expr arguments are passed directly to the UDTF creator without modification.

Are these changes tested?
Yes. I've added a new test module udtf_tests in datafusion/core/src/execution/session_state.rs containing:

Are there any user-facing changes?
Yes. There is a new method on the TableFunctionImpl trait. However, because it has a default implementation that returns true, it is backward compatible and will not break existing UDTF implementations. UDTF authors who need the new behavior simply need to override this method.
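To illustrate why a default implementation keeps this backward compatible, here is a minimal, self-contained sketch. It is a toy model, not DataFusion's actual code: Expr, Schema, coerce, TableFunction, plan_call, CoercedUdtf and RawUdtf are simplified, hypothetical stand-ins, and only the method name coerce_arguments and the error text are taken from this PR.

```rust
// Toy model only: simplified stand-ins, NOT DataFusion's real types or API.

/// Simplified expression: a literal or an unresolved identifier.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Literal(i64),
    Column(String),
}

/// Stand-in for a schema: just a list of field names.
pub struct Schema {
    pub fields: Vec<String>,
}

impl Schema {
    pub fn empty() -> Self {
        Schema { fields: Vec::new() }
    }
}

/// Stand-in for the coercion/simplification pass: every identifier
/// must resolve to a field of the schema, otherwise planning fails.
pub fn coerce(expr: &Expr, schema: &Schema) -> Result<Expr, String> {
    match expr {
        Expr::Column(name) if !schema.fields.contains(name) => {
            Err(format!("Schema error: No field named {}", name))
        }
        other => Ok(other.clone()),
    }
}

/// Hypothetical trait mirroring the shape described in this PR.
pub trait TableFunction {
    /// Defaults to true, so existing implementors keep the old behavior
    /// without any source change.
    fn coerce_arguments(&self) -> bool {
        true
    }

    /// Receives the arguments; here it simply reports how many it got.
    fn call(&self, args: &[Expr]) -> Result<usize, String>;
}

/// An existing UDTF that does not override the new method.
pub struct CoercedUdtf;

impl TableFunction for CoercedUdtf {
    fn call(&self, args: &[Expr]) -> Result<usize, String> {
        Ok(args.len())
    }
}

/// A UDTF that opts out of coercion to parse identifiers itself.
pub struct RawUdtf;

impl TableFunction for RawUdtf {
    fn coerce_arguments(&self) -> bool {
        false
    }

    fn call(&self, args: &[Expr]) -> Result<usize, String> {
        Ok(args.len())
    }
}

/// Stand-in for the planner step: coerce against an empty schema
/// unless the function opted out, then hand the arguments over.
pub fn plan_call(func: &dyn TableFunction, args: &[Expr]) -> Result<usize, String> {
    if func.coerce_arguments() {
        let schema = Schema::empty();
        let coerced = args
            .iter()
            .map(|arg| coerce(arg, &schema))
            .collect::<Result<Vec<_>, _>>()?;
        func.call(&coerced)
    } else {
        func.call(args)
    }
}
```

In this model, calling plan_call with an identifier argument such as Expr::Column("index") fails for CoercedUdtf with the schema error, while RawUdtf receives the same arguments untouched. Literal-only arguments behave the same for both.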
cc: @askalt @alamb