@joroKr21
Contributor

This was done in #12922 only for math functions.
We now generalize this fallback to all scalar UDFs.

Which issue does this PR close?

Closes #12959

Rationale for this change

See #12922

What changes are included in this PR?

Are these changes tested?

Relying on existing tests.

Are there any user-facing changes?

ColumnarValue::from_args_and_result is removed again; it had not been released yet.

@github-actions bot added the physical-expr (Changes to the physical-expr crates) and functions (Changes to functions implementation) labels on Oct 16, 2024

// If the function is not volatile and all arguments are scalars,
// we can assume that returning a one-element array is equivalent to returning a scalar.
let preserve_scalar = array.len() == 1
    && self.fun.signature().volatility != Volatility::Volatile
Contributor

Why can we not do this for functions that are Volatile?

Contributor Author

A volatile function is supposed to return a different result on every invocation so I don't think it's safe to do that. The example we were discussing in the earlier PR is a hypothetical rand(upper_bound: Int64) -> Int64 function.
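
The condition under review can be modelled with simplified stand-in types. This is only a sketch: the `ColumnarValue` and `Volatility` enums below are hypothetical, integer-only reductions of DataFusion's types, and `preserve_scalar_initial` and `normalize` are illustrative names rather than functions from the PR.

```rust
/// Hypothetical, integer-only stand-in for DataFusion's `ColumnarValue`.
#[derive(Debug, Clone, PartialEq)]
enum ColumnarValue {
    Scalar(i64),
    Array(Vec<i64>),
}

/// Hypothetical, reduced stand-in for DataFusion's `Volatility`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Volatility {
    Immutable,
    Volatile,
}

/// The rule as first proposed in the diff: treat a one-element array as a
/// scalar only for non-volatile functions whose arguments are all scalars.
fn preserve_scalar_initial(
    inputs: &[ColumnarValue],
    result: &ColumnarValue,
    volatility: Volatility,
) -> bool {
    let one_element = matches!(result, ColumnarValue::Array(values) if values.len() == 1);
    one_element
        && volatility != Volatility::Volatile
        && inputs
            .iter()
            .all(|arg| matches!(arg, ColumnarValue::Scalar(_)))
}

/// Applies the fallback: unwraps the single element when the rule allows it,
/// otherwise returns the result unchanged.
fn normalize(
    inputs: &[ColumnarValue],
    result: ColumnarValue,
    volatility: Volatility,
) -> ColumnarValue {
    if preserve_scalar_initial(inputs, &result, volatility) {
        if let ColumnarValue::Array(values) = &result {
            return ColumnarValue::Scalar(values[0]);
        }
    }
    result
}
```

Under this rule, a volatile function that returns a one-element array keeps its array result even when every argument is a scalar, which is the behaviour questioned in this thread.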

Contributor

I read the discussion and I see where you're coming from. I guess it's a bit confusing since that function is currently not implementable, like you said in that thread.

Member

> I guess it's a bit confusing since that function is currently not implementable, like you said in that thread.

I think so. It's currently not implementable.

I think under the current UDF invoke model, we should preserve scalar even if it is volatile. For example, we have a UDF fn random_string(len: int) -> String, which returns a random string of length len. This function is volatile, but it should return a scalar if the len parameter is a scalar.

Contributor Author

Sure, I can make that change - was just leaning more on the safe side.
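
The `random_string` case can be sketched as follows, assuming a rule that ignores volatility and looks only at shapes. Everything here is hypothetical: `ColumnarValue` is a string-only stand-in for DataFusion's type, `XorShift` is a toy generator used to avoid external crates, and `to_scalar_if_all_scalar_inputs` is an illustrative name.

```rust
/// Hypothetical, string-only stand-in for DataFusion's `ColumnarValue`.
#[derive(Debug, Clone, PartialEq)]
enum ColumnarValue {
    Scalar(String),
    Array(Vec<String>),
}

/// Toy xorshift generator (assumption: any non-zero seed is acceptable here).
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Hypothetical volatile UDF: a random lowercase string of length `len`.
/// It returns a one-element array, as a UDF built on array kernels might.
fn random_string(len: usize, rng: &mut XorShift) -> ColumnarValue {
    let s: String = (0..len)
        .map(|_| (b'a' + (rng.next_u64() % 26) as u8) as char)
        .collect();
    ColumnarValue::Array(vec![s])
}

/// Revised rule: volatility is not consulted; a one-element array becomes a
/// scalar whenever all inputs were scalars.
fn to_scalar_if_all_scalar_inputs(
    all_inputs_scalar: bool,
    result: ColumnarValue,
) -> ColumnarValue {
    match result {
        ColumnarValue::Array(values) if all_inputs_scalar && values.len() == 1 => {
            // The guard ensures exactly one element exists.
            ColumnarValue::Scalar(values.into_iter().next().expect("one element"))
        }
        other => other,
    }
}
```

With a scalar `len`, the volatile function's single random string is handed back as a scalar, matching the expectation described in the comment above.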

    && self.fun.signature().volatility != Volatility::Volatile
    && inputs
        .iter()
        .all(|arg| matches!(arg, ColumnarValue::Scalar(_)));
Member

I think we should consider the case where the inputs are empty. UDFs without args either return a scalar directly or return an array with num_rows rows. Trying to convert the output array back to a scalar for them seems unnecessary.

Contributor Author

Nice catch 👍
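
The empty-input case matters because `Iterator::all` returns `true` for an empty iterator, so an "all arguments are scalars" check passes vacuously for a zero-argument UDF. A minimal sketch of a guard against this, using a hypothetical integer-only `ColumnarValue` and illustrative function names:

```rust
/// Hypothetical, integer-only stand-in for DataFusion's `ColumnarValue`.
#[derive(Debug, Clone, PartialEq)]
enum ColumnarValue {
    Scalar(i64),
    Array(Vec<i64>),
}

/// True when every argument is a scalar. Note that this is also true
/// for an empty argument list, because `all` on an empty iterator is true.
fn all_scalar(inputs: &[ColumnarValue]) -> bool {
    inputs
        .iter()
        .all(|arg| matches!(arg, ColumnarValue::Scalar(_)))
}

/// Rule with an explicit non-empty guard, so zero-argument UDFs are
/// never subject to the array-to-scalar conversion.
fn preserve_scalar(inputs: &[ColumnarValue], result_len: usize) -> bool {
    result_len == 1 && !inputs.is_empty() && all_scalar(inputs)
}
```

Without the `!inputs.is_empty()` term, a zero-argument UDF that returns a one-row array for a one-row batch would satisfy the scalar-arguments check despite having no arguments at all.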

@jonahgao (Member) left a comment

LGTM👍, thanks @joroKr21

@alamb alamb merged commit 0ed369e into apache:main Oct 17, 2024
24 checks passed
@alamb (Contributor) commented Oct 17, 2024

Thanks @joroKr21 @eejbyfeldt and @jonahgao

@joroKr21 joroKr21 deleted the scalar-udfs branch October 17, 2024 17:29
joroKr21 added a commit to coralogix/arrow-datafusion that referenced this pull request Oct 17, 2024
…2965)

This was done in apache#12922 only for math functions.
We now generalize this fallback to all scalar UDFs.
joroKr21 added a commit to coralogix/arrow-datafusion that referenced this pull request Oct 19, 2024
…2965) (#276)

This was done in apache#12922 only for math functions.
We now generalize this fallback to all scalar UDFs.

Successfully merging this pull request may close these issues.

Ensure that scalar functions fulfil the ColumnarValue contract
