Scalar UDF support (with arrow support and function overloading) #407

Open
wants to merge 4 commits into
base: main

@matthewgapp matthewgapp commented Dec 1, 2024

This PR adds scalar UDF support via the C API, for functions that want to work with either data chunks or Arrow types. It also includes support for function overloading via the function set API.
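
For readers unfamiliar with the idea, overloading through a function set can be pictured as one name that maps to several signatures, with the call dispatched on the argument types. The following is a toy model only: `LogicalType`, `Value`, `Overload`, `FunctionSet`, and `double_set` are hypothetical names invented for illustration, and none of this is the API added by this PR or the DuckDB C API.

```rust
// Toy model of overload resolution. Every name here is hypothetical and
// exists only for illustration; it is not the duckdb-rs or DuckDB C API.

#[derive(Debug, Clone, Copy, PartialEq)]
enum LogicalType {
    Integer,
    Varchar,
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Varchar(String),
}

impl Value {
    /// Reports which logical type this value carries.
    fn logical_type(&self) -> LogicalType {
        match self {
            Value::Integer(_) => LogicalType::Integer,
            Value::Varchar(_) => LogicalType::Varchar,
        }
    }
}

/// One overload: a parameter signature plus the implementation to run.
struct Overload {
    params: Vec<LogicalType>,
    call: fn(&[Value]) -> Value,
}

/// A named group of overloads, resolved by exact match on argument types.
struct FunctionSet {
    name: &'static str,
    overloads: Vec<Overload>,
}

impl FunctionSet {
    /// Runs the first overload whose signature equals the argument types,
    /// or returns `None` when no overload matches.
    fn invoke(&self, args: &[Value]) -> Option<Value> {
        let arg_types: Vec<LogicalType> = args.iter().map(Value::logical_type).collect();
        self.overloads
            .iter()
            .find(|o| o.params == arg_types)
            .map(|o| (o.call)(args))
    }
}

fn double_int(args: &[Value]) -> Value {
    match &args[0] {
        Value::Integer(i) => Value::Integer(i * 2),
        other => other.clone(),
    }
}

fn double_str(args: &[Value]) -> Value {
    match &args[0] {
        Value::Varchar(s) => Value::Varchar(s.repeat(2)),
        other => other.clone(),
    }
}

/// Builds a set named "double" with an integer and a string overload.
fn double_set() -> FunctionSet {
    FunctionSet {
        name: "double",
        overloads: vec![
            Overload { params: vec![LogicalType::Integer], call: double_int },
            Overload { params: vec![LogicalType::Varchar], call: double_str },
        ],
    }
}
```

In this model, `double_set().invoke(&[Value::Integer(21)])` selects the integer overload, the same call with a `Value::Varchar` selects the string overload, and a call with no matching signature yields `None`. A real engine would typically also consider implicit casts during resolution, which this sketch leaves out.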

It would be ✨ lovely ✨ if the scalar (and aggregate) function API supported zero-copy Arrow FFI the way query results do. Right now we have to allocate record batches before we can use them.

It would also be nice to provide a safer way to get values from vectors based on the duck vector's datatype. This is doable without changes to the C API, but this PR was getting a little too meaty as is, so I might follow up with that improvement in a separate PR.
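
One way to picture the "safer access" idea is a typed accessor that checks the vector's declared type before handing back values, instead of trusting the caller to cast a raw buffer correctly. This is a minimal sketch under stated assumptions: `TypeId`, `Primitive`, `RawVector`, and `TypeMismatch` are hypothetical names, the storage is a plain byte buffer rather than a real DuckDB vector, and the eventual follow-up PR may look nothing like this.

```rust
// Hypothetical sketch of type-checked vector access. None of these names
// come from duckdb-rs; the byte buffer stands in for a raw engine vector.

#[derive(Debug, Clone, Copy, PartialEq)]
enum TypeId {
    Integer,
    Double,
}

/// Rust types that correspond to exactly one logical type.
trait Primitive: Sized {
    const TYPE_ID: TypeId;
    const WIDTH: usize;
    /// Decodes one value from exactly `WIDTH` little-endian bytes.
    fn decode(bytes: &[u8]) -> Self;
}

impl Primitive for i32 {
    const TYPE_ID: TypeId = TypeId::Integer;
    const WIDTH: usize = 4;
    fn decode(bytes: &[u8]) -> Self {
        let mut buf = [0u8; 4];
        buf.copy_from_slice(bytes);
        i32::from_le_bytes(buf)
    }
}

impl Primitive for f64 {
    const TYPE_ID: TypeId = TypeId::Double;
    const WIDTH: usize = 8;
    fn decode(bytes: &[u8]) -> Self {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(bytes);
        f64::from_le_bytes(buf)
    }
}

/// Returned when the requested Rust type does not match the vector's type.
#[derive(Debug, PartialEq)]
struct TypeMismatch {
    requested: TypeId,
    actual: TypeId,
}

/// Untyped storage tagged with the logical type of its contents.
struct RawVector {
    type_id: TypeId,
    bytes: Vec<u8>,
}

impl RawVector {
    /// Builds an Integer vector from a slice of `i32` values.
    fn from_i32s(values: &[i32]) -> Self {
        let mut bytes = Vec::with_capacity(values.len() * 4);
        for v in values {
            bytes.extend_from_slice(&v.to_le_bytes());
        }
        RawVector { type_id: TypeId::Integer, bytes }
    }

    /// Decodes the contents as `T`, refusing when `T` does not match the tag.
    fn values<T: Primitive>(&self) -> Result<Vec<T>, TypeMismatch> {
        if T::TYPE_ID != self.type_id {
            return Err(TypeMismatch { requested: T::TYPE_ID, actual: self.type_id });
        }
        Ok(self.bytes.chunks_exact(T::WIDTH).map(T::decode).collect())
    }
}
```

With this shape, asking an Integer vector for `values::<i32>()` succeeds, while `values::<f64>()` returns a `TypeMismatch` error rather than reinterpreting the bytes. The sketch copies values out for simplicity; a real implementation would more likely return a borrowed slice over the existing buffer.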

@matthewgapp matthewgapp force-pushed the matt/feat/scalar-udf-support branch from 7cc172d to aebc114 Compare December 1, 2024 17:32
@matthewgapp matthewgapp changed the title Scalar UDF support Scalar UDF support (with arrow support and function overloading) Dec 1, 2024
@samansmink
Collaborator

Thanks for the PR! I need to fix CI for this repo but I'm a little swamped at the moment. I will get to it ASAP

@matthewgapp
Author

> Thanks for the PR! I need to fix CI for this repo but I'm a little swamped at the moment. I will get to it ASAP

Appreciate it @samansmink, lmk if you have any questions

@matthewgapp
Author

@samansmink any chance you'll get to this soon? Thanks!

@matthewgapp
Author

matthewgapp commented Jan 11, 2025

@samansmink just following up here :)

@era127
Contributor

era127 commented Jan 12, 2025

I was able to use this PR to write a scalar UDF that accepts an array type as input. However, I ran into a limitation with Lists (not related to this PR): the ListVector type doesn't expose the entries vector of offset/vector values, so I couldn't work out how to split up the child vector.
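
To illustrate the layout in question: in DuckDB's C API a list column stores one entry per row (`duckdb_list_entry`, an offset and a length), and each entry marks a window into a single flat child vector. The sketch below shows how the child would be split once the entries are available. `ListEntry` and `split_child` are hypothetical names, `usize` stands in for the C API's `uint64_t` fields, and NULL rows (the validity mask) are ignored.

```rust
// Hypothetical sketch of splitting a list column's flat child buffer.
// `ListEntry` mirrors the shape of DuckDB's `duckdb_list_entry`; the
// names and the use of `usize` are assumptions made for illustration.

#[derive(Debug, Clone, Copy, PartialEq)]
struct ListEntry {
    offset: usize,
    length: usize,
}

/// Returns one slice of `child` per row, or `None` if any entry
/// points outside the child buffer.
fn split_child<'a, T>(entries: &[ListEntry], child: &'a [T]) -> Option<Vec<&'a [T]>> {
    entries
        .iter()
        .map(|e| {
            // Guard against overflow before computing the end of the window.
            let end = e.offset.checked_add(e.length)?;
            child.get(e.offset..end)
        })
        .collect()
}
```

For a child buffer `[10, 20, 30, 40, 50]` and entries `(0, 2)`, `(2, 0)`, `(2, 3)`, this yields the rows `[10, 20]`, `[]`, and `[30, 40, 50]`. Without access to the entries there is no way to recover those row boundaries from the child vector alone, which is the limitation described above.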
