fix(rust): Too-strict SQL UDF schema validation #20202
Codecov Report
All modified and coverable lines are covered by tests ✅
@@ Coverage Diff @@
## main #20202 +/- ##
==========================================
- Coverage 79.63% 79.10% -0.53%
==========================================
Files 1564 1572 +8
Lines 217804 219931 +2127
Branches 2477 2465 -12
==========================================
+ Hits 173439 173985 +546
- Misses 43796 45378 +1582
+ Partials 569 568 -1
What does this fix? This only removes checks. What is the issue at hand?
Imagine I want to define a UDF, noise, and use it in a query like this:
SELECT noise(sum(a)), noise(sum(b)) FROM data
This is not possible with the current implementation. I don't see a reason to have these checks, but I also don't have a complete understanding of the Polars codebase, so there may be ramifications I'm missing.
Option 2: change the
to:
// ++++++++++++++++++++
fn get_udf(&self, name: &str, fields: Vec<Field>) -> PolarsResult<Option<UserDefinedFunction>>;
So that the implementation could tailor the returned UserDefinedFunction to the fields it is called with. If you'd rather, I could update the PR in this direction.
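A minimal sketch of how Option 2 might look from the registry's side. It uses simplified stand-in types rather than the real Polars ones: `DataType`, `Field`, `UserDefinedFunction`, `Registry`, and `NoiseRegistry` are all hypothetical, and errors are plain `String`s instead of `PolarsResult`. Only the shape of `get_udf` follows the signature proposed above.

```rust
// Hypothetical stand-ins for the Polars types; not the real API.
#[derive(Clone, Debug, PartialEq)]
enum DataType {
    Int64,
    Float64,
    Utf8,
}

#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String,
    dtype: DataType,
}

#[derive(Clone, Debug, PartialEq)]
struct UserDefinedFunction {
    name: String,
    input_fields: Vec<Field>,
    return_type: DataType,
}

// Hypothetical trait mirroring the proposed signature: the registry
// receives the fields found at the call site.
trait Registry {
    fn get_udf(
        &self,
        name: &str,
        fields: Vec<Field>,
    ) -> Result<Option<UserDefinedFunction>, String>;
}

struct NoiseRegistry;

impl Registry for NoiseRegistry {
    fn get_udf(
        &self,
        name: &str,
        fields: Vec<Field>,
    ) -> Result<Option<UserDefinedFunction>, String> {
        if name != "noise" {
            // Unknown function name: nothing registered.
            return Ok(None);
        }
        // Validation lives in the registry, not in a pre-declared schema.
        if fields.len() != 1 {
            return Err(format!("noise expects 1 argument, got {}", fields.len()));
        }
        let dtype = fields[0].dtype.clone();
        if dtype == DataType::Utf8 {
            return Err("noise expects a numeric argument".to_string());
        }
        // Tailor the returned UDF to whatever field it was called with.
        Ok(Some(UserDefinedFunction {
            name: name.to_string(),
            input_fields: fields,
            return_type: dtype,
        }))
    }
}
```

With this shape, the same registered name can serve `noise(sum(a))` and `noise(sum(b))` because the input field is taken from the call rather than fixed at registration time.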
After thinking about it some more, I already bake schema validation logic into
c8f1bab to d277416
@ritchie46: Updated to remove
Closes #15155
Over-eager schema validation is unfortunately rendering user-defined SQL functions unusable. It's not clear why registered UDFs need to hardcode/pre-specify the names and dtypes of prospective arguments; the plugin can handle invalid calling patterns much more flexibly. Reducing the strictness also allows SQL plugins to be variadic, just like Python plugins are.
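To illustrate what "the plugin handles invalid calling patterns itself" could mean for a variadic function, here is a rough sketch in plain Rust. The function `sum_horizontal` and its use of `Vec<f64>` columns are assumptions made for illustration; a real plugin would work on Polars columns and return Polars errors.

```rust
/// Hypothetical variadic UDF body: element-wise sum over any number of
/// equally long columns. The argument checks happen inside the function
/// instead of in a schema declared at registration time.
fn sum_horizontal(columns: &[Vec<f64>]) -> Result<Vec<f64>, String> {
    // Reject a call with no arguments.
    let first = columns
        .first()
        .ok_or_else(|| "expected at least one argument".to_string())?;
    let len = first.len();

    // Reject columns whose lengths disagree, naming the offending argument.
    if let Some(pos) = columns.iter().position(|c| c.len() != len) {
        return Err(format!(
            "argument {} has length {}, expected {}",
            pos,
            columns[pos].len(),
            len
        ));
    }

    // Accumulate the element-wise sum.
    let mut out = vec![0.0; len];
    for col in columns {
        for (acc, v) in out.iter_mut().zip(col) {
            *acc += *v;
        }
    }
    Ok(out)
}
```

The function accepts one, two, or many columns, and the error message can point at the specific bad argument, which a fixed pre-registered schema cannot do.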
I wrote this PR with a different and simpler approach because the original PR from @uhlarmarek has stalled: #15159
The SQL functionality seems useful, but we can't move forward with it until we can register UDFs. Thanks for the great work as always!