-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: add kwargs that are deserializable by user defined structs #29
Conversation
pyo3-polars-derive/src/lib.rs
Outdated
@@ -12,14 +12,21 @@ fn create_expression_function(ast: syn::ItemFn) -> proc_macro2::TokenStream { | |||
use pyo3_polars::export::*; | |||
// create the outer public function | |||
#[no_mangle] | |||
pub unsafe extern "C" fn #fn_name (e: *mut polars_ffi::SeriesExport, len: usize) -> polars_ffi::SeriesExport { | |||
pub unsafe extern "C" fn #fn_name (e: *mut polars_ffi::SeriesExport, len: usize, kwargs: *const std::os::raw::c_char) -> polars_ffi::SeriesExport { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in, the intent is to json- (or whatever-else-) serialize it on the python side and unwrap it back here, or something along the lines? If so, that's a bit cheesy for sure 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the idea. And then users can work the JsonValue
.
Any idea of something more ergonomic? 👀?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think with floats in particular that's a problem waiting to happen; and with polars being a numeric library that may be VERY undesirable...
Like use pass X.000000000 and get Y.99999999999 due to json serialization/deserialization and then spend days debugging it. In some cases that may not matter, but in some cases it may change the output.
If we go this way (i.e. serialization/deserialization using some standard protocol), then why do we have to use text protocol JSON at all, for internal purposes? Can we use any binary serde-compatible serialization protocol, that can pack args from within Python? (and btw is the serialization meant to be done in pure Python, or in Rust in py-polars?) Something like msgpack? Need to brush up on the formats.
Another thought... why not allow the 'receiver' side (that is, the user Rust code for the plugin) to simply serde-deserialize the kwarg struct from, say, raw bytes instead of poking manually at those fields and types? This would also alleviate the need for manual error handling during kwargs deserialization etc, and would make it feel more statically typed and rusty overall.
(The only question is then how to serialize those args on the 'sender' side, where exactly does the serialization take place and which protocol to use).
I.e. wouldn't it be nice if you could
// 'receiver' side
#[derive(Deserialize)]
struct Kwargs<'a> {
float_arg: f64,
string_arg: &'a str,
}
let args = serde_something::from_bytes(&args_payload)?;
// or even better, to hide the format that's being used internally
let args: Kwargs = args.parse()?; // assuming there's some nice wrapper etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point.
We serialize the kwargs on the python side as we are not sure what kwargs users will use. I am not entirely sure which algorithm we can use in such a case. There is a serde_pickle we can try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Save from the lifetime on the kwargs string, this worked like a charm! Thanks for keeping me honest here ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 @ritchie46 Looks great! Re: ergonomics, wonder if Option<>
has to always be there around kwargs? Because.. if it's DefaultKwargs
then an empty dict is valid kwargs anyways, just like in Python. And if you require your own kwargs you can always add an Option<>
yourself if needed (to either the whole type or some/all of the fields)?
(disclaimer, I haven't read the rust part of this in full because I couldn't find the correct branch 😆)
I.e., wouldn't it be more ergonomic for both the case when you don't need the kwargs (so you don't have to write _kwargs: Option<DefaultKwargs>
) and when you require kwargs (so you don't have to .unwrap()
the Option) to support both
fn foo(inputs: &[Series]) // don't need _kwargs: Option<...>
fn bar(inputs: &[Series], kwargs: Kwargs) // no unwrap, all checked prior to this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can do that as well. Need to do more proc-macro inspection for that.
No description provided.