Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: add kwargs that are deserializable by user defined structs #29

Merged
merged 9 commits into from
Oct 17, 2023

Conversation

ritchie46
Copy link
Member

No description provided.

@@ -12,14 +12,21 @@ fn create_expression_function(ast: syn::ItemFn) -> proc_macro2::TokenStream {
use pyo3_polars::export::*;
// create the outer public function
#[no_mangle]
pub unsafe extern "C" fn #fn_name (e: *mut polars_ffi::SeriesExport, len: usize) -> polars_ffi::SeriesExport {
pub unsafe extern "C" fn #fn_name (e: *mut polars_ffi::SeriesExport, len: usize, kwargs: *const std::os::raw::c_char) -> polars_ffi::SeriesExport {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in, the intent is to json- (or whatever-else-) serialize it on the python side and unwrap it back here, or something along the lines? If so, that's a bit cheesy for sure 😆

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the idea. And then users can work the JsonValue.

Any idea of something more ergonomic? 👀?

Copy link

@aldanor aldanor Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think with floats in particular that's a problem waiting to happen; and with polars being a numeric library that may be VERY undesirable...

Like use pass X.000000000 and get Y.99999999999 due to json serialization/deserialization and then spend days debugging it. In some cases that may not matter, but in some cases it may change the output.

If we go this way (i.e. serialization/deserialization using some standard protocol), then why do we have to use text protocol JSON at all, for internal purposes? Can we use any binary serde-compatible serialization protocol, that can pack args from within Python? (and btw is the serialization meant to be done in pure Python, or in Rust in py-polars?) Something like msgpack? Need to brush up on the formats.


Another thought... why not allow the 'receiver' side (that is, the user Rust code for the plugin) to simply serde-deserialize the kwarg struct from, say, raw bytes instead of poking manually at those fields and types? This would also alleviate the need for manual error handling during kwargs deserialization etc, and would make it feel more statically typed and rusty overall.

(The only question is then how to serialize those args on the 'sender' side, where exactly does the serialization take place and which protocol to use).

I.e. wouldn't it be nice if you could

// 'receiver' side
#[derive(Deserialize)]
struct Kwargs<'a> {
   float_arg: f64,
   string_arg: &'a str,  
}

let args = serde_something::from_bytes(&args_payload)?;

// or even better, to hide the format that's being used internally
let args: Kwargs = args.parse()?; // assuming there's some nice wrapper etc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point.

We serialize the kwargs on the python side as we are not sure what kwargs users will use. I am not entirely sure which algorithm we can use in such a case. There is a serde_pickle we can try.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Save from the lifetime on the kwargs string, this worked like a charm! Thanks for keeping me honest here ;)

Copy link

@aldanor aldanor Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 @ritchie46 Looks great! Re: ergonomics, wonder if Option<> has to always be there around kwargs? Because.. if it's DefaultKwargs then an empty dict is valid kwargs anyways, just like in Python. And if you require your own kwargs you can always add an Option<> yourself if needed (to either the whole type or some/all of the fields)?

(disclaimer, I haven't read the rust part of this in full because I couldn't find the correct branch 😆)

I.e., wouldn't it be more ergonomic for both the case when you don't need the kwargs (so you don't have to write _kwargs: Option<DefaultKwargs>) and when you require kwargs (so you don't have to .unwrap() the Option) to support both

fn foo(inputs: &[Series]) // don't need _kwargs: Option<...>
fn bar(inputs: &[Series], kwargs: Kwargs) // no unwrap, all checked prior to this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can do that as well. Need to do more proc-macro inspection for that.

@ritchie46 ritchie46 changed the title not sure about this feat!: add json serializable arguments Oct 6, 2023
@ritchie46 ritchie46 changed the title feat!: add json serializable arguments feat!: add kwargs that are deserializable by user defined structs Oct 16, 2023
@ritchie46 ritchie46 merged commit f3548ec into main Oct 17, 2023
2 checks passed
@ritchie46 ritchie46 deleted the more_args branch October 17, 2023 05:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants