RFC: Extend SliceBox and unify Array creation from owned data #231
The CI failure has #232 as its same root cause, however I am unsure how to fix this and still be able to store e.g. @davidhewitt Is it possible to make general statements on how often
Sorry for the slow reply. My laptop keyboard is having issues so I'm not keeping up with all discussions at the moment.
Thank you! (And sorry for my late reply... we have long New Year holidays in Japan.)
Efficiency and simplicity:
Thanks, I left some comments.
src/owner.rs
Outdated
pub(crate) struct Owner {
    ptr: *mut u8,
    len: usize,
    cap: usize,
    drop: unsafe fn(*mut u8, usize, usize),
}
Isn't `Owner` too broad a name? I would prefer `PySliceContainer`, `SliceContainer`, `SliceOwner`, or something similar.
Went back to the "container" naming, i.e. `PySliceContainer` for the type and `container: C` for the bindings.
src/owner.rs
Outdated
let cap = 0;
let drop = drop_boxed_slice::<T>;

mem::forget(data);
I would prefer `Box::into_raw` for readability. We use `mem::forget` for various purposes, but `into_raw` has only one purpose.
Me too. :-) The problem is that `Box::into_raw` will yield `*mut [T]`, i.e. a fat pointer, for the boxed slice, which does not apply to the `Vec<_>` case. Hence I manually deconstructed the boxed slice to be able to handle both types in a more or less uniform way.
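To make the fat-pointer point concrete, here is a minimal sketch, not the code from this PR. The helper names `boxed_slice_into_parts`, `vec_into_parts` and `demo` are hypothetical, and `ManuallyDrop` stands in for the `mem::forget` call used in the diff. It shows that `*mut [T]` is two words wide, while both owners can be reduced to the same thin `(ptr, len, cap)` triple:

```rust
use std::mem::{size_of, ManuallyDrop};
use std::ptr;

/// Hypothetical helper: reduce a boxed slice to a thin pointer, a length and
/// a capacity. A boxed slice has no spare capacity, so `cap` is reported as 0.
fn boxed_slice_into_parts<T>(data: Box<[T]>) -> (*mut T, usize, usize) {
    // `ManuallyDrop` keeps the allocation alive once `data` goes out of scope.
    let mut data = ManuallyDrop::new(data);
    (data.as_mut_ptr(), data.len(), 0)
}

/// Hypothetical helper: the same reduction for a `Vec<T>`.
fn vec_into_parts<T>(data: Vec<T>) -> (*mut T, usize, usize) {
    let mut data = ManuallyDrop::new(data);
    (data.as_mut_ptr(), data.len(), data.capacity())
}

/// Usage sketch: returns the width of a fat and of a thin pointer in machine
/// words, plus the contents recovered from both owners.
fn demo() -> (usize, usize, Vec<u32>, Vec<u32>) {
    let word = size_of::<usize>();
    let fat = size_of::<*mut [u32]>() / word; // what `Box::into_raw` yields
    let thin = size_of::<*mut u32>() / word;

    let (ptr, len, _cap) = boxed_slice_into_parts(vec![1u32, 2, 3].into_boxed_slice());
    // SAFETY: the parts come from a live boxed slice that was never freed.
    let boxed = unsafe { Box::from_raw(ptr::slice_from_raw_parts_mut(ptr, len)) };
    let from_box = boxed.into_vec();

    let (ptr, len, cap) = vec_into_parts(vec![4u32, 5]);
    // SAFETY: the parts come from a live `Vec<u32>` that was never freed.
    let from_vec = unsafe { Vec::from_raw_parts(ptr, len, cap) };

    (fat, thin, from_box, from_vec)
}
```

Rebuilding the original owner from the triple is what a type-specific drop function would have to do; only the boxed-slice side needs the fat pointer back.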
Ahh OK I got it.
https://doc.rust-lang.org/std/primitive.pointer.html#method.as_mut_ptr this would work but it's still unstable too 😅
Or we could use `std::slice::from_raw_parts` on the `Vec` side, but neither approach is great, so either is fine with me.
Yes, I think the current approach is a reasonable compromise as it handles both types in the same way. I did add another `FIXME` comment w.r.t. `Box::into_raw` though.
src/owner.rs
Outdated
Self {
    ptr,
    len,
    cap,
    drop,
}
Is there any reason to avoid
Self {
    ptr: data.as_ptr() as *mut u8,
    len: data.len(),
    cap: 0,
    drop: drop_boxed_slice::<T>,
}
? Readability?
The `mem::forget` call consumes `data`, which is why I had to put these into separate bindings.
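The ordering constraint can be sketched as follows, assuming a hypothetical `Parts` struct in place of the real container and a plain `Vec<u8>` as the owner. `mem::forget` takes `data` by value, so everything derived from `data` has to be read into bindings first; calling `data.len()` after the `mem::forget(data)` line would be rejected as a use after move.

```rust
use std::mem;

/// Hypothetical stand-in for the container's data fields.
struct Parts {
    ptr: *mut u8,
    len: usize,
    cap: usize,
}

fn into_parts(mut data: Vec<u8>) -> Parts {
    // Read everything that borrows `data` first ...
    let ptr = data.as_mut_ptr();
    let len = data.len();
    let cap = data.capacity();

    // ... because `mem::forget` moves `data`. It also keeps the buffer
    // alive: without it, `data` would be freed when this function returns
    // and `ptr` would dangle.
    mem::forget(data);

    Parts { ptr, len, cap }
}

/// Usage sketch: take a vector apart and put it back together.
fn demo() -> Vec<u8> {
    let parts = into_parts(vec![7, 8, 9]);
    // SAFETY: the parts come from a forgotten `Vec<u8>` and are used once.
    unsafe { Vec::from_raw_parts(parts.ptr, parts.len, parts.cap) }
}
```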
By providing type erasure for both Box<[T]> and Vec<T> we can avoid having to transform Vec<T> and Array<A, D> into boxed slices which can potentially re-allocate.
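As a rough illustration of the type erasure described above, here is a self-contained sketch without any PyO3 integration. The `SliceContainer` type and `demo` function are hypothetical; the field layout and the `drop_boxed_slice` name follow the snippets quoted in the review, while `drop_vec` and the use of `ManuallyDrop` are assumptions made for this example. The element type survives only inside the monomorphized drop function, and a `Vec<T>` keeps its buffer and capacity instead of being shrunk into a boxed slice.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Hypothetical, simplified container: the element type is erased and only
/// remembered by the `drop` function pointer.
struct SliceContainer {
    ptr: *mut u8,
    len: usize,
    cap: usize,
    drop: unsafe fn(*mut u8, usize, usize),
}

unsafe fn drop_boxed_slice<T>(ptr: *mut u8, len: usize, _cap: usize) {
    // Rebuild the fat pointer and let `Box` free the allocation.
    drop(unsafe { Box::from_raw(ptr::slice_from_raw_parts_mut(ptr as *mut T, len)) });
}

unsafe fn drop_vec<T>(ptr: *mut u8, len: usize, cap: usize) {
    drop(unsafe { Vec::from_raw_parts(ptr as *mut T, len, cap) });
}

impl<T> From<Box<[T]>> for SliceContainer {
    fn from(data: Box<[T]>) -> Self {
        let mut data = ManuallyDrop::new(data);
        Self {
            ptr: data.as_mut_ptr() as *mut u8,
            len: data.len(),
            cap: 0, // a boxed slice has no spare capacity to remember
            drop: drop_boxed_slice::<T>,
        }
    }
}

impl<T> From<Vec<T>> for SliceContainer {
    fn from(data: Vec<T>) -> Self {
        let mut data = ManuallyDrop::new(data);
        Self {
            ptr: data.as_mut_ptr() as *mut u8,
            len: data.len(),
            cap: data.capacity(),
            drop: drop_vec::<T>,
        }
    }
}

impl Drop for SliceContainer {
    fn drop(&mut self) {
        // SAFETY: the parts were produced by one of the `From` impls above
        // and are released exactly once.
        unsafe { (self.drop)(self.ptr, self.len, self.cap) }
    }
}

/// Usage sketch: reports whether the vector's buffer was reused as-is, plus
/// the stored length and capacity, and the capacity stored for a boxed slice.
fn demo() -> (bool, usize, usize, usize) {
    let mut data: Vec<u16> = Vec::with_capacity(16);
    data.extend_from_slice(&[1, 2, 3]);
    let original = data.as_ptr() as *const u8;

    let from_vec = SliceContainer::from(data);
    let same_buffer = from_vec.ptr as *const u8 == original;

    let from_box = SliceContainer::from(vec![1u8, 2].into_boxed_slice());
    (same_buffer, from_vec.len, from_vec.cap, from_box.cap)
}
```

Both containers are dropped at the end of `demo`, which runs the matching drop function for each original owner.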
Both create an array from existing data; they just differ w.r.t. how the owner on the Python heap is handled.
/// Utility type to safely store Box<[_]> or Vec<_> on the Python heap
pub(crate) struct PySliceContainer {
    ptr: *mut u8,
BTW, NumPy does not allow mutating an array that has a parent array, so this could be `*const` in theory. Let's try it if you are also curious whether it works 😄
I think owning raw pointers are `*mut` by convention.
Just noticed that I did not add a changelog entry for
@kngwyu This is a follow-up to #230 and #233 and strictly optional, but I think getting rid of the pointer manipulations in `Array: IntoPyArray` is nice and unifying the different code paths should reduce the maintenance burden.