-
Notifications
You must be signed in to change notification settings - Fork 675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Null terminated slices for exec #358
base: master
Are you sure you want to change the base?
Conversation
☔ The latest upstream changes (presumably #357) made this pull request unmergeable. Please resolve the merge conflicts. |
037c46f
to
ffe8d74
Compare
ffe8d74
to
14c0ad2
Compare
Made some changes to accomodate for the notes:
Still waiting on the string story to be figured out before determining the exact conversion traits to be used here instead of |
☔ The latest upstream changes (presumably #416) made this pull request unmergeable. Please resolve the merge conflicts. |
@arcnmx Would you still be interested in merging this? Given how long this has sit I'd like to either merge this or close this PR. |
I am, as it is nice to be able to allocate the memory for exec ahead of time. Considering that this exposes a new type, I'm still looking for some feedback. |
@arcnmx I'm not too familiar with the A question that does come up here as well is if there are any other APIs in So it's hard to gauge how "good" of an API this is because I only see the low-level code modifications and there are no tests or examples to gauge how this compares to the old/current way of doing things. That being said, this does seem on the right track for the "ideal" API |
Regarding including information in the commit, agreed. The new types and functions certainly could use documentation before actually merging this. Buf if you're unfamiliar with exec, there are a few things to be aware of here:
Essentially, this PR exposes this "null terminated string array" as a type so that it can be created and allocated beforehand if needed. The side goal here is to still allow impromptu calls to |
@arcnmx Thanks for the additional background, though that was already my takeaway from this. I think the disconnect here is that everything looks fine to me on this, I don't see a downside to this. Removing allocations is always great assuming it doesn't make the API too difficult to use, and I don't have any perspective on what the resultant API looks like here, nor am I aware how the old API was used, without seeing actual example code. Hopefully that makes more sense. And I do think it's worth moving forward with this based on what I'm seeing, and once an example is written up for this I can be 100% on my evaluation of the API this PR proposes. |
Actually, based on your comments, it's actually unsafe the way it's implemented now, because our current API forces a |
Well, hm, again a few main points:
If you're using |
I agree with all your points, I just used the word "unsafe" rather loosely here; what I really meant to say is that |
Alright! So then the question then is, do you see anything wrong with the current design and API/type proposed in this PR? On a high level or even just bikeshedding on type/function names. Documentation is definitely a point that should be addressed. |
Sure, here we go!
|
|
Let's keep it internal then, but make sure it's very modular so it'd be easy to rip out should we find one or should it be useful to others.
Is there a reason that we should have an owned version versus making people just create arrays and slices from them? Is there enough need to resize this array that it's worth the whole extra type?
Then lets do |
The type exists for the same reasons why
Sounds good. |
I don't know if you needed or wanted feedback to that last comment, but I think we should just keep going with what you have now with |
14c0ad2
to
9506b74
Compare
Rebased and addressed a few of the items discussed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks like a pretty nice interface. I would ask that since you're touching these functions that you update the cfgs to match our one-per-line style and improve doc comments for the exec functions you modified.
Also, I'm on mobile, so I can't see the rest of the file, but do we have examples or tests for these functions? It's easiest to tell how good the API is by having compiled examples in our doc comments.
src/lib.rs
Outdated
|
||
impl<T> TerminatedSlice<T> { | ||
/// Instantiate a `TerminatedSlice` from a slice ending in `None`. Returns | ||
/// `None` if the provided slice is not properly terminated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this return a Result
? This seems like you're using Optipn
to return an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can create an empty error type for this.
src/lib.rs
Outdated
} | ||
|
||
/// Owned variant of `TerminatedSlice`. | ||
pub struct TerminatedVec<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should link to TerminatedSlicr if we're only going to put the documentation on that type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, should probably duplicate it anyway. Linking to the type is good too though, what's the syntax for that look like?
src/lib.rs
Outdated
impl<T> TerminatedVec<T> { | ||
/// Instantiates a `TerminatedVec` from a `None` terminated `Vec`. Returns | ||
/// `None` if the provided `Vec` is not properly terminated. | ||
pub fn from_vec(vec: Vec<Option<T>>) -> Option<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this should return a Result type
} | ||
} | ||
|
||
impl<T> AsRef<TerminatedSlice<T>> for TerminatedVec<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to put this impl up with the Slice implementation code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean by this. I tend to follow a type definition with the traits that it impls or provides/enables. You mean this should be moved before the definition of TerminatedVec
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, disregard this comment.
src/lib.rs
Outdated
} | ||
} | ||
|
||
impl<'a, T: 'a> IntoRef<'a, TerminatedSlice<&'a T>> for &'a TerminatedSlice<&'a T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to put this and the followng impl up with the Slice implementation code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I see this one should probably be moved up to follow the TerminatedSlice
definition, although the ones above and below it are more related to TerminatedVec
.
unsafe { | ||
libc::execv(path.as_ptr(), args_p.as_ptr()) | ||
}; | ||
try!(path.with_nix_path(|cstr| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the question mark operator instead of try!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing! Note that this is inconsistent with the rest of the file, should I just convert them all while I'm at it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to convert any code outside of what you're creating/modifying.
These functions are currently tested in Any comment on impl'ing |
Misc notes:
|
} | ||
|
||
/// Coercion of `CStr` iterators into an argument that can be passed to `exec`. | ||
impl<'a, T: AsRef<CStr> + 'a, I: IntoIterator<Item=T>> IntoRef<'a, TerminatedSlice<&'a c_char>> for I { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something that makes sense to have globally-defined? Or can this be integrated into the function definition instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This impl is pretty central to allowing exec to accept either:
&TerminatedSlice
(usually actually aTerminatedVec
created byterminate()
)&[CStr]
and similar (specifically anyIntoIterator
whereItem: AsRef<CStr>
)
... particularly the latter, which enables the old behaviour of allocate-at-call-site. One slightly less global approach might be to move the IntoRef
trait and its impls into the unistd
module and possibly give it a name that isn't as generic sounding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option would be to simply remove the old style and require the use of TerminatedVec::terminate_cstr
any time exec is called, whether the Vec is being preallocated or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a big fan of reducing options and improving consistency. This is why I've been asking for code examples, as I'm not familiar with how people use this API regularly. I want to make sure we know how people use it regularly and our API is easy for the 90% case and possible for the last 10% case. Or something along those lines.
All the examples I see involve a hard-coded array. And in those situations, I think a term_vec!()
macro makes a lot of sense; this would look like a regular vec!
call and avoid all of the wrapping in Option
. But I don't know how people build up these arrays programmatically or if that's ever done in practice. Is that a common operation? I'd think that is handled nicely by the API provided here, but I don't know exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, examples are going to use a hard-coded array because they're just examples. Real world usage varies of course, and can be hard-coded in simple cases, or generated from a config file or argv like in the use case that I originally wrote these changes for.
Most uses for exec I can imagine would use data coming from a dynamic source rather than being hard-coded. The convenient no-preallocation variant mostly comes into play when using exec in a program that doesn't fork and just intends to replace the whole process - I'm not sure how common that is, I guess if you're writing a tool like env(1) or something. A single-threaded program that does fork applies too, although that's not an assumption one should make if writing a library that doesn't know what kind of program might be using it.
} | ||
} | ||
|
||
impl<T> AsRef<TerminatedSlice<T>> for TerminatedVec<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, disregard this comment.
unsafe { | ||
libc::execv(path.as_ptr(), args_p.as_ptr()) | ||
}; | ||
try!(path.with_nix_path(|cstr| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to convert any code outside of what you're creating/modifying.
|
||
/* | ||
* | ||
* ===== Null terminated slices for exec ===== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this comment header.
* | ||
*/ | ||
|
||
use std::ops::{Deref, DerefMut}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move this use
statements to the top of the file.
|
||
/// An error returned from the [`TerminatedSlice::from_slice`] family of | ||
/// functions when the provided data is not terminated by `None`. | ||
#[derive(Clone, PartialEq, Eq, Debug)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not Copy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, missed that.
|
||
/// An error returned from [`TerminatedVec::from_vec`] when the provided data is | ||
/// not terminated by `None`. | ||
#[derive(Clone, PartialEq, Eq)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing Copy
and Debug
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug is defined below this (it's manually implemented because we want to impl it regardless of whether T: Debug
or not), and Copy
cannot be derived as it contains a Vec
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, sorry I didn't read that closely enough!
@@ -191,7 +192,7 @@ fn test_initgroups() { | |||
} | |||
|
|||
macro_rules! execve_test_factory( | |||
($test_name:ident, $syscall:ident, $exe: expr $(, $pathname:expr, $flags:expr)*) => ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're modifying this macro, can you create documentation for it so it's known how it's called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea how to begin to do this honestly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You modified the macro, is there no documentation you can offer her to help others at least understand your change to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a convoluted test generator macro that became more complex with this change (because there are now separate actions to argument initialization vs coercion for use in exec). I can describe some of the intent but it's not going to make all that much sense without reading the whole macro and seeing what goes where.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything you know would be a great improvement over what's there now! I'll probably be the one going in there and trying to improve the documentation, so anything you could do to help me out would be appreciated! You at least understand this macro somewhat and I wouldn't think a short blurb and an explanation of your variant would be too difficult to write.
See https://github.com/nix-rust/nix/blob/master/src/unistd.rs#L888
We have things running in a single thread right now I believe because we limit the test harness to a single process. We'd like to move to tests being run in parallel eventually, so writing thread-safe tests should be attempted at least. As for the API, we should ideally have examples of both use cases, one where it was preallocated and one where it was not.
I think we're going to need to figure out the whole
I saw that, I'll give it an extra look through. |
/// An error returned from the [`TerminatedSlice::from_slice`] family of | ||
/// functions when the provided data is not terminated by `None`. | ||
#[derive(Clone, PartialEq, Eq, Debug)] | ||
pub struct NotTerminatedError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this another enum variant within NixError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like unnecessary bloat to NixError
for an error case specific to a single function call that no one will ever actually use or call in 99% of scenarios.
Also chances are if someone does use it, they'll already have created the data to be terminated and actually want the unsafe
variant anyway (or if you're avoiding unsafe then failure of the call will be treated as an internal assertion failure that's just abort
/unwrap()
/unreachable!()
). I can't really imagine a scenario where this error would ever be propagated up a method chain or actually used with try!()
, which is partially why the original implementation simply used an Option
- using an error type is neater, but also kind of overkill?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd make the argument that it's less bloat than all the code implementing a new error type, but I understand your point.
As to why this should be an Error
type is because the semantics behind that better match what's happening rather than Option
. If there is a failure, it should be an error. Option
s aren't used to denote failure. There is a grey area where it can go either way, but I think this is a clear case for Error.
So let's go ahead and leave this error as is. A goal of mine is to unify the error handling within nix
to use the new failure
crate, so this will likely get cleaned up along with that change.
@@ -191,7 +192,7 @@ fn test_initgroups() { | |||
} | |||
|
|||
macro_rules! execve_test_factory( | |||
($test_name:ident, $syscall:ident, $exe: expr $(, $pathname:expr, $flags:expr)*) => ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything you know would be a great improvement over what's there now! I'll probably be the one going in there and trying to improve the documentation, so anything you could do to help me out would be appreciated! You at least understand this macro somewhat and I wouldn't think a short blurb and an explanation of your variant would be too difficult to write.
The goal here is to make allocations optional when passing arguments to the
exec
family of functions. AVec<CString>
works (and will allocate behind the scenes) while a&[Option<&c_char>]
will avoid them as long as you assert that it's null-terminated.Mostly posting this for feedback on the design; it needs some consideration and probably won't work on old rust versions due to
AsRef<CStr>
; probably best to wait for #230 andNixString
to be figured out for that, and also propagate conversion errors for normal string types.Notes:
FnOnce
approach was used for the interface instead of normal methods and associated types because it's the only way to use higher ranked lifetimes (well, without forcing the use of a concrete type).