-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🔒🌯 Provide LLVMTypeWrapper
trait, add more safe wrappers
#223
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, 4 of 4 files at r2, all commit messages.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @vadorovsky)
src/llvm/types/di.rs
line 54 at r2 (raw file):
type Target = LLVMMetadataRef; /// Constructs a new [`DIFile`] from the given `metadata`.
looks like this comment should go on the trait now.
src/llvm/types/di.rs
line 131 at r2 (raw file):
type Target = LLVMValueRef; /// Constructs a new [`DIType`] from the given `value`.
looks like this comment should go on the trait now.
src/llvm/types/di.rs
line 190 at r2 (raw file):
type Target = LLVMValueRef; /// Constructs a new [`DIDerivedType`] from the given `value`.
looks like this comment should go on the trait now.
src/llvm/types/di.rs
line 259 at r2 (raw file):
type Target = LLVMValueRef; /// Constructs a new [`DICompositeType`] from the given `value`.
looks like this comment should go on the trait now.
src/llvm/types/di.rs
line 369 at r2 (raw file):
type Target = LLVMValueRef; /// Constructs a new [`DISubprogram`] from the given `value`.
looks like this comment should go on the trait now.
src/llvm/types/ir.rs
line 214 at r2 (raw file):
type Target = LLVMValueRef; /// Constructs a new [`MDNode`] from the given `value`.
looks like this comment should go on the trait now.
src/llvm/types/ir.rs
line 221 at r2 (raw file):
/// instance of [LLVM `MDNode`](https://llvm.org/doxygen/classllvm_1_1MDNode.html). /// It's the caller's responsibility to ensure this invariant, as this /// method doesn't perform any valiation checks.
"valiation" typo throughout
src/llvm/types/ir.rs
line 243 at r2 (raw file):
/// It's the caller's responsibility to ensure this invariant, as this /// method doesn't perform any validation checks. pub(crate) unsafe fn from_metadata_ref(
could we just inline this method?
src/llvm/types/ir.rs
line 251 at r2 (raw file):
/// Constructs an empty metadata node. /// Constructs a new [`MDNode`] from the given `value`.
detritus
src/llvm/types/ir.rs
line 333 at r2 (raw file):
type Target = LLVMValueRef; /// Constructs a new [`Function`] from the given `value`.
looks like this comment should go on the trait now.
src/llvm/types/mod.rs
line 7 at r2 (raw file):
type Target: ?Sized; unsafe fn from_ptr(ptr: Self::Target) -> Self;
why is it unsafe?
Previously, tamird (Tamir Duberstein) wrote…
The most of pointer types coming from LLVM-C API (through llvm-sys) are more vague than the C++ ones. There is But good that you asked. I didn't think about it before, but there are functions like |
b7ee374
to
3f77c07
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 11 unresolved discussions (waiting on @tamird)
src/llvm/types/di.rs
line 54 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
looks like this comment should go on the trait now.
Done.
src/llvm/types/di.rs
line 131 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
looks like this comment should go on the trait now.
Done.
src/llvm/types/di.rs
line 190 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
looks like this comment should go on the trait now.
Done.
src/llvm/types/di.rs
line 259 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
looks like this comment should go on the trait now.
Done.
src/llvm/types/di.rs
line 369 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
looks like this comment should go on the trait now.
Done.
src/llvm/types/ir.rs
line 214 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
looks like this comment should go on the trait now.
Done.
src/llvm/types/ir.rs
line 221 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
"valiation" typo throughout
Done.
src/llvm/types/ir.rs
line 243 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
could we just inline this method?
Done.
src/llvm/types/ir.rs
line 251 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
detritus
Done.
src/llvm/types/ir.rs
line 333 at r2 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
looks like this comment should go on the trait now.
Done.
src/llvm/types/mod.rs
line 7 at r2 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
The most of pointer types coming from LLVM-C API (through llvm-sys) are more vague than the C++ ones.
There is
LLVMValueRef
which can be pretty much anything -Value
,Instruction
,GlobalVariable
or any of dozens of classes inheriting fromValue
. Therefore constructing a wrapper corresponding to a C++ type (e.g.Instruction
) fromLLVMValueRef
is unsafe, unless you perform a direct check.But good that you asked. I didn't think about it before, but there are functions like
IsAMDNode
orIsAInstruction
for performing such checks. Let me check if they cover all the types. If yes, I could make it method safe and just make sure we have checks everywhere. If no, I would either insist on keeping this unsafe or provide a separate method/trait for types for which the check functions exist.
Done, I made the method safe and I added checks in each implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r3, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @vadorovsky)
src/llvm/mod.rs
line 3 at r3 (raw file):
mod di; mod iter; pub mod types;
what does it mean for this to be pub?
src/llvm/types/mod.rs
line 9 at r3 (raw file):
pub enum LLVMTypeError { #[error("invalid pointer type, expected {0}")] InvalidPointerType(&'static str),
nit: maybe we can use TypeId here? https://doc.rust-lang.org/stable/std/any/struct.TypeId.html
EDIT: seems like no because there are sometimes multiple admissible types, but i think we do want the unexpected type to be shown
src/llvm/types/mod.rs
line 11 at r3 (raw file):
InvalidPointerType(&'static str), #[error("null pointer")] NullPointer,
it would be neat to lift the null check to the caller using NonNull, but to do it would be a bit convoluted, you'd need something like
trait Pointer {
type PointedTo;
}
impl<T> Pointer for *mut T {
type PointedTo = T;
}
and then
trait LLVMTypeWrapper {
type Target: Pointer;
fn from_ptr(ptr: std::ptr::NonNull<Self::Target::PointedTo>) -> ...
}
unclear if it's worth it, but i'll let you decide
src/llvm/types/mod.rs
line 15 at r3 (raw file):
pub trait LLVMTypeWrapper { type Target: ?Sized;
how can target be unsized? aren't we always returning a pointer which is word-sized?
src/llvm/types/mod.rs
line 20 at r3 (raw file):
fn from_ptr(ptr: Self::Target) -> Result<Self, LLVMTypeError> where Self: Sized;
is this bound required? I'm surprised
src/linker.rs
line 81 at r3 (raw file):
MissingBitcodeSection(PathBuf), /// Interaction with an LLVM type failed.
"interaction"?
src/llvm/types/di.rs
line 158 at r3 (raw file):
_marker: PhantomData, }), _ => Err(LLVMTypeError::InvalidPointerType("DIType")),
seems like it would be nice to emit the unexpected type
src/llvm/types/di.rs
line 236 at r3 (raw file):
unsafe { let value = LLVMGetOperand(self.value_ref, DIDerivedTypeOperand::BaseType as u32); // PANICS: We are sure that the pointer type is correct. There is
these comments can easily be expect
s, or perhaps a helper function with a descriptive name so we don't need 6 copies of this comment.
src/llvm/types/ir.rs
line 86 at r3 (raw file):
if unsafe { !LLVMIsAMDNode(value_ref).is_null() } { let mdnode = MDNode::from_ptr(value_ref)?; return Ok(Value::MDNode(mdnode));
super nit: drop all the return
s and use else
for Other
src/llvm/types/ir.rs
line 240 at r3 (raw file):
) -> Result<Self, LLVMTypeError> { let value_ref = unsafe { LLVMMetadataAsValue(context, metadata) }; MDNode::from_ptr(value_ref)
isn't this the kind of place where you have chosen to panic?
src/llvm/types/ir.rs
line 359 at r3 (raw file):
let subprogram = unsafe { LLVMGetSubprogram(self.value_ref) }; NonNull::new(subprogram).map(|_| unsafe { // PANICS: We are sure that the pointer type is correct. There is
this API is weird, we're treating nullptr as None here while it's normally an error. This again goes to pushing null checks to the caller.
src/llvm/di.rs
line 340 at r3 (raw file):
for function in self.module.functions_iter().map(Function::from_ptr) { let mut function = function?;
instead of this, we should push the LLVMTypeWrapper::from_ptr call into the llvm_iterator!
macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @tamird)
src/llvm/mod.rs
line 3 at r3 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
what does it mean for this to be pub?
I did that to be able to import use crate::llvm::types::LLVMTypeError
outside of llvm module.
src/llvm/types/ir.rs
line 240 at r3 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
isn't this the kind of place where you have chosen to panic?
Not really. This from_metadata_ref
method takes any LLVMMetadataRef
pointer and converts it into our MDNode
wrapper. It can result in an error in case caller provides a null pointer or some other subtype of Metadata
which is not MDNode
.
I'm fine with panicking in situations when some method is creating an LLVMValueRef
or LLVMMetadataRef
internally (e.g. by getting an operand of the current item) and we are absolutely sure nothing can go wrong (unless LLVM itself has a major bug - let's say, getting operands doesn't work as expected or the type hierarchy is broken).
src/llvm/types/ir.rs
line 359 at r3 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
this API is weird, we're treating nullptr as None here while it's normally an error. This again goes to pushing null checks to the caller.
In LLVM-C API, a null pointer might mean three things:
- the object you requested doesn't exist, but your request was still valid and that lack of existence is fine (the existence of the object is optional) - this is the case here
- the object doesn't exist and it's not fine, the lack of existence breaks the logic
- the caller screwed up and passed something invalid - that's the case in the other places where I'm returning
Result
I think our Rust wrappers should differentiate between first case (where using Option
is appropriate) and the latter cases.
In case of LLVMGetSubprogram
, which ends up calling Function::getSubprogram
from LLVM C++ API, having a nullptr
result is valid, because this code:
uses cast_or_null
. cast_if_present
and cast_or_null
mean "convert this type if not null, otherwise return null" and is generally used in places which return optional objects.
src/llvm/types/mod.rs
line 11 at r3 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
it would be neat to lift the null check to the caller using NonNull, but to do it would be a bit convoluted, you'd need something like
trait Pointer { type PointedTo; } impl<T> Pointer for *mut T { type PointedTo = T; }and then
trait LLVMTypeWrapper { type Target: Pointer; fn from_ptr(ptr: std::ptr::NonNull<Self::Target::PointedTo>) -> ... }unclear if it's worth it, but i'll let you decide
Let me try that, requiring NonNull
sounds great from the point of view of handling null pointers correctly.
BTW, my long-term intention is to wrap absolutely every type and remove the direct from_ptr
usage, especially in src/linker/di.rs
.
src/llvm/types/mod.rs
line 15 at r3 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
how can target be unsized? aren't we always returning a pointer which is word-sized?
You're right, we should require Sized
here.
src/llvm/types/mod.rs
line 20 at r3 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
is this bound required? I'm surprised
Yeah, removing that bound results in this error:
Compiling bpf-linker v0.9.13 (/home/vadorovsky/src/bpf-linker)
error[E0277]: the size for values of type `Self` cannot be known at compilation time
--> src/llvm/types/mod.rs:18:39
|
18 | fn from_ptr(ptr: Self::Target) -> Result<Self, LLVMTypeError>;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
|
note: required by an implicit `Sized` bound in `Result`
--> /home/vadorovsky/.rustup/toolchains/nightly-x86_64-unknown-linux-musl/lib/rustlib/src/rust/library/core/src/result.rs:528:17
|
528 | pub enum Result<T, E> {
| ^ required by the implicit `Sized` requirement on this type parameter in `Result`
help: consider further restricting `Self`
|
18 | fn from_ptr(ptr: Self::Target) -> Result<Self, LLVMTypeError> where Self: Sized;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @tamird)
src/llvm/types/ir.rs
line 359 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
In LLVM-C API, a null pointer might mean three things:
- the object you requested doesn't exist, but your request was still valid and that lack of existence is fine (the existence of the object is optional) - this is the case here
- the object doesn't exist and it's not fine, the lack of existence breaks the logic
- the caller screwed up and passed something invalid - that's the case in the other places where I'm returning
Result
I think our Rust wrappers should differentiate between first case (where using
Option
is appropriate) and the latter cases.In case of
LLVMGetSubprogram
, which ends up callingFunction::getSubprogram
from LLVM C++ API, having anullptr
result is valid, because this code:uses
cast_or_null
.cast_if_present
andcast_or_null
mean "convert this type if not null, otherwise return null" and is generally used in places which return optional objects.
To put it in easier words - LLVMGetSubprogram
/Function::getSubprogram
tolerates the case when the given function has no subprogram.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @tamird)
src/llvm/types/ir.rs
line 240 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
Not really. This
from_metadata_ref
method takes anyLLVMMetadataRef
pointer and converts it into ourMDNode
wrapper. It can result in an error in case caller provides a null pointer or some other subtype ofMetadata
which is notMDNode
.I'm fine with panicking in situations when some method is creating an
LLVMValueRef
orLLVMMetadataRef
internally (e.g. by getting an operand of the current item) and we are absolutely sure nothing can go wrong (unless LLVM itself has a major bug - let's say, getting operands doesn't work as expected or the type hierarchy is broken).
To put it in easier words:
- I'm panicking when I'm sure that caller's input can't result in getting a
nullptr
. - I'm returning the error when there is no such guarantee.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @vadorovsky)
src/llvm/mod.rs
line 3 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
I did that to be able to import
use crate::llvm::types::LLVMTypeError
outside of llvm module.
maybe pub use llvm::types::LLVMTypeError
would be better?
src/llvm/types/ir.rs
line 240 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
To put it in easier words:
- I'm panicking when I'm sure that caller's input can't result in getting a
nullptr
.- I'm returning the error when there is no such guarantee.
Thanks for explaining. I thought there was the guarantee here, but I understand now that there isn't.
src/llvm/types/ir.rs
line 359 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
To put it in easier words -
LLVMGetSubprogram
/Function::getSubprogram
tolerates the case when the given function has no subprogram.
Yeah, that's fine, I was just pointing out that the responsibility for checking for null is bleeding across the boundary. Hopefully my other suggestion will clean all this up.
src/llvm/types/mod.rs
line 11 at r3 (raw file):
BTW, my long-term intention is to wrap absolutely every type and remove the direct
from_ptr
usage, especially insrc/linker/di.rs
.
Sounds great to me.
src/llvm/types/mod.rs
line 20 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
Yeah, removing that bound results in this error:
Compiling bpf-linker v0.9.13 (/home/vadorovsky/src/bpf-linker) error[E0277]: the size for values of type `Self` cannot be known at compilation time --> src/llvm/types/mod.rs:18:39 | 18 | fn from_ptr(ptr: Self::Target) -> Result<Self, LLVMTypeError>; | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time | note: required by an implicit `Sized` bound in `Result` --> /home/vadorovsky/.rustup/toolchains/nightly-x86_64-unknown-linux-musl/lib/rustlib/src/rust/library/core/src/result.rs:528:17 | 528 | pub enum Result<T, E> { | ^ required by the implicit `Sized` requirement on this type parameter in `Result` help: consider further restricting `Self` | 18 | fn from_ptr(ptr: Self::Target) -> Result<Self, LLVMTypeError> where Self: Sized;
Thanks for checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @tamird)
src/llvm/types/mod.rs
line 9 at r3 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
nit: maybe we can use TypeId here? https://doc.rust-lang.org/stable/std/any/struct.TypeId.html
EDIT: seems like no because there are sometimes multiple admissible types, but i think we do want the unexpected type to be shown
What's a bit tricky (both about using TypeId and showing the unexpected type) is that we don't cover all IR/DI types from LLVM. We define only the ones we need.
I'm open for ideas, but for now the only one coming to my mind is having two type variants, maybe something like
pub enum LLVMTypeError {
#[error("invalid pointer type, expected {0}, got {0}")]
InvalidPointerType(TypeId, TypeId),
#[error("unknown pointer type, expected {0}")]
UnknownPointerType(TypeId),
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @vadorovsky)
src/llvm/types/mod.rs
line 9 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
What's a bit tricky (both about using TypeId and showing the unexpected type) is that we don't cover all IR/DI types from LLVM. We define only the ones we need.
I'm open for ideas, but for now the only one coming to my mind is having two type variants, maybe something like
pub enum LLVMTypeError { #[error("invalid pointer type, expected {0}, got {0}")] InvalidPointerType(TypeId, TypeId), #[error("unknown pointer type, expected {0}")] UnknownPointerType(TypeId), }
You could write
pub struct LLVMTypeError {
got: TypeId,
want: Vec,
}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @tamird)
src/llvm/di.rs
line 340 at r3 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
instead of this, we should push the LLVMTypeWrapper::from_ptr call into the
llvm_iterator!
macro.
I love the idea, just - I would need to port all iterators to use LLVMTypeWrapper. Some types (like
src/llvm/types/mod.rs
line 11 at r3 (raw file):
Previously, tamird (Tamir Duberstein) wrote…
BTW, my long-term intention is to wrap absolutely every type and remove the direct
from_ptr
usage, especially insrc/linker/di.rs
.Sounds great to me.
Actually, does it have to be as complicated as in your snippet?
This code compiles:
pub trait LLVMTypeWrapper {
type Target: Sized;
/// Constructs a new [`Self`] from the given pointer `ptr`.
fn from_ptr(ptr: NonNull<Self::Target>) -> Result<Self, LLVMTypeError>
where
Self: Sized;
fn as_ptr(&self) -> Self::Target;
}
And I think it's correct. It takes NonNull
during the wrapper initialization. It returns just a raw pointer when requested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @tamird)
src/llvm/types/mod.rs
line 11 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
Actually, does it have to be as complicated as in your snippet?
This code compiles:
pub trait LLVMTypeWrapper { type Target: Sized; /// Constructs a new [`Self`] from the given pointer `ptr`. fn from_ptr(ptr: NonNull<Self::Target>) -> Result<Self, LLVMTypeError> where Self: Sized; fn as_ptr(&self) -> Self::Target; }And I think it's correct. It takes
NonNull
during the wrapper initialization. It returns just a raw pointer when requested.
This trick should work if I use the non-ref types from llvm-sys. So basically, instead of using LLVMValueRef
(which is an alias to *mut LLVMValue
), I could define type Target = LLVMValue
, which would result in NonNull<LLVMValue>
And ofc as_ptr
would need to return *mut Self::Target
, sorry for the mistake in my snipper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @vadorovsky)
src/llvm/di.rs
line 340 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
I love the idea, just - I would need to port all iterators to use LLVMTypeWrapper. Some types (like
Comment cut off?
src/llvm/types/mod.rs
line 11 at r3 (raw file):
Previously, vadorovsky (Michal Rostecki) wrote…
This trick should work if I use the non-ref types from llvm-sys. So basically, instead of using
LLVMValueRef
(which is an alias to*mut LLVMValue
), I could definetype Target = LLVMValue
, which would result inNonNull<LLVMValue>
And ofc
as_ptr
would need to return*mut Self::Target
, sorry for the mistake in my snipper.
I think in reality you can never get your hands on a non-ref type. See LLVMValue
which is an uninhabitable type.
But if you can, great. The only reason for my complicated snippet is indeed to get at the pointed-to type from the pointer type.
LLVMTypeWrapper
traitLLVMTypeWrapper
trait, add more safe wrappers
ff12197
to
c7c20c3
Compare
Instead of keeping raw pointer fields (of types like `LLVMValueRef`, `LLVMMetadataRef`) public and defining constructors with different names, provide the `LLVMTypeWrapper` trait with `from_ptr()` and `as_ptr()` methods. This will allow to convert all safe wrappers from and to raw pointers with one method, which is going to be helpful for building macros for them. On top of that, add more wrappers: * `Module` * `BasicBlock` * `GlobalAlias` * `GlobalVariable` * `Argument` Use these wrappers in iterators, make sure they don't expose the raw pointers to the callers.
Instead of keeping raw pointer fields (of types like
LLVMValueRef
,LLVMMetadataRef
) public and defining constructors with different names, provide theLLVMTypeWrapper
trait withfrom_ptr()
as_ptr()
method for all of them.This will allow to convert all safe wrappers from and to raw pointers with one method, which is going to be helpful for building macros for them.
Depends on #222
This change is