Guarantee that heap::EMPTY.offset(0) has defined behaviour #25718
Comments
The LLVM docs can't be talking about literal |
"allocated" is just the actual terminology they use in the documentation. I have no idea what it properly entails (LLVM vagueness prevails).
(emphasis mine) via http://llvm.org/docs/LangRef.html#getelementptr-instruction |
I can't imagine heap::EMPTY is allocated by any reasonable definition, though. |
Yes, I know the terminology used by the docs, I'm just pointing out that one has to be careful to avoid conflating it with In any case, due to the zero size, I think there are definitions of allocated which work fine with |
(One way to guarantee this would be to lower |
Special-casing ZSTs covers half the use case, but it doesn't cover creating the iterator for an empty slice of non-ZSTs, e.g.
let vec: Vec<u8> = Vec::new();
for i in vec { .. }
where
let mut start = vec.ptr;
let end = start.offset(vec.len); // does this work if vec.ptr == heap::EMPTY and vec.len == 0?
while start != end {
let elem = ptr::read(start);
// ... use elem
start = start.offset(1);
} |
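For readers who want to try the loop above, here is a minimal runnable sketch of the same start/end pointer walk. It is not the standard library's iterator: the function name `sum_bytes` is hypothetical, and `wrapping_add` is swapped in for `offset` as an assumption, because it does not require the result to be in bounds of an allocation and so sidesteps the empty-slice question raised here.

```rust
use std::ptr;

/// Hypothetical helper: sums the bytes of a slice by walking raw pointers
/// from `start` to `end`, mirroring the loop sketched in the comment above.
fn sum_bytes(data: &[u8]) -> u64 {
    let mut start: *const u8 = data.as_ptr();
    // Assumption: `wrapping_add` is used instead of `offset`. It places no
    // in-bounds requirement on the pointer, so computing `end` is fine even
    // when `data` is empty and `start` is a dangling placeholder.
    let end = start.wrapping_add(data.len());
    let mut total = 0u64;
    while start != end {
        // SAFETY: `start` has not reached `end`, so it points at a live
        // element of `data`.
        let elem = unsafe { ptr::read(start) };
        total += u64::from(elem);
        start = start.wrapping_add(1);
    }
    total
}
```

Calling `sum_bytes(&Vec::new())` never enters the loop, since `start` and `end` compare equal immediately. The trade-off is that `wrapping_add` gives the optimizer less information than an inbounds offset, which is the cost this thread is trying to avoid.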
Adding a zero check to every offset operation seems bad. Vec should probably add a special case for zero length as done in #24604, and if you want to just mess around with possibly "invalid" pointers, there's the |
I was more hoping at codegen time we could just convince LLVM that everything's alright (or possibly someone with a better understanding could simply determine that this isn't actually a concern -- certainly LLVM doesn't seem to actually produce UB when we do this today). A check on every offset is definitely a non-starter. |
|
@bluss That discussion doesn't seem relevant to |
I talked this out with @sunfish on IRC. Evidently allocation is a very abstract concept for LLVM where it's basically "as long as you're consistent it's cool". This means that we can pretend heap::EMPTY is an allocated ZST without telling LLVM anything, and so offset-by-0 is legal since that's 1-past-the-end. However offset-by-0 off Therefore we don't need to worry about ZSTs (yay!) but do need to care about empty arrays. As such I'm closing this as basically resolved. |
Currently we have an awkward situation where heap::EMPTY (1 as *mut _) is frequently used to represent zero-sized types or other "non-allocations". It is often desirable to offset-by-0 from this in general code to avoid complicating things with special-cases for ZSTs or empty array iterators.

Unfortunately it is possible to interpret the LLVM GEP docs to determine that heap::EMPTY.offset(0) (which is marked as inbounds unconditionally) has undefined behaviour, because heap::EMPTY is not actually allocated.

I see two ways forward for this: mark heap::EMPTY as allocated in some way for LLVM, or ensure offset(0) is always safe. I favour the latter, simply because it's plausible to want to offset by 0 off some other ptrs. However the former is acceptable if this is not possible.
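To make the "non-allocation" placeholder concrete, here is a small sketch in current Rust. It rests on assumptions: heap::EMPTY was an internal constant, so `NonNull::dangling()` stands in for it here, and the helper names `empty_sentinel` and `offset_by_zero` are hypothetical. `wrapping_add(0)` is used for the zero offset because it has no in-bounds requirement, so the sketch does not depend on how the inbounds question is settled.

```rust
use std::ptr::NonNull;

/// Hypothetical stand-in for `heap::EMPTY`: a non-null, well-aligned pointer
/// that is not backed by any allocation.
fn empty_sentinel<T>() -> *mut T {
    NonNull::<T>::dangling().as_ptr()
}

/// Offsets a pointer by zero elements without asserting that it is in bounds.
/// Assumption: `wrapping_add` replaces the inbounds `offset` discussed above.
fn offset_by_zero<T>(p: *mut T) -> *mut T {
    p.wrapping_add(0)
}
```

Offsetting the placeholder by zero returns the same address, so general code can treat it like any other start pointer as long as it never dereferences it.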