-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document explicitly that Weak::from_raw(ptr::null()) is ub #114525
Conversation
r? @cuviper (rustbot has picked a reviewer for you, use r? to override) |
To me, the existing documentation seems pretty clear about the fact that you are only allowed to call I worry that drawing attention to specific implementation details might end up being more confusing to the reader, by muddying the message of what is and isn't allowed. I suppose it could still make sense to point out that |
Yeah, the null pointer is just one of many pointers that you are not allowed to use, so it's somewhat confusing to put special emphasis on it. |
Right, the my concern was just if
I have put special emphasis on null pointers, because the internally used type is
I've considered changing the docs to explicitly allow the result of |
It can take in a null pointer if Just because |
Right -- if |
That second sentence is more or less what this PR is adding to the docs, to make this explicit. I read the current docs, but they were confusing enough for me to open #114517.
Yeah for particular values like 0xadfe it makes less sense, maybe someone thinks they can initialize some
Yes. Clarifying that situation is what this PR is about. |
library/alloc/src/rc.rs
Outdated
@@ -2818,6 +2818,8 @@ impl<T: ?Sized> Weak<T> { | |||
impl<T: ?Sized, A: Allocator> Weak<T, A> { | |||
/// Returns a raw pointer to the object `T` pointed to by this `Weak<T>`. | |||
/// | |||
/// # Safety |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is strange to see a "safety" comment on a safe function
I find the docs with your change more confusing than without. The new paragraph is a big hypothetical and hard to follow. Maybe try expressing what the underlying problem is in a different way? I understand you were confused by the current docs, but I think more people will be confused by your new docs than by the current docs, so this is not an improvement. The current docs explicitly state "must still own its potential weak reference", which already means "you can only call this exactly as often as |
1a32f3e
to
92e8760
Compare
Good point. I'd added it to add more structure and make it obvious that this is relevant for unsafe things people might do with it. That was my understanding of the
Thanks for the feedback, I have pushed a new version that I think is better, do you like it?
That is also something that's hard to understand tbh, because there is two things here at play: the thing the reference points to, and also the weak reference itself, which is an abstract concept encoded by the |
library/alloc/src/rc.rs
Outdated
/// functions, not what these functions can maybe return, or are documented to potentially return. | ||
/// For example, [`into_raw`] can return dangling pointers, but this doesn't allow you to create | ||
/// a dangling pointer yourself and pass it to `from_raw`. This rule holds even if that pointer | ||
/// created via some other means has the same value than some output you have observed from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it has entirely the same value then you can use it. The subtle thing is that "same value" on pointers means "same address and same provenance".
What is that last sentence supposed to defend against?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first was to explain that even though into_raw
might return some dangling pointer, including possibly the null pointer, from_raw
does not allow you to create an arbitrary dangling pointer. The second was to explain that even if the address is the same (or whatever else goes into the PartialEq
/Hash
impls on pointers), it's still not allowed. I don't know how provenance is defined, and if the docs still say that provenance's definition is vague, then I think explaining the impact here in clear words is helpful. I thought provenance was only a concept for valid pointers for example, didn't think that different invocations of ptr::invalid_mut()
with the same address also yielded different provenances, instead of just having one and the same "INVALID" provenance (whereas proper allocations each create their own provenance).
Also, even if provenance allows people to use ptr::invalid_mut(usize::MAX)
, which is equivalent to Weak::into_inner(Weak::new())
, we still want to forbid that, because we reserve changing the implementation in the future, e.g. to use null pointers instead of usize::MAX
. That's at least how I understand the current documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second was to explain that even if the address is the same (or whatever else goes into the PartialEq/Hash impls on pointers), it's still not allowed.
But the first already says that. It sounds like you are saying the same thing 3 different times using different words, and I still find that more confusing than helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I'll remove the repetition, but I'll keep the part with Weak::new
as @cuviper likes it.
library/alloc/src/rc.rs
Outdated
/// Note that `from_raw` expects values that actually originated from a call to one of these | ||
/// functions, not what these functions can maybe return, or are documented to potentially return. | ||
/// For example, [`into_raw`] can return dangling pointers, but this doesn't allow you to create | ||
/// a dangling pointer yourself and pass it to `from_raw`. This rule holds even if that pointer | ||
/// created via some other means has the same value than some output you have observed from | ||
/// [`into_raw`]: use [`Weak::new`] instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Note that `from_raw` expects values that actually originated from a call to one of these | |
/// functions, not what these functions can maybe return, or are documented to potentially return. | |
/// For example, [`into_raw`] can return dangling pointers, but this doesn't allow you to create | |
/// a dangling pointer yourself and pass it to `from_raw`. This rule holds even if that pointer | |
/// created via some other means has the same value than some output you have observed from | |
/// [`into_raw`]: use [`Weak::new`] instead. | |
/// Note that `from_raw` expects values that actually originated from a call to one of these | |
/// functions and have not been used with `from_raw` yet, | |
/// not what these functions can maybe return, or are documented to potentially return. | |
/// For example, [`into_raw`] can return dangling pointers, but this doesn't allow you to create | |
/// a dangling pointer yourself and pass it to `from_raw` |
library/alloc/src/rc.rs
Outdated
/// Note that `from_raw` expects values that actually originated from a call to one of these | ||
/// functions, not what these functions can maybe return, or are documented to potentially return. | ||
/// For example, [`into_raw`] can return dangling pointers, but this doesn't allow you to create | ||
/// a dangling pointer yourself and pass it to `from_raw`. This rule holds even if that pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is better, thanks.
I am still somewhat concerned though, do we have to write a whole paragraph explaining the meaning of "must have originated from" every single time we use that phrase? Is that phrase really so unclear? Should we use another phrase that is more clear? Will people read this, then see "must have originated from" elsewhere where we do not have a whole paragraph about this notion, and conclude "they had a clarification there, but not here, hence they obviously mean something else by 'must have originated' because otherwise they would have added the same clarification"?
There s a real danger with having too many clarifications, which is that people think they describe a special case rather than clarifying a general principle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have to write a whole paragraph explaining the meaning of "must have originated from" every single time we use that phrase?
The obvious solution to that problem is to link the phrase to a centralized location (e.g. the rust reference) giving the explanation of what it means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I guess that would be the alternative here.
With my suggestion above applied, I can live with this PR, though I am on the fence about whether it truly is an improvement -- I'm happy to leave that to the libs-api reviewer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still somewhat concerned though, do we have to write a whole paragraph explaining the meaning of "must have originated from" every single time we use that phrase?
The main problem here is that as_ptr
(and into_inner
linking to it) says that it can potentially return null or invalid pointers. We say this to be future proof, not because we return null pointers, but this creates this potential misunderstanding that I'm trying to guard against.
With valid allocations, it makes sense that you require the pointer to come from a specific function, to prevent the malloc + delete problem. But with invalid ones it seems weird to do that.
One can very easily assume that this was written only keeping the valid pointers in mind, but not the invalid ones. Often even specifications leave something unspecified and then implementers interpret it differently.
Of course there is other misunderstandings too, and yeah eventually you run into the situation where you are creating more misunderstandings than fixing, but I think this PR is improving the situation.
There s a real danger with having too many clarifications, which is that people think they describe a special case rather than clarifying a general principle.
Good point, that's what the word "Note" at the start tries to accomplish.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the Weak::new
reference is good, but we could pare down to mostly just that suggestion. If you're trying to second-guess the pointer origin, this is probably what you actually need.
This comment has been minimized.
This comment has been minimized.
@cuviper do you like the new version? |
/// a dangling pointer yourself and pass it to `from_raw`. Even if it has the same address | ||
/// as a pointer created by [`into_raw`], use [`Weak::new`] instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// a dangling pointer yourself and pass it to `from_raw`. Even if it has the same address | |
/// as a pointer created by [`into_raw`], use [`Weak::new`] instead. | |
/// a dangling pointer yourself and pass it to `from_raw`, even if it has the same address | |
/// as a pointer created by [`into_raw`]. Use [`Weak::new`] instead. |
/// return, or are documented to potentially return. | ||
/// For example, [`into_raw`] can return dangling pointers, but this doesn't allow you to create | ||
/// a dangling pointer yourself and pass it to `from_raw`. Even if it has the same address | ||
/// as a pointer created by [`into_raw`], use [`Weak::new`] instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same diff as above
/// or is documented to potentially return. | ||
/// For example, [`into_raw`] can return dangling pointers, but this doesn't allow you to create | ||
/// a dangling pointer yourself and pass it to `from_raw`. Even if it has the same address | ||
/// as a pointer created by [`into_raw`], use [`Weak::new`] instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and yet another time
@rustbot author |
yes sorry I didn't get to them, I will try to take a look soon. |
@est31 any updates on this? thanks |
The PR is a year old and I didn't get to it. I think it's time to close it. Should I find the motivation to work on it again, I'll open a new PR. |
Fixes #114517 . I was confused for a moment, as the docs were very terse on the question of whether
Weak::from_raw(ptr::null())
is ub or not. It is UB, as the implementation usesNonNull
, and this is intended, butas_ptr
andinto_raw
andinto_raw_and_alloc
still can return null pointers in the future.