Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functions core::ptr::dangling/_mut<T>() -> *const/*mut T #45527

Closed
wants to merge 3 commits into from

Conversation

bluss
Copy link
Member

@bluss bluss commented Oct 25, 2017

Add functions to create a dangling aligned non-null raw pointer. The
use case is to fix all the places we use "1 as *mut T", in
particular in the slice iterators.

The aligned pointer has a value like 0x1, 0x4, 0x8 etc. depending on the
alignment of the pointed-to type. It is useful when a non-null pointer
is required like in references, slices, boxes, and other places.

NOTE: This changes the observable behaviour of slice iterators.
Previously, it would always yield 0x1 pointers (as references) for ZST,
including types like [SomeType; 0]. They now use an aligned non-null
pointer.

Expose std::ptr::dangling/_mut as new unstable functions with intent
on later stabilization, because if this low level trick is needed in
collections, the rest of the Rust ecosystem will need them as well.

@rust-highfive
Copy link
Collaborator

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@bluss
Copy link
Member Author

bluss commented Oct 25, 2017

Choices, choices.

  1. We could have two versions, one for const and one for mut? I don't know if we should bother with that.
  2. Should it be a static method on raw pointers instead of a free function?

@Gankra
Copy link
Contributor

Gankra commented Oct 25, 2017

This seem fine. API should exactly match ptr::null(_mut), imo.

@petrochenkov
Copy link
Contributor

empty sounds too benign, IMO, almost like default or null, as if it were something valid and not a hack to workaround non-null/alignment attributes.

@bluss
Copy link
Member Author

bluss commented Oct 25, 2017

@petrochenkov I see that, maybe "dangling" is better? empty follows the existing Unique/Shared method names.

@petrochenkov
Copy link
Contributor

@bluss
invalid_nonnull_aligned? 😄

@bluss
Copy link
Member Author

bluss commented Oct 25, 2017

Is std::ptr::nonnull() too confusing? It could std::ptr::dummy() instead, for the dummiest but simplest name.

@Gankra
Copy link
Contributor

Gankra commented Oct 25, 2017

I vote dangling if we can't use empty. It's clearly !!!scary!!! but easy to learn and remember.

@kennytm kennytm added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 25, 2017
@Mark-Simulacrum Mark-Simulacrum added the relnotes Marks issues that should be documented in the release notes of the next release. label Oct 25, 2017
@Mark-Simulacrum
Copy link
Member

Marking as relnotes due to the potential breakage with ZST iterators.

@bluss bluss changed the title Add function core::ptr::empty<T>() -> *mut T Add functions core::ptr::dangling/_mut<T>() -> *const/*mut T Oct 25, 2017
@bluss
Copy link
Member Author

bluss commented Oct 25, 2017

Rewritten to use std::ptr::dangling/_mut.

///
/// This function is safe to call, but the raw pointer it returns does not
/// point to any value.
pub fn dangling<T>() -> *const T {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this function be const, just like null()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah great idea, align_of is const so it should be no problem.

@bluss
Copy link
Member Author

bluss commented Oct 25, 2017

Renaming Unique::empty() / Shared::empty() to the new "dangling" names should probably happen as well, but it could be another PR?

@bors
Copy link
Contributor

bors commented Oct 27, 2017

☔ The latest upstream changes (presumably #45285) made this pull request unmergeable. Please resolve the merge conflicts.

@durka
Copy link
Contributor

durka commented Oct 29, 2017

Please add #[rustc_const_unstable] attributes (we should add this to the unmarked-api lint).

@durka
Copy link
Contributor

durka commented Oct 29, 2017

Or maybe we can hold off adding those until the functions themselves are stabilized?

/// # Safety
///
/// This function is safe to call, but the raw pointer it returns does not
/// point to any value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This is not necessarily true. For types with very large alignments, this may point to outside the 0th page (the page that’s guaranteed to be unmapped on most systems) and actually point to something (e.g. manually mapped 1st page). I would reword this to not mention value validity and rather mention that dereferencing the pointer is invalid, or something along the lines.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first draft was that dereferencing the pointer is invalid. While that's true in the "read memory" sense, in the higher level Rust semantic, dereferencing a pointer to ZST is fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄 I'm having trouble finding the right words that navigate all of this.

Copy link
Member Author

@bluss bluss Nov 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think any integer_constant as *const T pointer is invalid to use by these rules, since it's not based on anything http://llvm.org/docs/LangRef.html#pointer-aliasing-rules

So it must be treated as "truly dangling" or use some highly implementation defined way to access an actual value through that pointer.

Copy link
Member Author

@bluss bluss Nov 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that sense, it seems best to just say it does not point to any value. If it's equal to a bona fide pointer to a value, that's incidental?

Copy link
Member

@nagisa nagisa Nov 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think any integer_constant as *const T pointer is invalid to use by these rules

This isn’t exactly true, because pretty much any low level code that depends on stuff at well known addresses would be invalid (embedded/low level/OS code relies heavily on that, for example). I believe this rule from the page you’ve linked applies here:

An integer constant other than zero or a pointer value returned from a function not defined within LLVM may be associated with address ranges allocated through mechanisms other than those provided by LLVM. Such ranges shall not overlap with any ranges of addresses allocated by mechanisms provided by LLVM.

This rule does not forbid creation and use of such address ranges, it only forbids overlap with address ranges generated by concepts understood by LLVM (alloca, malloc, ...).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I have pushed a change that has no Safety section, because I have no useful advice apart from what's already in the other doc.

@bluss
Copy link
Member Author

bluss commented Nov 2, 2017

Updated to simplify docs after discussion with @nagisa. We can't say so much more than non-null and aligned about the return value.

@nagisa
Copy link
Member

nagisa commented Nov 3, 2017

I have reviewed the functional changes, but am uncomfortable taking a decision whether the breakage this potentially causes is acceptable. I feel that it is, but people "higher up" might feel otherwise.

@bluss
Copy link
Member Author

bluss commented Nov 4, 2017

Box of Zst has already made the corresponding breaking change, and this catches up with that in the slice iterator.

@Gankra
Copy link
Contributor

Gankra commented Nov 4, 2017

Yeah when I changed Vec I think I shook the ecosystem loose from expecting the old heap::EMPTY (they now check <= alignof).

@shepmaster shepmaster added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Nov 11, 2017
@shepmaster
Copy link
Member

Randomly reassigning to...

r? @alexcrichton

@rust-highfive rust-highfive assigned alexcrichton and unassigned aturon Nov 11, 2017
@alexcrichton
Copy link
Member

I think I may not be quite up to speed on this area, but mind explaining again the motivation behind these functions? For example is the current iteration today incorrect? Memory unsafe?

@bluss
Copy link
Member Author

bluss commented Nov 12, 2017

Shared/Unique (So Rc, Box, Vec etc) had the corresponding change of pointer value in #41064. @gankro would know most.

My idea is that this is something we're gradually making more correct so that we end up more consistent. For example that even ZST pointer values are aligned. As usual, it's hard to find any memory safety issues with any pointer to ZST.

I think that if Box<[T; 0]> has a pointer that is aligned for [T; 0], then std::slice::Iter<[T; 0]> should emit references (&[T; 0]) that match that. This PR changes the slice iterator in that way.

@alexcrichton
Copy link
Member

I feel like internal changes are fine, but I'm just wary to increase the API surface area of the standard library because it "makes us feel good" rather than having a technical driver under the hood?

@scottmcm
Copy link
Member

For example is the current iteration today incorrect? Memory unsafe?

I think that, technically, we're currently relying on UB. This code

unsafe { std::ptr::read(1 as *const [i64;0]); }

gets trans'd as

call void @llvm.memcpy.p0i8.p0i8.i64(i8* %3, i8* %4, i64 0, i32 8, i1 false)

specifying an alignment to the llvm intrinsic that's a lie.

Probably the pass ordering means that the 0-byte memcpy is DCE'd before any kind of value propagation can assume the low three bits of the pointer are zero, but I'm pretty sure LLVM would be allowed to replace this code with unreachable since "if the call to this intrinsic has an alignment value that is not 0 or 1, then the caller guarantees that both the source and destination pointers are aligned to that boundary".

@Gankra
Copy link
Contributor

Gankra commented Nov 14, 2017

It can be considered a bug that accesses to ZSTs ever make it to llvm.

@nagisa
Copy link
Member

nagisa commented Nov 15, 2017 via email

@alexcrichton
Copy link
Member

@scottmcm thanks for the info! That definitely sounds bad and like possible UB. I agree with @gankro though in that we probably shouldn't be generating that in the first place?

In that sense I'm still a little unclear on the motivation here in the sense that there doesn't seem to be a fundamental technical reason for why we would want such functions.

@bluss
Copy link
Member Author

bluss commented Nov 18, 2017

Moving towards aligned references is something we should do, now that it has begun. The soft thing about that is that it's a gradual process, tightening the screws in various places before we are completely in shape.

The motivation for the functions was the rather generic one: Expose std::ptr::dangling/_mut as new unstable functions with intent on later stabilization, because if this low level trick is needed in collections, the rest of the Rust ecosystem will need them as well. I sounds like it's too much of a bother to do it that way. In this case we can easily abstain since users can reimplement the function instead.

@scottmcm
Copy link
Member

I do think it makes sense for the rule to be "reading through a pointer requires that it's aligned", without a "unless the size is zero" exception. And whether the memcpy is right or not, it feels like a ZST load ought to trans as something, if only an assume(p != 0).

Whether there's a need for these methods to be stabilized is a fair question, though. Something similar to dangling() can even be done as [].as_ptr() these days...

@alexcrichton
Copy link
Member

@bluss I'm afraid that doesn't answer my question though? I still don't know why we want to move towards aligned raw pointers in the first place?

@scottmcm but this still doesn't feel like a technical argument in favor of more API surface area in the standard library? It would seem natural to me that a ZST does absolutely nothing in trans (no instructions emitted) so in that sense I'm still unsure on the motivation to add new stable surface area to the standard library.

@bluss
Copy link
Member Author

bluss commented Nov 18, 2017

@alexcrichton I see. It's not about raw pointers, it's to be used by all the other ones (Unique, Shared, &, etc).

@alexcrichton
Copy link
Member

Ok I think that makes sense to me! So we'll be handing out safe references from those types and those require valid alignment? (in that &T should always be aligned for any T).

Right now though you can't pull a "safe" reference without using unsafe, though?

@bors
Copy link
Contributor

bors commented Nov 21, 2017

☔ The latest upstream changes (presumably #45039) made this pull request unmergeable. Please resolve the merge conflicts.

Add functions to create a dangling aligned non-null raw pointer. The
use case is to fix all the places we use "1 as *mut T" is used, in
particular in the slice iterators.

The aligned pointer has a value like 0x1, 0x4, 0x8 etc. depending on the
alignment of the pointed-to type. It is useful when a non-null pointer
is required like in references, slices, boxes, and other places.

NOTE: This changes the observable behaviour of slice iterators.
Previously, it would always yield 0x1 pointers (as references) for ZST,
including types like `[SomeType; 0]`. They now use an aligned non-null
pointer.

Expose std::ptr::dangling/_mut as new unstable functions with intent on
later stabilization, because if this low level trick is needed in
collections, the rest of the Rust ecosystem will need them as well.
@bluss
Copy link
Member Author

bluss commented Nov 21, 2017

(Rebased & bonus squashing into logical parts)

@bluss
Copy link
Member Author

bluss commented Nov 23, 2017

Closing with intent on picking up again after the slice iterator implementation change is worked out. (#46223)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.