Generic Pointer to Field #2708

Closed
wants to merge 30 commits into from
Closed


RustyYato

@RustyYato RustyYato commented Jun 5, 2019

Rendered

This RFC aims to provide a generic way to talk about the fields on types! Then go one step further and allow smart pointers to project to those fields!

struct Foo { bar: Bar }
let x: Pin<&Foo> = ...;
let y = x.project(Foo.bar);

Here is a very alpha crate that implements some of the ideas laid out in this RFC.

@RustyYato
Author

This RFC had some previous discussion here on internals

I would like to focus the discussion first on whether we actually need this feature in the first place.
I think we do, as it makes smart pointers first-class in a way that only references and Box<_> currently are, i.e. only references and Box<_> are allowed to project to fields; everything else must go through them first. This RFC is the first step towards changing that.
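
To make that asymmetry concrete, here is a minimal sketch of what works today. The `Foo`/`Bar` definitions and the function names are made up for illustration: a reference and a `Box<_>` can reach the field directly, while an `Rc<_>` can only hand out a plain reference to it, so there is no way to end up with an `Rc<Bar>` that shares the allocation.

```rust
use std::rc::Rc;

// Illustrative types only; they are not part of the RFC text.
struct Bar(u8);
struct Foo {
    bar: Bar,
}

// A reference projects to a reference to the field.
fn through_ref(foo: &Foo) -> &Bar {
    &foo.bar
}

// `Box` is special-cased by the compiler: a field can be moved out of it.
fn through_box(foo: Box<Foo>) -> Bar {
    foo.bar
}

// `Rc` has to go through `Deref` first, so the result is only a borrowed
// `&Bar`, never an owning `Rc<Bar>`.
fn through_rc(foo: &Rc<Foo>) -> &Bar {
    &foo.bar
}
```

The RFC's `project` would let the last case (and `Pin`, raw pointers, and so on) return the smart pointer's own kind of handle to the field instead of falling back to a reference.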

@Diggsey
Contributor

Diggsey commented Jun 6, 2019

I wrote this crate which seems to do pretty much the same thing: https://crates.io/crates/field-offset

@RustyYato
Author

RustyYato commented Jun 6, 2019

Cool, unfortunately according to the current unsafe guidelines your crate is unsound to use on any type that contains references or NonZero* types because of the call to std::mem::zeroed(), which is insta-UB. Otherwise it looks like we came up with similar formulations of the solution.

I hadn't thought about chaining field offsets, but that doesn't seem to be mission critical, so we can decide on that later.

What do you think of this RFC?

@Diggsey
Contributor

Diggsey commented Jun 6, 2019

Cool, unfortunately according to the current unsafe guidelines your crate is unsound to use on any type that contains references or NonZero*

The crate was written before the existence of the unsafe code guidelines when this was considered a valid pattern (the variable is never accessed). In practice, this does not cause any problems, but it will be nice to be able to stop relying on implementation details of the compiler once MaybeUninit is stabilised.

What do you think of this RFC?

One of the considerations for whether something should be added as a language feature, particularly a complex and niche one such as this, is whether it can be implemented as a library (either as part of the standard library or a separate crate).

It would be nice to highlight in the RFC what the benefits are of having this be a language feature as opposed to a library, and why they justify the additional complexity in the language. The motivation section currently says:

This feature cannot be implemented as a library effectively because it depends on the layouts of types, so it requires integration with the Rust compiler until Rust gets a stable layout (which may never happen).

But this is greatly overstating the impossibility of implementing this in a library.

@RustyYato
Author

RustyYato commented Jun 6, 2019

The crate was written before the existence of the unsafe code guidelines when this was considered a valid pattern

Ok, I just noticed that this crate was last updated 3 years ago.

but it will be nice to be able to stop relying on implementation details of the compiler once MaybeUninit is stabilised.

MaybeUninit won't solve this because the only way to get a member of a type is to have a value, reference or box, all of which need to be a valid instance of the type to work.

particularly a complex

This feature isn't particularly complex to implement, although it does have some far reaching implications. Some of these implications were analyzed on internals, but I suspect that there is more that we may have missed.

The only part that absolutely has to be implemented in the compiler is the Field trait. Everything else is a library built on top of that. In light of this, we could shave this proposal down to just the pointer projections as associated functions on raw pointers, and get rid of the Project trait entirely.

But this is greatly overstating the impossibility of implementing this in a library

How would you get the necessary offsets of fields soundly in light of the current unsafe guidelines? I don't see how.

Keep in mind that the compiler already has all the information available, this is a way of exposing that information safely.

@comex

comex commented Jun 6, 2019

MaybeUninit won't solve this because the only way to get a member of a type is to have a value, reference or box, all of which need to be a valid instance of the type to work.

You can do it through a raw pointer.
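
A sketch of what the raw-pointer route can look like, written with `std::ptr::addr_of!`, which was only stabilized after this discussion (it is the macro form of the raw reference operator mentioned later in the thread). The `Foo` type, the `#[repr(C)]` attribute (used here only so the offset is predictable) and the function name are assumptions for illustration.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of;

// `repr(C)` is assumed here so the field offsets are predictable.
#[repr(C)]
struct Foo {
    a: u32,
    b: i16,
    c: f64,
}

// Computes the byte offset of `Foo::b` without ever creating a `Foo` value
// or a reference to one.
fn offset_of_b() -> usize {
    // Real (but uninitialized) storage, so the pointer is not dangling.
    let storage = MaybeUninit::<Foo>::uninit();
    let base: *const Foo = storage.as_ptr();
    // `addr_of!` yields a raw pointer to the field without creating an
    // intermediate `&Foo` or `&i16`, so no uninitialized memory is read.
    let field: *const i16 = unsafe { addr_of!((*base).b) };
    (field as usize) - (base as usize)
}
```

With this layout `offset_of_b()` evaluates to 4: `a` occupies bytes 0 to 3 and `b` is 2-byte aligned, so it follows immediately.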

@RustyYato
Author

RustyYato commented Jun 6, 2019

@comex How? You can't access the field through a raw pointer without already knowing its offset. If you can, we can close this and implement the whole proposal as a library. But all the libraries that try to provide this either use UB (via null pointer dereference or mem::zeroed/undefined) or require marking types at the definition. The last is unacceptable because it doesn't scale at all: everyone must think about it even if they don't want to.
Also note that C++, a language with fewer restrictions than Rust, built it into the language. This seems to be compelling evidence that it is not possible to do safely.

References:

using null pointer: crate and link to offending code
using mem::zeroed: crate and link to offending code
marking types at definition: crate

I searched through hundreds of crates on crates.io to find these examples.

@ahicks92

ahicks92 commented Jun 6, 2019

If the core issue here is that you need the offset of a field in order to implement something like this as a library, then maybe this should be an offsetof RFC.

As someone who has used more than a little C++, the need for pointers to members comes up rarely if ever. Most of the time it can be done another way (e.g. there's a trick for some trees where instead of having left and right you put them in an array instead, then you can mirror algorithms by passing an additional index. I don't have a reference for this handy). I see why it might be more valuable to Rust, but I feel like in C++ it was added for the sake of completeness and possibly to mollify C programmers who use offsetof tricks in anger. In general I think a lot of early C++ just happened, and that drawing conclusions from it about what is and isn't necessary is a mistake, especially since one of Rust's selling points is not being C++ (even if the complexity is getting up there these days).

If there's going to be a syntax to refer to a field descriptor, the next obvious question from the perspective of someone new to Rust finding this seems to me to be "how do I refer to a field's type?" That might be quite useful in macro land, and something worth doing; reserving Type.field for that might be beneficial.

I feel like motivation and guide-level explanation need to be fleshed out more in the direction of someone wanting to learn what the feature is. Guide-level explanation seems to be reference-level explanation, and motivation is written such that you have to already know why you want the feature in order to understand it (that's not quite right; finding better wording is failing me).

But I do like what this is getting at.

@RustyYato
Author

RustyYato commented Jun 6, 2019

"how do I refer to a field's type?"

Using the Field trait which has the associated type Type that tells what the type of the field is.

I feel like motivation and guide-level explanation need to be fleshed out more in the direction of someone wanting to learn what the feature is.

Yes, they need to be fleshed out more and reworded in general.

@spunit262

An idea just came to me. We already have projection through references via patterns, so we could extend that to pointers.
I'm not actually sure if this is a good idea, but it needs no new syntax, just building on what already exists.
As pointers are not required to point anywhere, any pattern that requires inspecting the pointee would not be allowed.

struct Foo { a: u32, b: i16, c: f64, }

// works today
fn by_ref(Foo { b, .. }: &Foo) -> &i16 { b }

fn by_ptr(Foo { b, .. }: *mut Foo) -> *mut i16 { b }

This gives a safe offset_of, as the pointer is explicitly allowed to not point at a valid object.

macro_rules! offset_of {
    ($t:path => $f:tt) => {
        match core::ptr::null::<$t>() {
            $t { $f: a, .. } => a as usize
        }
    }
}

@petrochenkov
Contributor

Related issue - #1287 (about the Object::field syntax).

It conflicts with associated values indeed, but object.field also conflicts with method calls and we've been living with it mostly successfully so far.
(Some disambiguated syntax for the conflicting cases is probably still needed, similarly to <Type [as Trait]>::Assoc for associated items.)

@Pauan

Pauan commented Jun 6, 2019

It conflicts with associated values indeed, but object.field also conflicts with method calls and we've been living with it mostly successfully so far.

In this proposal it's being used on a type, not a value. So it's not object.field, it's Type.field. So what situations would that cause conflicts?

@petrochenkov
Contributor

@Pauan
That sentence is not talking about this proposal, but about the stable object.field syntax applied to values, which is usually disambiguated successfully, but occasionally you have to write things like (object.field)() to get the desired meaning.

@Pauan

Pauan commented Jun 6, 2019

@petrochenkov Ah, sorry, I misunderstood. Thanks for the correction.

@RustyYato
Author

RustyYato commented Jun 6, 2019

@petrochenkov @spunit262 syntax discussions are off topic for now. I would like to focus on whether this feature is needed and whether it is sound. If the answer to either is no, then this proposal may end up scrapped, so syntax discussions are not productive yet. I used the dot syntax as a placeholder for whatever syntax we end up with.

@eaglgenes101

If the core issue here is that you need the offset of a field in order to implement something like this as a library, then maybe this should be an offsetof RFC.

Rust has been moving in the direction of making unsafe code more ergonomic so that there's less tedium associated with it, and therefore less chance for mistakes. Being able to get a properly offset *mut U field pointer from *mut T directly through projection would be less tedious than having to perform manual pointer offsetting and type cast juggling for the same result.
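
For comparison, a rough sketch of the two styles. It relies on `std::mem::offset_of!` and `std::ptr::addr_of_mut!`, both of which were stabilized after this thread; the `Foo` type and the function names are made up for illustration.

```rust
use std::mem::offset_of;
use std::ptr::addr_of_mut;

struct Foo {
    a: u32,
    b: i16,
    c: f64,
}

/// Manual route: cast to bytes, add the field offset, cast to the field type.
/// Safety (assumed contract): `p` must point into a live allocation of `Foo`.
unsafe fn project_b_manual(p: *mut Foo) -> *mut i16 {
    unsafe { p.cast::<u8>().add(offset_of!(Foo, b)).cast::<i16>() }
}

/// Direct route: name the field and let the compiler do the offsetting and
/// pick the pointee type.
/// Safety (assumed contract): same as above.
unsafe fn project_b_direct(p: *mut Foo) -> *mut i16 {
    unsafe { addr_of_mut!((*p).b) }
}
```

Both functions return the same address; the manual version simply leaves more room to get the offset or the target type wrong.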

@ahicks92

ahicks92 commented Jun 6, 2019

@eaglgenes101
I don't disagree but the core objection raised by @Diggsey is whether this can be implemented as a library. Giving them the ability to implement their crate properly does solve the ergonomics issue for the unsafe code case. There would be slight ergonomics gains if it was in the language, I suppose.

I think the more interesting thing about unsafe code here is that having unsafe code in std, or a compiler intrinsic, or whatever means that it's written by trusted people, by most definitions of trusted, and enables the ecosystem to be 100% safe code for some of these use cases.

@RustyYato
Author

RustyYato commented Jun 6, 2019

The bare minimum proposal that we can strip this RFC down to is this

/// Opaque type that can only be constructed by the compiler.
struct FieldDescriptor<F: Field> { ... }

/// Represents a field on some type, must be implemented by the compiler, and only the compiler
unsafe trait Field {
    /// The type that the field belongs to
    type Parent: ?Sized;
    /// The type of the field itself
    type Type: ?Sized;
    /// Describes the data needed to convert from *[const|mut] Self::Parent to *[const|mut] Self::Type
    const FIELD_DESCRIPTOR: FieldDescriptor<Self>;
}

impl<T: ?Sized> *const T {
    fn project<F: Field<Parent = T>>(self, field: F) -> *const F::Type {
        // safe projection, where invalid pointers are handled in a safe, but implementation defined way
    }
    
    unsafe fn project_unchecked<F: Field<Parent = T>>(self, field: F) -> *const F::Type {
        // unsafe projection, where using an invalid pointer is UB
    }
}

impl<T: ?Sized> *mut T {
    fn project<F: Field<Parent = T>>(self, field: F) -> *mut F::Type {
        // safe projection, where invalid pointers are handled in a safe, but implementation defined way
    }
    
    unsafe fn project_unchecked<F: Field<Parent = T>>(self, field: F) -> *mut F::Type {
        // unsafe projection, where using an invalid pointer is UB
    }
}

Everything else could be punted to a separate library, i.e. everything related to Project. These functions could be implemented as intrinsics or otherwise. These functions will live inside of std::ptr, and the Field trait and FieldDescriptor type can live in either a new module or in std::marker.

I find this to be just the right size for such a niche feature, only a handful of associated functions, a type, and a trait.
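
As a rough illustration of how such a trait could drive the projections, here is a library-only approximation. It is not the proposed design: the trait is implemented by hand rather than by the compiler, a plain `OFFSET` constant stands in for `FIELD_DESCRIPTOR`, only `Sized` types are handled, the projections are free functions instead of methods on raw pointers, and `FooB` is a hypothetical marker type naming the field `Foo.b`. It also uses `std::mem::offset_of!`, which did not exist at the time of this comment.

```rust
use std::mem::offset_of;

/// Hand-written stand-in for the compiler-implemented trait.
unsafe trait Field {
    type Parent;
    type Type;
    /// Byte offset of the field inside `Parent` (stand-in for FIELD_DESCRIPTOR).
    const OFFSET: usize;
}

struct Foo {
    a: u32,
    b: i16,
}

/// Hypothetical marker type naming the field `Foo.b`.
struct FooB;

// SAFETY: `OFFSET` and `Type` really describe `Foo::b`.
unsafe impl Field for FooB {
    type Parent = Foo;
    type Type = i16;
    const OFFSET: usize = offset_of!(Foo, b);
}

/// Safe projection: wrapping arithmetic is defined even for dangling or null
/// pointers; the result just may not be dereferenceable.
fn project<F: Field>(ptr: *const F::Parent, _field: F) -> *const F::Type {
    ptr.cast::<u8>().wrapping_add(F::OFFSET).cast::<F::Type>()
}

/// Unchecked projection: `ptr` must point into a live allocation of `F::Parent`.
unsafe fn project_unchecked<F: Field>(ptr: *const F::Parent, _field: F) -> *const F::Type {
    unsafe { ptr.cast::<u8>().add(F::OFFSET).cast::<F::Type>() }
}
```

A call then looks like `project(&foo as *const Foo, FooB)`, with the field type `i16` inferred from the `Field` impl.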

RustyYato added 2 commits June 6, 2019 10:49
as it controls unsafe code, `Field` must be an unsafe trait
@taiki-e
Member

taiki-e commented Sep 26, 2019

@KrishnaSannasi

@taiki-e yes, sorry my blanket assertion was wrong, pin-utils is indeed safe. But pin-utils doesn't fit the usecase of this RFC: A general way to access any field through any smart pointer, not just pins.

pin-utils::{unsafe_pinned, unsafe_unpinned} is not safe... My link is about pin-project.

This may be fine for non-owning smart pointers like references or pins thereof, but not for owning smart pointers like Rc or Box.

Yeah, but I think GAT is needed to handle references generically.

Also pin project offers a different api from what I was thinking of, by projecting all the fields at once.

Yeah, but if referring to each field, you need to call .as_mut() every time. On the other hand, if the projection method takes &mut Pin<&mut Self> to avoid this, a lifetime problem will occur. (taiki-e/pin-project#65, rust-lang/rust#54934, taiki-e/pin-project#47 (comment))

I think the pin-project's approach is preferred at least until these issues are addressed.
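
A small sketch of the per-field style and the `.as_mut()` reborrow it needs on every access. The types and method names are made up for illustration, and the `unsafe` block assumes `bar` is treated as structurally pinned; this is neither pin-project's generated code nor the RFC's API.

```rust
use std::pin::Pin;

struct Bar {
    hits: u32,
}

struct Foo {
    bar: Bar,
}

impl Foo {
    /// Hand-written projection of a single field. It consumes the `Pin`.
    fn bar(self: Pin<&mut Self>) -> Pin<&mut Bar> {
        // SAFETY (assumed): `bar` is never moved out of a pinned `Foo`.
        unsafe { self.map_unchecked_mut(|foo| &mut foo.bar) }
    }
}

fn touch_twice(mut foo: Pin<&mut Foo>) -> u32 {
    // Each access has to reborrow with `as_mut()`, otherwise the first call
    // would consume `foo`.
    foo.as_mut().bar().hits += 1;
    foo.as_mut().bar().hits += 1;
    // The last use can consume the pin directly.
    foo.bar().hits
}
```

Projecting all fields at once, as pin-project does, avoids the repeated reborrow because a single call hands out every field.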

@RustyYato
Author

@ckaran the raw reference operator is being fast tracked to stabilization because of how critical it is, but I'm not sure when it will land on nightly.

@taiki-e

pin-utils::{unsafe_pinned, unsafe_unpinned} is not safe... My link is about pin-project

sorry, typo on my part.

Yeah, but I think GAT is needed to handle references generically

Nope, we can do this without GAT, I'll post a minimal crate later today. See that for details.

I think the pin-project's approach is preferred at least until these issues are addressed.

This is the sort of stuff that I want to experiment with in a crate before putting it in std.

@ckaran

ckaran commented Sep 26, 2019

@KrishnaSannasi

...minimal crate later today.

Looking forward to seeing it! 👍

@RustyYato
Author

RustyYato commented Sep 26, 2019

@ckaran @taiki-e I have updated this pull request with the crate.

If anyone wants to help improve this crate, I would love the help. Just send in a PR or issue!

@ckaran

ckaran commented Sep 26, 2019

@KrishnaSannasi I plan on looking at it, but will likely be slow; my PhD is taking up a lot of my time right now.

@mjbshaw
Contributor

mjbshaw commented Oct 20, 2019

This RFC would be a lot stronger if it was building on top of a proper compile-time reflection API.

@RustyYato
Author

With a compile-time reflection API and &raw [const|mut] this could be implemented as a library. In fact, we can already almost implement this as just a library.

@ckaran

ckaran commented Oct 21, 2019

@mjbshaw Are you suggesting that this library should be built on top of something like @dtolnay's reflect crate? If so, what changes would you suggest?

@ckaran

ckaran commented Oct 21, 2019

@KrishnaSannasi Do you mind if I make some formatting changes to your crate? It's a little easier for me to read code comments if the comments wrap at 80 characters, but I'm not sure if you're interested in that.

@RustyYato
Author

@ckaran, sure just send in a PR and I'll merge it.

@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Jul 29, 2020
@scottmcm
Member

scottmcm commented Sep 2, 2020

We discussed this in a lang team backlog grooming meeting today.

That ended up in a place that was similar to this comment from @RustyYato:

Given that everything in this RFC could be built on top of raw references I think that we could just build this as a library on top of that. <#2708 (comment)>

More specifically, we thought it would be good for this to follow a similar process to what safe-transmute did: see how far this can get while entirely in a library, then specify exactly what the minimal lang change requested to make this work nicely would be, with the rest of it up to libs to decide how it should look.

@scottmcm
Member

(For extra clarity: from a lang perspective we'd postpone this, but this is also tagged for @rust-lang/libs so I don't want to do that unilaterally.)

@scottmcm
Member

Hmm, well, no comments from libs so I'm going to move to

@rfcbot fcp close

The lang team discussion on this was that we're not ready to take the lang parts of this. We'd like to see how far this can get as a library solution (perhaps based around the raw pointer macros internally) and then might be open to a focused lang change to address a particularly-troublesome part.

@rfcbot
Collaborator

rfcbot commented Nov 10, 2020

Team member @scottmcm has proposed to close this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-close This RFC is in PFCP or FCP with a disposition to close it. labels Nov 10, 2020
@dtolnay
Member

dtolnay commented Feb 10, 2021

Approving on behalf of withoutboats, as per rust-lang/team#526.

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Feb 10, 2021
@rfcbot
Collaborator

rfcbot commented Feb 10, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Feb 20, 2021
@rfcbot
Collaborator

rfcbot commented Feb 20, 2021

The final comment period, with a disposition to close, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC is now closed.

@rfcbot rfcbot added to-announce closed This FCP has been closed (as opposed to postponed) and removed disposition-close This RFC is in PFCP or FCP with a disposition to close it. labels Feb 20, 2021
@rfcbot rfcbot closed this Feb 20, 2021
@matthieu-m matthieu-m mentioned this pull request Sep 30, 2022
2 tasks
Labels
A-expressions Term language related proposals & ideas A-impls-libstd Standard library implementations related proposals. A-product-types Product type related proposals A-raw-pointers Proposals relating to raw pointers. A-traits-libstd Standard library trait related proposals & ideas A-typesystem Type system related proposals & ideas closed This FCP has been closed (as opposed to postponed) finished-final-comment-period The final comment period is finished for this RFC. Libs-Tracked Libs issues that are tracked on the team's project board. T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC. to-announce