RFC: truly unsized types #709

mzabaluev · 2015-01-22T22:16:05Z

Further subdivide unsized types into dynamically-sized types, implementing
an intrinsic trait DynamicSize, and types of indeterminate size. References
for the latter kind will be thin, while allocating slots or copying values
of such types is not possible outside unsafe code.

Rendered

mzabaluev · 2015-01-22T22:24:15Z

Examples where this is needed (part courtesy of @SSheldon on the forum):
#592
SSheldon/rust-objc#6
http://www.reddit.com/r/rust/comments/292h3g/c_structs_ending_in_zerolength_arrays_and_ffi/cigugy6

Cc @brson @Kimundi

ftxqxd · 2015-01-23T22:35:52Z

What exactly is the difference between ‘truly’ unsized types and DSTs? The only difference I can see is that DSTs use fat pointers, while ‘truly’ unsized types use thin pointers. I don’t think that’s a necessary distinction to make, because they both have the same restriction: they cannot be used without being behind a pointer. If we, for example, introduced fat pointers that have more than one word of extra data (a feature I’ve wanted a few times), they wouldn’t need extra traits for each separate size, so why should having zero bytes of extra data be any different?

mzabaluev · 2015-01-24T04:53:21Z

> The only difference I can see is that DSTs use fat pointers, while ‘truly’ unsized types use thin pointers.

DSTs are dynamically sized types, meaning that the size of the value is known while the value exists. To reconstruct a reference to a DST from a raw pointer, one has to obtain the size. This RFC proposes to lift this restriction in cases when the size of the holistic value is not needed immediately or is unknown.
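As an illustration of the restriction described above, here is a minimal sketch using only the standard library. The helper names `round_trip` and `reference_sizes` are made up for this example. It shows that a slice reference cannot be rebuilt from the thin data pointer alone: the length has to be supplied again.

```rust
use std::mem::size_of;
use std::slice;

/// Splits a slice reference into its thin data pointer and its length,
/// then rebuilds the reference. The length must be passed back in,
/// because the thin pointer does not carry it.
fn round_trip(data: &[u8]) -> &[u8] {
    let ptr: *const u8 = data.as_ptr();
    let len = data.len();
    // SAFETY: `ptr` and `len` come from a live slice that stays borrowed
    // for the lifetime of the returned reference.
    unsafe { slice::from_raw_parts(ptr, len) }
}

/// Returns (size of a thin reference, size of a slice reference) in bytes.
fn reference_sizes() -> (usize, usize) {
    (size_of::<&u8>(), size_of::<&[u8]>())
}
```

On the usual targets the slice reference is two words wide (data pointer plus length) and the plain reference is one word wide.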

Ericson2314 · 2015-01-25T19:26:18Z

A huge area this would help is disambiguating function pointers and functions. Basically it would be cool if fn(A..) -> B were the type of functions themselves, and &'a fn(A..) -> B the type of function pointers, 'static being the common case. While I prefer this for elegance alone, it could potentially really help with safe dynamic linking, etc.

There was a thread around this, but I'm afraid I can't find it. IIRC @eddyb said it wouldn't work exactly because of the unsized vs dynamically sized problem.
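For comparison, here is a small sketch of how Rust as it later stabilized separates the two notions. It is not the design proposed in this comment: there, fn(A) -> B stays the pointer type, and each function item gets its own anonymous zero-sized type. The function names are illustrative only.

```rust
use std::mem::size_of_val;

fn add_one(x: i32) -> i32 {
    x + 1
}

/// Size of the function item value itself. The item has a unique,
/// zero-sized type, so nothing needs to be stored to call it.
fn fn_item_size() -> usize {
    size_of_val(&add_one)
}

/// Size of a function pointer obtained by coercing the item to `fn(i32) -> i32`.
fn fn_pointer_size() -> usize {
    let p: fn(i32) -> i32 = add_one;
    size_of_val(&p)
}
```

Neither type describes the machine code of the function as a value of unknown size, which is what the idea above would need.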

mzabaluev · 2015-01-25T21:31:31Z

@Ericson2314: you might be referring to #661.

Ericson2314 · 2015-01-26T22:51:09Z

Hmm, that didn't have the conversation with eddyb I remember, but that's definitely one example of people wanting safer dynamic linking. It might have been a conversation about JITing, where perhaps this is even more useful (I've never heard of unlinking dynamically linked libraries).

ftxqxd · 2015-01-26T23:09:34Z

@Ericson2314 Is this discuss post on fn lifetimes the discussion you were thinking about?

Ericson2314 · 2015-01-26T23:21:16Z

@P1start well, not exactly as I remembered, but that's probably it. Thanks!

Diggsey · 2015-01-29T02:39:52Z

This is a bit of a crazy idea, but you could give DynamicSize an associated type (Ptr) which the compiler would automatically use for references to that type. This would eliminate the need for special handling of slices and trait objects within the compiler.

This type would fulfill the role of a raw pointer, *T, by implementing Deref, from which the compiler constructs an implicit reference type &T that enforces the correct borrow rules but otherwise acts like the Ptr type. A &T then implicitly converts only to DynamicSize::Ptr, but not to *T.

The Unsized bound is equivalent to DynamicSize<Ptr = *T>, ie. a raw pointer.

eddyb · 2015-01-29T04:27:02Z

My understanding of "unsized" is that it is the result of "unsizing", which turns static type info into dynamic values. Not that I have a better name for the types that lack any size information whatsoever.

Kimundi · 2015-01-29T09:21:26Z

I agree with @Diggsey that a generalization of DST metadata might be more worthwhile than adding special cases to the possible DST values. At least, if those special cases add additional syntax and semantics, like in this proposal.

mzabaluev · 2015-01-29T10:17:32Z

@Kimundi, what additional syntax are you referring to? The only visible changes this proposal adds are a new marker type and the DynamicSize trait, both pretty conventional as per the current syntax.

Semantics do change, but I think there are two different use cases currently lumped in with Sized:

1. The Sized bound tells that the value has a statically known size, so it can be copied or moved.
2. Lifting the Sized bound to accommodate DSTs, while keeping the assumption that the size of the value is known at runtime.

I'm not entirely sure that case 2 is a real concern, since an implementation of a generic trait parametric on an ?Sized type would need to specify the DST in contexts where the size is needed. So there may be little need to use the DynamicSize bound explicitly. However, I haven't gone over this in a formal way and I'm not familiar with the intricacies of the type system, so your help is appreciated with proving or disproving my assumption.
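The two cases can be sketched with the standard library as follows. The function names `static_size` and `runtime_size` are invented for this example. It relies on `size_of_val` accepting `?Sized` arguments, which is the behavior in current Rust.

```rust
use std::mem::{size_of, size_of_val};

/// Case 1: `T` is implicitly `Sized`, so its size is a compile-time constant.
fn static_size<T>() -> usize {
    size_of::<T>()
}

/// Case 2: the `Sized` bound is lifted, but the size is still assumed to be
/// recoverable at runtime from the (possibly fat) reference.
fn runtime_size<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}
```

For instance, `runtime_size("hello")` reads the length stored in the `&str` fat pointer, while `static_size::<u64>()` needs no value at all. A type of indeterminate size in the sense of this RFC would fit neither function.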

Diggsey · 2015-01-29T17:19:17Z

@mzabaluev
My suggestion is in no way an objection to this RFC - I think this has a chance of being accepted for 1.0 which would be great, while mine obviously doesn't, and I think with this RFC in place, expanding on DynamicSize can be done in a backward compatible way.

With regard to case 2: there was a PR somewhere to allow "size_of_val" and friends to work on DSTs, so the distinction between DynamicSize and !Sized is definitely needed, although there's still a bit of a grey area between the cases of "raw pointer, but can figure out the size at runtime (not necessarily in an efficient way)" vs "raw pointer, can't figure out size/type has no concept of size".

Kimundi · 2015-01-29T19:16:28Z

@mzabaluev: I meant the addition of the additional marker types and traits as additional "syntax".

In my opinion, there is no difference between unsized and dynamically sized - in both cases being sized refers to knowing the size at compile time, and in both cases there is some runtime mechanism for finding it out.

Whether that runtime mechanism involves storing the size directly (slices), or in form of a vtable (trait objects), or as part of the data structure pointed-at (CStr) seems unrelated to that core distinction to me.
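The three mechanisms listed above can be sketched like this. The helper names are hypothetical, and the C string case scans memory by hand to mimic a thin-pointer representation (in current Rust, `&CStr` happens to be a fat pointer, so this is only an emulation).

```rust
use std::ffi::CStr;
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

/// Slice: the element count is stored in the fat pointer, so this is O(1).
fn slice_size(s: &[u32]) -> usize {
    s.len() * size_of::<u32>()
}

/// Trait object: the size is read from the vtable in the fat pointer, also O(1).
fn trait_object_size(obj: &dyn Debug) -> usize {
    size_of_val(obj)
}

/// C string: the size is encoded in the data itself, so finding it means
/// scanning for the NUL terminator, which is O(N).
fn c_string_size(s: &CStr) -> usize {
    let ptr = s.as_ptr();
    let mut n = 0;
    // SAFETY: a `CStr` always contains a NUL terminator within its allocation,
    // so the scan stops before leaving valid memory.
    while unsafe { *ptr.add(n) } != 0 {
        n += 1;
    }
    n + 1 // count the terminator as part of the value
}
```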

I'm not saying that a DST value with a thin pointer representation is not useful, I just don't think it needs to be its own "thing".

Diggsey · 2015-01-29T21:55:14Z

@Kimundi I think there is a useful difference between thin/fat pointer DSTs, which is that thin pointers can be transmuted between each other, and passed to C/C++ code as a raw pointer. It's quite easy to come up with examples where the generic constraints should be "is a thin pointer" rather than "is not a DST" (for example, compatibility with void*).
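A rough sketch of that void* compatibility, under the assumption that "thin" is approximated by today's implicit `Sized` bound (which is stricter than what is being asked for here). `Opaque`, `to_void` and `from_void` are names invented for the example.

```rust
use std::os::raw::c_void;

/// Common FFI stand-in for a foreign type whose size Rust does not know.
/// It is technically `Sized` (zero bytes), which is exactly the mismatch
/// this RFC is about.
#[repr(C)]
pub struct Opaque {
    _private: [u8; 0],
}

/// Erases a thin reference to a `void*`-style pointer.
fn to_void<T>(r: &T) -> *const c_void {
    r as *const T as *const c_void
}

/// Recovers a thin reference from a `void*`-style pointer.
///
/// # Safety
/// `p` must point to a live, properly aligned `T` that outlives `'a`.
unsafe fn from_void<'a, T>(p: *const c_void) -> &'a T {
    unsafe { &*(p as *const T) }
}
```

The same round trip does not work for `&[T]` or `&str`, because the length stored in the fat pointer would be lost in the cast.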

Ericson2314 · 2015-01-29T22:09:11Z

> In my opinion, there is no difference between unsized and dynamically sized - in both cases being sized refers to knowing the size at compile time, and in both cases there is some runtime mechanism for finding it out.

It was my understanding that this would support cases where you couldn't find out. This is needed for functions.

mzabaluev · 2015-01-29T22:53:35Z

@Kimundi even the types with a size that can be calculated from content may have sufficiently different performance characteristics for this operation (e.g. O(N) for C strings vs O(1) for DSTs) so genericity may not be desirable. Anyway, there doesn't seem to be a generic way to calculate the size of a DST value, so the only difference is whether a fat pointer is required to represent a reference.

mzabaluev · 2015-01-30T09:09:01Z

@Diggsey If I understood your proposal about DynamicSize::Ptr correctly, the DynamicSize trait is meant for the DSTs in their current form only, and a thin-pointer requirement would still need a negative bound on that (or a positive bound on its complement provided by the compiler), right?

Diggsey · 2015-01-30T16:58:19Z

@mzabaluev Originally, I was thinking of something like this:

All types become either Sized or DynamicSize, depending solely on whether their size is known at compile-time.
DynamicSize has an associated type, Ptr
What used to be "Unsized" is now just "DynamicSize<Ptr = *T>", ie. uses a thin pointer. (edit: it's been pointed out that even *T is not always a thin pointer, see below for alternative)

However, it might make more sense like this:

The Unsized trait has an associated type, Ptr
All types whose size is known at compile-time, implement Sized, unless they opt-out
Unsized is implemented automatically for all T: Sized, with Ptr = *T
All other types must implement Unsized directly
Some !Sized types may implement RuntimeSized, which has a method to calculate the size of a value at runtime. This would include all current DSTs.
All Sized types implement RuntimeSized automatically.

So the useful bounds become:

Has thin pointer => size_of(<T as Unsized>::Ptr) == size_of(isize)
Has compile-time size => T: Sized
Has runtime size => T: RuntimeSized
And the complements of the above, using !

And for any type, its pointer type can be obtained via:

<T as Unsized>::Ptr
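A user-level approximation of the "has thin pointer" check might look like the sketch below. `PointerRepr` is a hypothetical stand-in for the proposed Unsized trait with its associated type Ptr, implemented by hand for a few types. The compiler support described above does not exist.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the proposed trait: names the raw pointer
/// representation used for values of the implementing type.
trait PointerRepr {
    type Ptr;
}

impl PointerRepr for u32 {
    type Ptr = *const u32;
}

impl PointerRepr for str {
    type Ptr = *const str;
}

impl<T> PointerRepr for [T] {
    type Ptr = *const [T];
}

/// The "has thin pointer" condition, checked as a runtime expression
/// because it cannot be written as a trait bound.
fn has_thin_pointer<T: ?Sized + PointerRepr>() -> bool {
    size_of::<T::Ptr>() == size_of::<isize>()
}
```

Here `has_thin_pointer::<u32>()` holds, while the `str` and `[T]` cases do not, since raw pointers to those types are fat as well.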

SSheldon · 2015-01-30T17:10:39Z

> "DynamicSize<Ptr = *T>", ie. uses a thin pointer

@Diggsey, fyi *T is not guaranteed to be a thin pointer; for types with fat references, *T is also fat (like *str, *Trait, *[T])

Diggsey · 2015-01-30T17:17:30Z

@SSheldon Ah, I didn't realise that - I've updated my previous post to reflect that.

mzabaluev · 2015-01-30T22:40:41Z

@Diggsey:

> Has thin pointer => size_of(<T as Unsized>::Ptr) == size_of(isize)

That's quite a mouthful, and I don't think current Rust allows expressions as bounds. But if bounds like that could actually be used, a convenience trait could be provided to assert it.

alexcrichton · 2015-02-03T22:42:23Z

Given #738 it looks like we may end up removing all of the various marker structs, in which case adding a new NotSized may stick out a bit.

Perhaps we could beef up the compiler to consider types unsized such as:

struct CStr {
    data: c_char,
    marker: Phantom<[c_char]>,
}

In this case we'd basically be saying that, for type analysis, the compiler should consider CStr as containing [c_char], but for representation purposes it only has one c_char field.

(just a thought)

pnkfelix · 2015-02-05T21:45:56Z

postponing for post 1.0; cannot spend time thinking about this.

(Also, RFC is thin on details, at least for a change of this size.)

Added RFC: truly unsized types

44fae70

mzabaluev added 3 commits January 23, 2015 00:46

Properly link to the negative bounds RFC PR

ffe327b

Editorial

9ed6176

More editorial

0538214

mzabaluev mentioned this pull request Jan 26, 2015

RFC: Convention for constructing lifetime-bound values from raw pointers #556

Merged

dgrunwald mentioned this pull request Jan 27, 2015

Empty structs should not need #[repr(C)] #750

Closed

SSheldon mentioned this pull request Jan 31, 2015

mem::swap doesn't work with Objects SSheldon/rust-objc#6

Open

pnkfelix closed this Feb 5, 2015

pnkfelix added the postponed RFCs that have been postponed and may be revisited at a later time. label Feb 5, 2015

pnkfelix mentioned this pull request Feb 5, 2015

RFC: truly unsized types #813

Closed

nikomatsakis mentioned this pull request Mar 23, 2015

RFC: Function pointers reform #996

Closed

strega-nil mentioned this pull request Mar 3, 2016

Custom Dynamically Sized Types for Rust #1524

Closed

petrochenkov added T-lang Relevant to the language team, which will review and decide on the RFC. and removed postponed RFCs that have been postponed and may be revisited at a later time. labels Feb 24, 2018
