Unsized Rvalues #1909

Merged: 4 commits into rust-lang:master on Feb 8, 2018

Conversation

@arielb1 (Contributor) commented Feb 19, 2017

This is the replacement for RFC #1808. I will write the "guaranteed optimizations" section tomorrow.

Rendered

Summary comment from Sept 6, 2017

cc @nikomatsakis @eddyb @llogiq @whitequark @briansmith

@jonas-schievink (Contributor) left a comment


Overall, seems like a step in the right direction. The details still need some fleshing out, of course.

a3) cast expressions
- this seems like an implementation simplicity thing. These can only be trivial casts.
b) The RHS of assignment expressions must always have a Sized type.
- Assigning an unsized type is impossible because we don't know how much memory is available at the destination. This applies to ExprAssign assignments and not to StmtLet let-statements.
Contributor

This got all munched together by Markdown, might want to put it in a list

let x = [0u8; n]; // x: [u8]
let x = [0u8; n + (random() % 100)]; // x: [u8]
let x = [0u8; 42]; // x: [u8; 42], like today
let x = [0u8; random() % 100]; //~ ERROR constant evaluation error
Contributor

This error definitely wants a note explaining what's going on, it looks very confusing otherwise

Contributor

Looks like this limitation can be easily worked around.

let n = 0;
let x = [0u8; n + (random() % 100)]; // OK

Contributor

Agreed, this rule sounds good to me as long as the error message includes something like this:

note: if you want this array's size to be computed at runtime, move the expression into a temporary variable:
    let n = random() % 100;
    let x = [0u8; n];


I think it's better to just get rid of the restriction and allow random() % 100 to work. Programmers have the intuition that one can replace let n = random() % 100; f(n) with f(random() % 100) and IMO it isn't worth fighting that intuition.

In the future Rust should create some equivalent to constexpr for expressions that guarantees that such-marked expressions are evaluated at compile time. That would be a more general mitigation for the concern here.

@petrochenkov (Contributor) Feb 20, 2017

@briansmith
As I understand it, this is mostly an implementation concern.
If a captured local value is the indicator, then the type of the array is immediately known, but detecting "constexprness" for an arbitrary expression may be unreasonably hard to do in time.

@nikomatsakis (Contributor) Feb 21, 2017

Given how hard it was for us to find the [T; n] syntax, I despair of finding another syntax though. =) I guess one question is how often people will want to create VLAs on the stack, and particularly VLAs where the length is not a compile-time constant but also doesn't use any local variables or other inputs!

It seems like access to a static (as opposed to const) could also trigger the "runtime-dependent" rule. For example, the following does not compile today (and rightly so, I think):

static X: usize = 1;

fn main() {
    let v = [0; X];
}

Member

That's tricky, although I'd argue that in a type, e.g. [i32; X], reading the static is perfectly fine and observes the value just-after-initialization (which also makes accessing a static before it's fully initialized a regular on-demand cycle error).

@engstad Feb 23, 2017

For backward compatibility, I would assume that syntax has to be different. I feel that the current proposal is way too similar to the old syntax. As a programmer, I would very much like to be able to search for places where the stack could grow unboundedly; and I think the syntax should help in that effort.

The type syntax is fine. [T] is sufficiently different from [T; N]. The constructor for VLA should reflect that initialization happens at runtime. Perhaps, instead, something along the lines of:

fn f(n: usize) {
    let v = stack_alloc::init::<[u32]>(n, 0);
    let u: [i32] = stack_alloc::init_with(n, |i: usize| 2 * i);
}

Member

As a programmer, I would very much like to be able to search for places where the stack could grow unboundedly; and I think the syntax should help in that effort.

Uh, this also happens everywhere you have recursion. How about having a lint / editor command for this instead?


Sure, unbounded stack-use due to recursion would also be nice to detect, but that's a different issue.

As far as using lint goes, that would do its job, but not everyone lints their code. I'm talking about "understanding what the code does by looking at it". With the current proposal, it is hard to figure out if it is an unbounded allocation or a statically bound allocation on the stack.


However, without some way to guarantee that this can be done without allocas, that might be a large footgun.

One somewhat-orthogonal proposal that came up was to make `Clone` (and therefore `Copy`) not depend on `Sized`, and to make `[u8]` be `Copy`, by moving the `Self: Sized` bound from the trait to the methods, i.e. using the following declaration:
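The declaration itself did not survive in this capture of the thread; a plausible sketch of what moving the bound onto the methods could look like (an assumption for illustration, not necessarily the RFC's exact wording):

```rust
pub trait Clone {
    // The `Self: Sized` bound moves from the trait header onto the method
    // that returns `Self` by value, so the trait itself no longer requires
    // `Sized` and a type like `[u8]` could implement it.
    fn clone(&self) -> Self where Self: Sized;

    fn clone_from(&mut self, source: &Self) where Self: Sized {
        *self = source.clone();
    }
}
```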
Contributor

Just wanted to note that removing a supertrait is a breaking change

Contributor

True, but Sized is...special, in this case.

Also getting rid of the arbitrary limit of 32 array elements would make up for a lot...let's see a crater run before arguing further.


The reference to the concrete type [u8] here is confusing. Did you mean [T] or something different?

Contributor Author

I mean T where T: Copy


In Unsafe code, it is very easy to create unintended temporaries, such as in:
```Rust
unsafe fnf poke(ptr: *mut [u8]) { /* .. */ }
```
Contributor

s/fnf/fn

@Ericson2314 Ericson2314 mentioned this pull request Feb 20, 2017
@Ericson2314 (Contributor) commented Feb 20, 2017

I like the idea of Clone: ?Sized, but I think clone_from can only properly be written with out pointers?

edit Oops, I thought clone_from was a new method. I was thinking we could do something like

// implementation can assume `*self` and `*source` have the same size. 
unsafe fn move_from_unsized(&out self, source: &Self);

This is similar to things that would be useful for emplacement.

@withoutboats withoutboats added the T-lang Relevant to the language team, which will review and decide on the RFC. label Feb 20, 2017
@whitequark (Member)
@arielb1 This doesn't reach being a replacement for #1808, primarily because, as written, the lifetime of the value is only until the end of the block. What are your plans for extending the lifetime of the value past that?


length: len_,
data: *s
});
```


I think this really needs to be supported:

#[derive(Copy, Clone)]
struct Elem {
    // ...
    value: [T],
}

I understand the concern expressed above about uses of Box adding surprising allocas. However, I think this feature is actually more likely to be used by code, like mine, that is trying to avoid Box completely. In particular, if we're willing/able to use the heap instead of the stack then we don't miss stack-allocated VLAs nearly as much.

@arielb1 (Contributor Author) Feb 20, 2017

Clone::clone has Self: Sized, so this should work depending on the intelligence of derive(Clone) (and more to the point, implementing Clone/Copy yourself would work).

Contributor

@arielb1

Clone::clone has Self: Sized, so this should work depending on the intelligence of derive(Clone) (and more to the point, implementing Clone/Copy yourself would work).

I'm confused by this. How could this work unless we modify the Clone trait?

});
```

However, without some way to guarantee that this can be done without allocas, that might be a large footgun.
@briansmith Feb 20, 2017

The problem with large allocas is already a problem today without this feature. In fact this feature helps resolve the existing problem.

Consider:

#[allow(non_snake_case)] // Use the standard names.
pub struct RSAKeyPair {
    n: bigint::Modulus<N>,
    e: bigint::PublicExponent,
    p: bigint::Modulus<P>,
    q: bigint::Modulus<Q>,
    dP: bigint::OddPositive,
    dQ: bigint::OddPositive,
    qInv: bigint::Elem<P, R>,
    qq: bigint::Modulus<QQ>,
    q_mod_n: bigint::Elem<N, R>,
    r_mod_p: bigint::Elem<P, R>, // 1 (mod p), Montgomery encoded.
    r_mod_q: bigint::Elem<Q, R>, // 1 (mod q), Montgomery encoded.
    rrr_mod_p: bigint::Elem<P, R>, // RR (mod p), Montgomery encoded.
    rrr_mod_q: bigint::Elem<Q, R>, // RR (mod q), Montgomery encoded.
}

Oversimplifying a bit, each one of those values is ideally a [u8] with a length between 128 and 512 bytes. But instead, I am currently in the process of making them [u8; 1024] because Rust doesn't have VLAs (other parts of my code need 1024-byte values for these types, but the RSAKeyPair operates on shorter values). So I'd already need about 13KB or 26KB of stack space when constructing one of these today. VLAs would actually reduce the expected stack space usage, even without implementing any sort of solution to this problem.

@eternaleye Feb 20, 2017

@briansmith: So, I don't think the RSAKeyPair use case is actually viable even with the most generous expansion of this, because it not only needs VLAs, but also requires a struct with more than one unsized member.

This is something completely unsupported in Rust at present, poses major layout difficulties, would require multiple different runtime sizes (which is a big thing to extend fat-pointers to do), and I'm hugely unsure if LLVM will even permit this without a great deal of effort, as Clang doesn't even support VLAIS, a notable hole in its GCC compatibility.

I suspect that use case is considerably better handled by const-dependent types, and possibly a later extension of DST to support unsizing a type with a single const-dependent parameter, to a type with none (and a fat-pointer, fattened by that value)


@briansmith: So, I don't think the RSAKeyPair use case is actually viable even with the most generous expansion of this, because it not only needs VLAs, but also requires a struct with more than one unsized member.

Good point that VLAs won't directly solve this problem. However, my main point here is that the problem with Box potentially allocating large numbers of giant values on the stack before they are put on the heap already exists, for any large structure with large elements.

I think the compiler just needs to put in a little effort to ensure that it properly optimizes (minimizes) stack usage for code using this pattern:

let large_vla_1 = ....;
let boxed_vla_1 = Box::new(large_vla_1);
let large_vla_2 = ....;
let boxed_vla_2 = Box::new(large_vla_2);
...
let large_vla_n = ....;
let boxed_vla_n = Box::new(large_vla_n);

In particular, it should be optimized into this:

let boxed_vla_1;
let boxed_vla_2;
...
let boxed_vla_n;
{
    let large_vla_1 = ....; // alloca
    boxed_vla_1 = Box::new(large_vla_1);
} // pop `large_vla_1` off the stack;
{
    let large_vla_2 = ....; // alloca
    boxed_vla_2 = Box::new(large_vla_2);
} // deallocate `large_vla_2`.
...
{
    let large_vla_n = ....; // alloca
    boxed_vla_n = Box::new(large_vla_n);
} // deallocate `large_vla_n`.

Member

@briansmith That's the point of the box expression (#809 / rust-lang/rust#22181).

Contributor Author

@briansmith

Unfortunately, placement of allocas is entirely controlled by LLVM in both of your code cases. I believe that it should generate the "optimal" code in both cases, as long as you don't take references to large_vla_N before boxing them.


@briansmith Would it be possible to solve the problem with raw byte arrays and autogenerated unsafe code?


The way this is implemented in MIR is that operands, rvalues, and temporaries are allowed to be unsized. An unsized operand is always "by-ref". Unsized rvalues are either a `Use` or a `Repeat` and both can be translated easily.

Unsized locals can never be reassigned within a scope. When first assigning to an unsized local, a stack allocation is made with the correct size.
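An illustrative sketch of the rule being quoted, using this RFC's proposed syntax (not valid on stable Rust):

```rust
fn example(n: usize, m: usize) {
    let mut buf: [u8] = [0u8; n]; // first assignment: one alloca of exactly `n` bytes
    buf[0] = 1;                   // using and mutating the unsized local is fine
    // buf = [0u8; m];            // ERROR: reassigning an unsized local is rejected,
                                  //        since its allocation was sized for `n`
}
```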


Why not allow reassignment with the requirement that the size must be the same, with a dynamic check, similar to how copy_from_slice() works?

In my case, I'd have local values of struct Elem { value: [usize] }, and all the local values in a function of type Elem would use the same length for the value, so (hopefully) the compiler could easily optimize the equal-length checks away.

Contributor Author

Because a = b; is not supposed to be doing magic dynamic checks behind your back. You can use copy_from_slice if you want.
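For reference, the explicit alternative being suggested is std's existing slice API, where the length check is visible at the call site:

```rust
fn overwrite(dst: &mut [u8], src: &[u8]) {
    // copy_from_slice panics if dst.len() != src.len(), so the dynamic
    // length check is opted into explicitly rather than hidden in `dst = src`.
    dst.copy_from_slice(src);
}
```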


Because a = b; is not supposed to be doing magic dynamic checks behind your back. You can use copy_from_slice if you want.

a = b[0] does a “magic dynamic check behind your back” and it's already considered acceptable. This isn't any different, except it probably will be easier to optimize away than the array index check.

@eternaleye Feb 20, 2017

Yes, it is different, and no, a = b[0] is neither "magic" nor "behind your back".

It assigns the result of an indexing operation, which has a syntactic representation. All such cases, OpAssign included, have some extra syntax on one side of the assignment or another.

Assignment of VLAs would not, which is what makes it both "magic" and "behind your back".

(And furthermore, it's worth noting that all of the dynamic checks in such cases come from the Op, not the Assign.)

Contributor Author

That's the b[0], you mean? The problem with the check on assignment is that assignment expressions normally can't panic, so a very innocuous-looking assignment statement could cause runtime panics, especially because variants of vec[0..0] = *w; would compile and panic, confusing Python programmers.


The problem with the check on assignment is that assignment expressions normally can't panic, so a very innocuous-looking assignment statement could cause runtime panics, especially because variants of vec[0..0] = *w; would compile and panic, confusing Python programmers.

I think this is also good to note in the RFC.

Would at least the following work?

#[derive(Clone)]
struct Elem([usize]);

I believe Clone for such a type could be implemented manually:

impl Clone for Elem {
    fn clone(&self) -> Self {
        let value_len = self.0.len();
        let mut r = Elem([0; value_len]);
        r.0.copy_from_slice(&self.0);
        r
    }
}

Contributor

This is a tricky thing. In general, unsized return values can't quite work without some additional machinery because the caller has no idea how much space to allocate. In other words, here, how is the caller to know that you are returning an Elem with the same length as the input had?

To handle that properly would I think require making some kind of (erased) parameter that lets you say that the length of the return value is the same as the input. (Similar to how Java wildcard capture can allow you to do some things kind of like that, or full on existentials.) I would be more precise but I'm kind of at a loss for how that would look in Rust syntax at the moment.

A related thought I had once is that we could potentially make a trait with -> Self methods be object safe: this works precisely because the underlying type must be sized, and because we control the impls enough to make sure (statically) that the returned value will always be of the same type that the receiver has (so we can allocate the right amount of space by examining how much space the receiver wants).

I could imagine trying to do something similar but with array lengths.
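A minimal illustration of the `-> Self` idea (hypothetical trait and names; such a trait is not object safe today):

```rust
trait Dup {
    // Every impl returns exactly the receiver's own type, so in principle a
    // `&dyn Dup` caller could size the return slot from the receiver's
    // dynamically known size, as described above.
    fn dup(&self) -> Self;
}

impl Dup for u32 {
    fn dup(&self) -> Self { *self }
}
```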

s1.1
} else {
s2.1
};
Member

Do you mean s1.data here? StringData does not have a .1 member.

}
```

"captures a variable" - as in RFC #1558 - is used as the condition for making the return be `[T]` because it is simple, easy to understand, and introduces no type-checking complications.
Member

I disagree that it is simple to understand; I needed to read the RFC plus comments three times to see why [0u8; random() % 100] is an error while [0u8; n + random() % 100] is fine.

I don't see why #1558's rule should apply to VLAs; there is no difference between random() % 100 and n + random() % 100 that makes the former unsuitable for a VLA.

I'd rather have a core::intrinsic::make_vla::<u8>(0u8, random() % 100) than such a strange rule.

@glaebhoerl (Contributor) Feb 20, 2017

FWIW the "previous" rule suggested by @eddyb was that only array-repeat-literals with their expected type explicitly specified as being [T] would be allowed to be VLAs, that is:

fn foo(n: usize) {
    let x = [0u8; n]; // error
    let x: [u8] = [0u8; n]; // OK
    let x = [0u8; random() % 100]; // error
    let x: [u8] = [0u8; random() % 100]; // OK

    fn expects_slice(arg: &[u8]) { ... }

    expects_slice(&[0u8; n + random()]); // also OK

    fn expects_ref<T: ?Sized>(arg: &T) { ... }

    expects_ref(&[0u8; random()]); // error!

    expects_ref::<[T]>(&[0u8; random()]); // presumably OK!
}

This might be less ergonomic (presumably why the new rule was chosen instead), but I do like the explicitness.

Contributor

I like the explicitness, but I worry this doesn't sit well with the principle that explicit type annotations are "just another" way to constrain type inference, rather than required à la C or Go.

Contributor Author

The "expected type" rule is also fine. Not sure which of these is easier to explain.

Member

I don't understand why we need to make VLAs easy to construct by overloading the [x; n] syntax; it is an advanced feature which normally no one should use.


it is an advanced feature which normally no one should use.

I would be fine with a totally different syntax (in general I concede syntax to others).

I disagree that we should conclude that VLAs are an advanced feature, though. I think that it's better to start off trying to make VLAs as natural and ergonomic as possible, and see if they actually get abused in such a way that people write bad or dangerous (stack-overflowing) code with them. If, based on initial experience, the feature turns out to actually be dangerous enough to justify making it harder to use, then the syntax can be changed.

@arielb1 (Contributor Author) commented Feb 20, 2017

@whitequark

@arielb1 This doesn't reach being a replacement for #1808, primarily because, as written, the lifetime of the value is only until the end of the block. What are your plans for extending the lifetime of the value past that?

I don't see any clear design for that, even in #1808.

@eddyb (Member) commented Feb 20, 2017

@whitequark Something like 'fn would be orthogonal to this (and has to work for Sized values too).

@whitequark (Member)
@eddyb Fair.

@burdges commented Feb 20, 2017

I'm worried about ambiguity from overloading the [T; n] syntax too. Afaik, we do not know exactly how const fn will shake out yet, so [T; foo()] could change types as APIs change elsewhere. And [T; foo!()] has that problem immediately. These errors might be harder to locate than you'd expect. At minimum, there are enough hiccups going back and forth between [T] and [T; n] that one cannot argue that overloading [x; n] makes teaching the language easier.

There is plenty of possible new syntax that avoids any ambiguity in the types: anything like [x| n], [x: n], [x](n), [x][n], [x;; n], [x, n], [x; * n], [x; alloca n], etc. Anything involving a word like alloca, whether a keyword like box, a function like uninitialized, or a macro. Any sort of "slice comprehension notation" that consumes an iterator to make a VLA, like [x : 0..n], [expr(i) for i in 0..n], etc.

In fact, you could easily build a &mut [u8] VLA from a &str with a slice comprehension notation like

let hw = &mut [ x for x in "hello world".as_bytes().iter() ];

Edit You could even make a &mut str this way if you first define unsafe fn from_utf8_unchecked_mut(v: &mut [u8]) -> &mut str { ::core::mem::transmute(v) } but you'd probably wrap all that in a macro called str_dup or something.

@eternaleye commented Feb 20, 2017

@burdges: One option I think would be viable is mut, phrased like your alloca example:

const n = ...;
let m = ...;
let w = [0u8; 3] // [u8; 3] as today
let x = [0u8; n] // [u8; n] as today
let y = [0u8; m] // Error, as today
let z = [0u8; mut m] // [u8], as a VLA
let q = [0u8; mut n] // [u8], as a VLA

This has several advantages:

  1. Const-dependent types seem to have converged on using the const keyword for marking where values are permitted as generic parameters, but only when constant
  2. Arrays really are the one pre-existing const-dependent type
  3. We can't require const for the existing array length without breaking compat, and it makes the common case verbose
  4. mut is the natural inverse of const, is short, is already reserved, and is meaningless in that context today
  5. It's easily extended (at a later date) to allowing (restricted) runtime-dependent types

@ahicks92

@kennytm
I don't think this is an advanced feature that no one should use, I think it is a feature that everyone should use all the time. It's basically C's alloca, which can be and is used often enough.

If the concern is that people will accidentally overflow the stack, yes, they will. But I can already do that in so, so many ways. I don't know what the size of the struct from the library over there is; I don't know what the size of the stack-allocated array is either. How are these cases significantly different?

What I see here is "Here is a thing that can avoid heap allocation for small arrays", and that's brilliant, and I want it.

@petrochenkov (Contributor)
@camlorn
I never expected to see "C's alloca" and "a feature that everyone should use" in one sentence.
What code bases use alloca often enough?

@burdges commented Feb 20, 2017

If one wants the function or macro route, then collect_slice makes a nice name. If intrinsics could return an unsized value, then the signature might look like

fn collect_slice<T,I>(itr: I) -> [T] where T: Sized, I: Iterator<T>+ExactSizeIterator

I think that's equivalent to any sort of comprehension notation without the new syntax being quite so new.

@arielb1 (Contributor Author) commented Feb 20, 2017

@burdges

Ordinary functions can't really return DSTs - you can't alloca into your caller's stackframe, and there's a chicken-and-egg problem where the caller can't figure out how much space to allocate before it calls the function.

@Ericson2314 (Contributor) commented Feb 20, 2017

you can't alloca into your caller's stackframe

I'd think tail calls with dynamically sized args and this are somewhat comparable. Both should be possible in at least some cases. Certainly lots of work and out of scope for the initial version, however.

@briansmith

Ordinary functions can't really return DSTs - you can't alloca into your caller's stackframe, and there's a chicken-and-egg problem where the caller can't figure out how much space to allocate before it calls the function.

This just means that functions that return DSTs need a different calling convention than other functions, right? Callee pops stack frame except for its result, which is left on the stack. The caller can calculate the size of the result by remembering the old value of SP and subtracting the new value of SP.

@arielb1 (Contributor Author) commented Feb 20, 2017

@briansmith

That would require implementing that calling convention through LLVM. Plus, locals in the callee function would get buried beneath the stack frame.

Also, this would mean that the return value is not by-reference, so you would not get the benefits of RVO.

@briansmith

That would require implementing that calling convention through LLVM.

Of course.

To me it seems very natural to expect that Rust would drive many changes to LLVM, including major ones. I think it would be helpful for the Rust language team to explain their aversion to changing LLVM somewhere (not here) so people can understand the limits of what is reasonable to ask for. My own interactions with LLVM people gave me the impression that LLVM is very open to accepting changes, especially for non-Clang languages.

@llogiq (Contributor) commented Feb 20, 2017

For the record, #1808 mandated type ascription for unsized types to steer clear of ambiguity, e.g. let x : [usize] = [0; n];.

I also think requiring the size expression to somehow be special is a bad idea, both design- and implementation-wise.

@glaebhoerl (Contributor)
mut is the natural inverse of const

I... I don't think this is true at all in Rust (and honestly I'm kind of surprised so many people apparently think it is).

The issue is that const ("constant"; "a value that doesn't change") has two potential meanings in a programming language:

  • "Not mutable", or (if applied to a value) "runtime constant". This is what const means in C and C++.

  • "Compile-time constant", that is, a value already known at compile-time. This is what const means in Rust. (And, if my memory's not playing tricks, Pascal?)

I've always disfavored our use of the const keyword for this purpose, for this reason - it means something different than it does in C, which is where most people are familiar with it from.

(...someone at this point is itching to bring up *const. Yes, our usage of const is also inconsistent. Given that *const is most often used for C FFI, I think this syntax is mostly justifiable by thinking of that const there as being "in the C sense", so that our *const means the same thing as their const*.)

Anyway: even if const were to mean the other thing, mut still means "mutable", which is the opposite of "immutable", and not of "compile-time". It very much does not mean "runtime value". This is kind of like suggesting we use "dark weight" to describe something that is difficult to lift because "dark" is the opposite of "light".

@eternaleye

@glaebhoerl: Mm, that's fair. However, on the one hand I do think there's a meaningful relationship there (in a sense, that value is mutable across invocations of the function), and on the other hand, there really isn't a better keyword for it. The closest is probably do, and using that for this feature would likely be a bit... bikeshed-inducing.

@glaebhoerl (Contributor)
that value is mutable across invocations of the function

So are function arguments and lets :)

For that matter, if we have an "opposite of const" it's let.

(I'm not convinced there's a need for any kind of special syntax, though. The meaning of [0; random()] is obvious enough and natural. Whether the runtime characteristics justify putting up extra guardrails remains to be shown - ideally we'd find out from practical experience.)

@burdges commented Feb 20, 2017

I suppose you can always define a macro to populate your VLA from an iterator or closure or whatever, like

macro_rules! collect_slice { ($itr:expr) => {
    {
        let itr = $itr;
        let len = ExactSizeIterator::len(itr);
        let mut vla = unsafe { [uninitialized(): len] };
        for (i,j) in vla.iter_mut().zip(itr) { *i = *j; }
        vla
    }
} }

What about simply [x: n] for a VLA? It parallels [x; n] nicely without the ambiguity. It's even forward compatible with comprehension notations if anybody ever wants those.

@eternaleye

eternaleye commented Dec 2, 2017 via email

@whitequark (Member) commented Dec 2, 2017

To add to this, some architectures do not even have pages at all (and yet still work fine with stack probing), e.g. Cortex-M3 if you put the stack at the bottom of the RAM.

@bill-myers commented Jan 3, 2018

I might have missed a mention of it, but it's important to note that plain "alloca" is not enough to implement this feature.

In particular, plain allocas never get freed until the function returns, which means that if you have an unsized variable declared in a loop, the stack usage will now be proportional to the number of loop iterations, which is catastrophic.

Instead, the stack pointer needs to be rewound every loop iteration (and in general whenever an alloca goes out of scope), which probably requires LLVM changes, although it might be possible to get away with just altering the stack pointer via inline assembly.

Also, this is fundamentally incompatible with 'fn-scoped allocas, so if they are added the language needs to forbid 'fn-scoped allocas when unsized rvalues are in scope.
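A sketch of the loop pattern described above (assumed VLA syntax from this RFC, not valid stable Rust):

```rust
fn process(sizes: &[usize]) {
    for &n in sizes {
        // A fresh unsized local is created each iteration. If its alloca is
        // not released when `buf` goes out of scope, stack usage grows with
        // the number of iterations instead of being bounded by the largest `n`.
        let buf: [u8] = [0u8; n];
        let _ = buf.len();
    }
}
```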

@crlf0710 (Member) commented Jan 3, 2018

@nikomatsakis @eddyb it's been a while since the FCP completed in September; any chance to get this merged? Thanks a lot.

@whitequark (Member)

In particular, plain allocas never get freed until the function returns, which means that if you have an unsized variable declared in a loop, the stack usage will now be proportional to the number of loop iterations, which is catastrophic.

This is not catastrophic. In fact, for certain use cases (using the stack as a local bump pointer allocator) it is necessary and desirable. Unsized rvalues must be used together with lifetime ascription to let the compiler free them.

Instead, the stack pointer needs to be rewinded every loop iteration (and in general whenever an alloca goes out of scope), which probably requires LLVM changes, although it might possible to get away with just altering the stack pointer via inline assembly.

LLVM has @llvm.stacksave and @llvm.stackrestore intrinsics for this.

@aturon (Member) commented Feb 7, 2018

This RFC has been (very belatedly!) merged!

Tracking issue

@aturon aturon merged commit 6c3c48d into rust-lang:master Feb 8, 2018
scottlamb added a commit to scottlamb/moonfire-nvr that referenced this pull request Feb 23, 2018
I mistakenly thought these had to be monomorphized. (The FnOnce still
does, until rust-lang/rfcs#1909 is implemented.) Turns out this way works
fine. It should result in less compile time / code size, though I didn't check
this.
@whitequark whitequark mentioned this pull request May 13, 2018
@TheDan64 commented Jun 5, 2018

As I understand it, a goal of this RFC is to make unsized trait objects possible. So, would this make Vec<Trait> and friends (ie HashMap<X, Trait>) possible? Vec<Box<Trait>> and Vec<&Trait> are both frustrating types to work with today when needing heterogeneous collections.

@Diggsey (Contributor) commented Jun 5, 2018

@TheDan64 no: trait objects are always unsized, and types like Vec cannot work with unsized types because they store their elements in a contiguous block of memory - that's not going to change.

This RFC allows you to pass unsized values (including trait objects) to functions by value (ie. move them into the function) and to store unsized values directly on the stack.
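A short illustration of both points, assuming the nightly feature gates that later implemented this RFC (`unsized_locals` / `unsized_fn_params`); this is a sketch, not stable Rust:

```rust
#![feature(unsized_locals, unsized_fn_params)]

use std::fmt::Display;

// The trait object is passed by value: the caller moves the dynamically
// sized value into the callee, with no Box or reference in the signature.
fn consume(x: dyn Display) {
    println!("{}", x);
}

fn main() {
    let boxed: Box<dyn Display> = Box::new(42i32);
    let local: dyn Display = *boxed; // unsized value stored directly on the stack
    consume(local);                  // and then moved into the function
}
```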

@kennytm (Member) commented Jun 5, 2018

@TheDan64 Note that even if we allowed Vec<dyn Trait>, it still cannot store heterogeneous items (all elements must have the same type, determined by a single vtable stored elsewhere).

@TheDan64 commented Jun 5, 2018

It'd be interesting if somehow Vec<Trait> / Vec<dyn Trait> could use an anonymous enum under the hood, so that all heterogeneous elements of the Vec are the same width without all of the boilerplate of actually creating an enum for every type implementing Trait

@kennytm (Member) commented Jun 5, 2018

That's not possible because you don't know how big the type will be. Consider this, in crate A we have:

pub trait Trait {
    fn do_something(&self);
}
impl Trait for u8 { ... }
impl Trait for u64 { ... }

let mut TRAITS: Vec<dyn Trait> = vec![1u8, 2u64, ...];

And then in crate B, we write:

extern crate crate_a;

struct MyHugeStruct([u64; 8192]);
impl crate_a::Trait for MyHugeStruct {
    fn do_something(&self) {}
}

...

crate_a::TRAITS.push_back(MyHugeStruct([0u64; 8192]));

so how does crate_a know there is going to be a 64 KiB item when allocating the Vec and generating the anonymous enums?

@TheDan64 commented Jun 5, 2018

Good point!

@burdges commented Jun 5, 2018

Is there any facility for Vec<dyn Trait> to enforce that all elements have the same type? I think no because lifetimes are erased too early in the compilation process.

I'd think if Vec<dyn Trait> made sense then it must accept heterogeneous types, so it likely makes no sense. In principle, one could make ArenaVec<dyn Trait> like types that operate as a vector of owning references though.

@eddyb (Member) commented Jun 5, 2018

Is there any facility for Vec<dyn Trait> to enforce that all elements have the same type? I think no because lifetimes are erased too early in the compilation process.

I'd think if Vec<dyn Trait> made sense then it must accept heterogeneous types, so it likely makes no sense. In principle, one could make ArenaVec<dyn Trait> like types that operate as a vector of owning references though.

Where do lifetimes come into it?

My view on it is that Vec<T> is a resizeable Box<[T]>, so the next logical question is: what would [dyn Trait] mean? Well, at any given time, it must be one type, i.e. [T; n] where T: Trait.

Working our way from a more explicit dynamic existential notation, Box<[dyn Trait]> would be roughly sugar for Box<dyn<n: usize> [dyn<T: Trait> T; n]>.
Something very important here is that without indirection (or with this RFC, stack ownership with usage patterns that can be implemented through indirection), dyn can't function, so dyns necessarily "bubble up" to just around the pointer, so our earlier type must be equivalent to dyn<T: Trait, n: usize> Box<[T; n]>, with homogeneous element types, as all static arrays are.

This makes Vec<dyn Trait> be dyn<T: Trait> Vec<T>, which may be useful if you have many values of the same type, but you want to forget the exact type. One real-world use of this that has come up in the past is game engines, where you might want Vec<Vec<dyn Entity>> (or similar), to keep all the values of the same type together, for processing speed and storage benefits.

@burdges commented Jun 5, 2018

I see, so fn foo() -> Vec<dyn Trait> as -> dyn<T: Trait> Vec<T> acts much like -> Vec<impl Trait> in that you get an anonymized type T which you cannot insert, pop, etc., but differs by avoiding monomorphisation on this anonymized type, by instead using a bigger vtable that points to code that knows the size, or maybe looks up the size in the vtable.

As for lifetimes, I'd expect a Vec<Vec<dyn Entity>> requires some reflection to get back the inner Vec<T> for modifications, which breaks if BorrowedCat<'a> : Entity, so you actually want Entity : Any, right?

Now why would Vec<Vec<dyn Entity>> be Vec<dyn<T: Entity> Vec<T>> not dyn<T: Entity> Vec<Vec<T>>? I suppose there must be a rule for this dyn<T: Entity> .. construct so that &'a Trait is dyn<T: Trait> &'a T, MutexGuard<'a,dyn Trait> is dyn<T: Trait> MutexGuard<'a,T>, etc.? It's just building the vtable for the innermost type containing all that particular dyn Trait perhaps? if type Foo<T> = HashMap<T,Vec<T>> then Foo<dyn Trait> is dyn<T: Trait> HashMap<T,Vec<T>>, yes?

@eddyb (Member) commented Jun 5, 2018

I suppose there must be a rule

Vec<T> contains a pointer to T (or to [T], rather - hence the comparison to Box<[T]>).
The indirection is key, and where the dyn "bubbles up to".
It has nothing specific to do with Vec<T> being generic over T.

You don't need reflection for modifications, at least not all of them (specifically, adding new elements is a problem, but looking at/removing existing ones is fine), and there's nothing special about Vec - if you need reflection, you end up using Any - there's no other mechanism to do so.

Note that Vec<Vec<dyn Entity + 'arena>> is perfectly fine, but then Entity can't inherit Any.

@burdges commented Jun 6, 2018

Yes, I suppose the vtable "handles the reflection" for removing elements. I suppose dyn<T: Trait> HashMap<T,..> has this indirection through Trait : Borrow<K> probably, which sounds quite restrictive.

Anyways dyn bubbling up represents vtable expansion, so one might even want say dyn<T: AddAssign<&T>> (Vec<T>, HashMap<K,T>) to express that AddAssign is object safe here, although not normally, and you can use it between elements in these two data structures.

@Centril Centril added A-typesystem Type system related proposals & ideas A-dst Proposals re. DSTs labels Nov 23, 2018