change handling of subtyping to be less fragile after analysis #112651

lcnr · 2023-06-15T09:43:35Z

edit: no longer up-to date. the current plan is to add the explicitly add the source type to MIR casts instead

We should add a MIR pass after analysis (and before any optimizations) which makes subtyping explicit in the MIR. This is a robust way to fix #107205, preventing similar issues in the future. I don't quite know how to do that, probably by adding an explicit ProjectionElem::Cast (probably reusing OpaqueCast for that).

The text was updated successfully, but these errors were encountered:

lcnr · 2023-06-19T07:12:49Z

When implementing this, also handle all the FIXME(#112651) added in #112307

ouz-a · 2023-08-15T13:55:24Z

@rustbot claim

Make subtyping explicit in MIR This adds new mir-opt that pushes new `ProjectionElem` called `ProjectionElem::Subtype(T)` to `Rvalue` of a subtyped assignment so we can unsoundness issues like rust-lang#107205 Addresses rust-lang#112651 r? `@lcnr`

Make subtyping explicit in MIR This adds new mir-opt that pushes new `ProjectionElem` called `ProjectionElem::Subtype(T)` to `Rvalue` of a subtyped assignment so we can unsoundness issues like rust-lang/rust#107205 Addresses rust-lang/rust#112651 r? `@lcnr`

goffrie · 2023-11-26T15:52:37Z

Should this be closed now that #115025 has been merged?

edit: I guess there are still some FIXMEs referencing this issue

lcnr · 2023-11-30T10:31:23Z

we encountered an issue with function pointers:

fn main() {
    let x: fn(fn(&'static ())) = some_function;
    let y: fn(for<'a> fn(&'a ())) = x;
    y(|&()| ());
}

here we can make subtyping inside of main explicit, resulting in something like the following

fn main() {
    let x: fn(fn(&'static ())) = some_function;
    let y: fn(for<'a> fn(&'a ())) = x as fn(fn(&'static ()));
    y(|&()| ());
}

however, when calling y, it is still a function pointer to some function expecting fn(&'static ()). This means that the calling y results in implicit subtyping. To handle this the let y = x assignment would also have to wrap the value of x to explicitly subtype. Modifying values in this way feels very difficult implement. Especially given that the function pointer could be nested in x, e.g. in some union or whatever.

As this means that subtyping has to still be implicitly handled somewhere, I believe that it's safer to always implicitly handle this. I believe that it's otherwise easier to accidentally rely on subtyping being explicit, resulting in bugs which are even harder to diagnose.

RalfJung · 2023-11-30T13:21:59Z

however, when calling y, it is still a function pointer to some function expecting fn(&'static ()). This means that the calling y results in implicit subtyping.

What is the issue with that? That subtyping only arises dynamically; as far as the type system is concerned, we don't know what y points to so the fact that the underlying value has type fn(fn(&'static ())) should not be an issue?

lcnr · 2023-11-30T13:53:38Z

yes, this is totally fine and even expected. However, we wanted to make subtyping explicit to prevent bugs like #107205 going forward:

While subtyping does not change the layout of a type, it does change trait selection. This broke stuff in two ways:

Optimizations, e.g. const prop, but also inlining, can incorrectly change

let x: subtype = whatever;
let y: supertype = x;
y.requires_trait_selection(); // e.g. by casting it to a trait object

to

let x: subtype;
x.requires_trait_selection();

We also currently use typed values both in the MIR, e.g. https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/consts/enum.Const.html#variant.Val and during codegen https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/operand/struct.OperandRef.html (this uses TyAndLayout). If we propagate that value to a place which has a super type but don't update its type, using the type specified as part of the value is wrong.

That subtyping only arises dynamically; as far as the type system is concerned, we don't know what y points to so the fact that the underlying value has type fn(fn(&'static ())) should not be an issue?

It can be known by optimizations/ctfe, which can be unsound if they accidentally use the type of the value passed into the function to do trait selection.

I think ideally we would not have values which are typed, especially not in codegen, where the actual unsoundness happened. However, it was very difficult to change OperandRef to only use a Layout so I did not pursue that further.

RalfJung · 2023-11-30T15:52:16Z

While subtyping does not change the layout of a type, it does change trait selection.

Uh, okay, that sounds bad.

It can be known by optimizations/ctfe, which can be unsound if they accidentally use the type of the value passed into the function to do trait selection.

CTFE considers function argument passing to be a transmute. The caller and callee might have completely different but ABI-compatible types. Many things would go wrong loudly if we used the caller type ever inside the callee.

Also, in CTFE values are only ephemerally typed, during the execution of a single machine step (ImmTy/OpTy). The values that persist between machine steps are stored in Immediate or in memory, and are hence not typed. This matches the (envisioned) operational semantics where the untyped raw representation gets turned into a structured and typed Value when the AM needs to act on it (e.g. to add two integers together, or to dereference a pointer), and then back into an untyped raw representation when it is e.g. stored inside some place.

Likewise, ConstValue is not typed. Const is a ConstValue that appears in MIR, it needs the type because every operand in MIR needs to know its type.

The only optimization where type information from the caller could leak to the callee is inlining, I think?

lcnr · 2023-12-01T10:26:02Z

CTFE considers function argument passing to be a transmute. The caller and callee might have completely different but ABI-compatible types.

👍 that's good to know, so for ctfe method calls are not an issue. Happy that CTFE has the imo right way of dealing with values and their types, codegen currently does not '^^

The only optimization where type information from the caller could leak to the callee is inlining, I think?

yeah, inlining should be the only one, though if we actually make the "transmute" of the arguments explicit when inlining it could be alright? though ofc now you have to remember to always explicitly transmute when looking into nested functions, which is also easy to mess up 😅 alternatively we could try to change any op which relies on type info to store the type info by itself, without relying on the type of the affected local?

RalfJung · 2023-12-01T10:56:35Z

alternatively we could try to change any op which relies on type info to store the type info by itself, without relying on the type of the affected local?

yeah that'd be like wasm a bit, having each operation carry the types it needs and basically making locals untyped. Not sure if that would be a good fit for MIR though, it seems highly redundant.

As you say, codegen just uses a bad structure here. Though you also had to fix some optimizations and I am less sure what you be done structurally to help them avoid this pitfall...

But overall if we say that inlining is a transmutation site then the explicit projections you had planned should work, right? And inlining has to be a transmutation site in general when function pointers are involved because we allow some type mismatches as long as things are ABI-compatible.

though ofc now you have to remember to always explicitly transmute when looking into nested functions, which is also easy to mess up

What do you mean? We don't "look into" nested functions, do we? We just paste the MIR of the callee into the caller. And then if this is a call through a function pointer we need to add transmute. We already need to do that, the caller might use type NonZeroU32 and the callee u32 and obviously things explode if we don't insert some explicit transmute.

When the call is a direct call of a fn item, then I don't think there is a possible issue here? Currently we only inline direct calls AFAIK, so the current inliner should be fine.

Overall I don't yet see why your plan of making subtyping explicit should be problematic.

lcnr · 2023-12-04T10:23:19Z

hmm, I am undecided on what to do again 😅 🤔

Though you also had to fix some optimizations and I am less sure what you be done structurally to help them avoid this pitfall...

I think to avoid this pitfall in optimizations we either need subtyping to be explicit or move the type of the op into the op itself 🤔

Not sure if that would be a good fit for MIR though, it seems highly redundant.

Quickly looking over the MIR, the typed operations are Rvalue::Cast with CastKind::Unsize and TerminatorKind::Call (the type of func is used to fetch the right instance).

But overall if we say that inlining is a transmutation site then the explicit projections you had planned should work, right?

I think so

The possible solutions here are:

explicitly handle subtyping after borrowck:

requires inlining of function pointers to explicitly transmute the args if types differ (that happens anyways)
allows post borrowck users to not care about subtyping
can break if a new operation is added to the mir which is a subtyping location

change ops to not rely on the type of locals:

adds a lot of redundant information (each Call terminator now stores an additional type)
allows post borrowck users to not care about subtyping, subtyping is handled the same in all mir stages
can break if a new operation is added which relies on the type of a local

RalfJung · 2023-12-04T11:55:38Z

Quickly looking over the MIR, the typed operations are Rvalue::Cast with CastKind::Unsize and TerminatorKind::Call (the type of func is used to fetch the right instance).

Now I am confused. I thought we'd need types basically everywhere a local shows up. Why only these two, what is special about them? The type of a local is relevant in a lot more places after all. That doesn't sound so bad then.

requires inlining of function pointers to explicitly transmute the args if types differ (that happens anyways)

Right, inlining of function pointers without transmute is already wrong, so this changes nothing here.

I wonder what other MIR consumers think. Cc @rust-lang/wg-mir-opt

lcnr · 2023-12-04T12:12:05Z

Now I am confused. I thought we'd need types basically everywhere a local shows up. Why only these two, what is special about them? The type of a local is relevant in a lot more places after all. That doesn't sound so bad then.

subtyping can't change layout, it's only when interacting with the trait system that it makes a difference. Most MIR ops don't call the trait solver, so subtyping doesn't impact them

RalfJung · 2023-12-04T14:37:12Z

Okay, so those two operations are the only ones that involve traits?

Cast I guess needs traits for dyn unsizing/upcasting? Anything else? Cast already carries the output type but I guess for this it also needs the input type.
Call needs traits to figure out the method to be called. But when we do static dispatch, we anyway have an Operand::Constant which carries its type. A local can only be involved when there is a function pointer. And function pointers do not need trait dispatch. Is that right? I guess not quite, since I can have a local variable with function item type and that could be a trait item and that could involve locals and then the type of the local would be relevant again... unfortunate.

"Having more subtyping locations" seems more likely than "having more things depend on traits". And generally it seems preferable to have analysis MIR and optimization/runtime MIR be as close to each other as possible. So I guess I am now also leaning slightly (very slightly) towards implicit subtyping with explicit types where it matters.

BoxyUwU · 2024-11-21T01:25:25Z

The only time that subtyping can change trait solver results is higher ranked subtyping (e.g. for<'a> fn(&'a u32) -> fn(&'static u32)), does this imply that if we started using coercions for such operations (breaking change :3) this issue would become moot as eliding a local and using a supertype would not be observable?

lcnr · 2024-11-21T10:32:17Z

The only time that subtyping can change trait solver results is higher ranked subtyping (e.g. for<'a> fn(&'a u32) -> fn(&'static u32)), does this imply that if we started using coercions for such operations (breaking change :3) this issue would become moot as eliding a local and using a supertype would not be observable?

jup ✨

lcnr added C-cleanup Category: PRs that clean code up or issues documenting cleanup. E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. labels Jun 15, 2023

This was referenced Jun 15, 2023

codegen generates vtables for a variable's supertype when unsizing sometimes #107205

Closed

mir opt + codegen: handle subtyping #112307

Merged

lcnr added A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html A-mir-opt Area: MIR optimizations A-variance Area: Variance (https://doc.rust-lang.org/nomicon/subtyping.html) labels Jun 19, 2023

cjgillot mentioned this issue Jul 1, 2023

Implement a global value numbering MIR optimization #109597

Merged

rustbot assigned ouz-a Aug 15, 2023

ouz-a mentioned this issue Aug 20, 2023

Make subtyping explicit in MIR #115025

Merged

ouz-a mentioned this issue Nov 30, 2023

Remove ProjectionElem::Subtype #118477

Closed

lcnr mentioned this issue Nov 20, 2024

remove Subtype projections #133258

Open

lcnr changed the title ~~make subtyping explicit post analysis~~ change handling of subtyping to be less fragile after analysis Nov 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

change handling of subtyping to be less fragile after analysis #112651

change handling of subtyping to be less fragile after analysis #112651

lcnr commented Jun 15, 2023 •

edited

Loading

lcnr commented Jun 19, 2023

ouz-a commented Aug 15, 2023

goffrie commented Nov 26, 2023 •

edited

Loading

lcnr commented Nov 30, 2023 •

edited

Loading

RalfJung commented Nov 30, 2023

lcnr commented Nov 30, 2023 •

edited

Loading

RalfJung commented Nov 30, 2023 •

edited

Loading

lcnr commented Dec 1, 2023 •

edited

Loading

RalfJung commented Dec 1, 2023 •

edited

Loading

lcnr commented Dec 4, 2023

RalfJung commented Dec 4, 2023

lcnr commented Dec 4, 2023 •

edited

Loading

RalfJung commented Dec 4, 2023

BoxyUwU commented Nov 21, 2024

lcnr commented Nov 21, 2024

change handling of subtyping to be less fragile after analysis #112651

change handling of subtyping to be less fragile after analysis #112651

Comments

lcnr commented Jun 15, 2023 • edited Loading

lcnr commented Jun 19, 2023

ouz-a commented Aug 15, 2023

goffrie commented Nov 26, 2023 • edited Loading

lcnr commented Nov 30, 2023 • edited Loading

RalfJung commented Nov 30, 2023

lcnr commented Nov 30, 2023 • edited Loading

RalfJung commented Nov 30, 2023 • edited Loading

lcnr commented Dec 1, 2023 • edited Loading

RalfJung commented Dec 1, 2023 • edited Loading

lcnr commented Dec 4, 2023

RalfJung commented Dec 4, 2023

lcnr commented Dec 4, 2023 • edited Loading

RalfJung commented Dec 4, 2023

BoxyUwU commented Nov 21, 2024

lcnr commented Nov 21, 2024

lcnr commented Jun 15, 2023 •

edited

Loading

goffrie commented Nov 26, 2023 •

edited

Loading

lcnr commented Nov 30, 2023 •

edited

Loading

lcnr commented Nov 30, 2023 •

edited

Loading

RalfJung commented Nov 30, 2023 •

edited

Loading

lcnr commented Dec 1, 2023 •

edited

Loading

RalfJung commented Dec 1, 2023 •

edited

Loading

lcnr commented Dec 4, 2023 •

edited

Loading