Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace current quasi-subtyping relationship for object types with coercions #18737

Closed
nikomatsakis opened this issue Nov 7, 2014 · 17 comments
Closed
Assignees
Milestone

Comments

@nikomatsakis
Copy link
Contributor

Currently the type checker considers A+Send to be a subtype of A, but not for the purposes of inference. Part of the DST / coercion plan was to generalize this into a coercion relationship so that it can accommodate arbitrary traits. We should do this before 1.0 as it will cause various minor bits of breakage. Among other things, method resolution will have to be adapted to accommodate this change.

cc @nick29581

@nikomatsakis
Copy link
Contributor Author

Nominating for P-backcompat-lang, 1.0.

@pnkfelix
Copy link
Member

P-backcompat-lang, 1.0, but its possible that this is lesser priority than other P-backcompat-lang issues (i.e. we may be able to hold off on fixing this until the beta period).

@pnkfelix pnkfelix added this to the 1.0 milestone Nov 13, 2014
@nikomatsakis
Copy link
Contributor Author

If push came to shove, we could live with this behavior.

@aturon
Copy link
Member

aturon commented Jan 8, 2015

Nominating. It sounds like this should not be on the -beta milestone, but we should discuss.

@pnkfelix
Copy link
Member

pnkfelix commented Jan 8, 2015

Assigning 1.0-beta milestone based on potential for breaking lots of code according to nrc.

@pnkfelix pnkfelix modified the milestones: 1.0 beta, 1.0 Jan 8, 2015
@alexcrichton
Copy link
Member

An example that this may fix (or maybe a good smoke test)

pub trait Any: 'static {}
pub struct Box2<T: ?Sized>(Box<T>);

impl Box2<Any> {
    pub fn foo(self) {}
}

fn t2(t: Box2<Any + Send>) {
    t.foo(); // type `Box2<Any + Send>` does not implement any method in scope named `foo`
} 

This is largely in relation to the BoxAny trait we currently have.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Jan 16, 2015
This commit aims to stabilize the `TypeId` abstraction by moving it out of the
`intrinsics` module into the `any` module of the standard library. Specifically,

* `TypeId` is now defined at `std::any::TypeId`
* `TypeId::hash` has been removed in favor of an implementation of `Hash`.

This commit also performs a final pass over the `any` module, confirming the
following:

* `Any::get_type_id` remains unstable as *usage* of the `Any` trait will likely
  never require this, and the `Any` trait does not need to be implemented for
  any other types. As a result, this implementation detail can remain unstable
  until associated statics are implemented.
* `Any::downcast_ref` is now stable
* `Any::downcast_mut` is now stable
* `BoxAny` remains unstable. While a direct impl on `Box<Any>` is allowed today
  it does not allow downcasting of trait objects like `Box<Any + Send>` (those
  returned from `Thread::join`). This is covered by rust-lang#18737.
* `BoxAny::downcast` is now stable.
bors added a commit that referenced this issue Jan 17, 2015
This commit aims to stabilize the `TypeId` abstraction by moving it out of the
`intrinsics` module into the `any` module of the standard library. Specifically,

* `TypeId` is now defined at `std::any::TypeId`
* `TypeId::hash` has been removed in favor of an implementation of `Hash`.

This commit also performs a final pass over the `any` module, confirming the
following:

* `Any::get_type_id` remains unstable as *usage* of the `Any` trait will likely
  never require this, and the `Any` trait does not need to be implemented for
  any other types. As a result, this implementation detail can remain unstable
  until associated statics are implemented.
* `Any::downcast_ref` is now stable
* `Any::downcast_mut` is now stable
* `BoxAny` remains unstable. While a direct impl on `Box<Any>` is allowed today
  it does not allow downcasting of trait objects like `Box<Any + Send>` (those
  returned from `Thread::join`). This is covered by #18737.
* `BoxAny::downcast` is now stable.
alexcrichton added a commit to alexcrichton/rust that referenced this issue Jan 19, 2015
This commit aims to stabilize the `TypeId` abstraction by moving it out of the
`intrinsics` module into the `any` module of the standard library. Specifically,

* `TypeId` is now defined at `std::any::TypeId`
* `TypeId::hash` has been removed in favor of an implementation of `Hash`.

This commit also performs a final pass over the `any` module, confirming the
following:

* `Any::get_type_id` remains unstable as *usage* of the `Any` trait will likely
  never require this, and the `Any` trait does not need to be implemented for
  any other types. As a result, this implementation detail can remain unstable
  until associated statics are implemented.
* `Any::downcast_ref` is now stable
* `Any::downcast_mut` is now stable
* `BoxAny` remains unstable. While a direct impl on `Box<Any>` is allowed today
  it does not allow downcasting of trait objects like `Box<Any + Send>` (those
  returned from `Thread::join`). This is covered by rust-lang#18737.
* `BoxAny::downcast` is now stable.
bors added a commit that referenced this issue Jan 19, 2015
This commit aims to stabilize the `TypeId` abstraction by moving it out of the
`intrinsics` module into the `any` module of the standard library. Specifically,

* `TypeId` is now defined at `std::any::TypeId`
* `TypeId::hash` has been removed in favor of an implementation of `Hash`.

This commit also performs a final pass over the `any` module, confirming the
following:

* `Any::get_type_id` remains unstable as *usage* of the `Any` trait will likely
  never require this, and the `Any` trait does not need to be implemented for
  any other types. As a result, this implementation detail can remain unstable
  until associated statics are implemented.
* `Any::downcast_ref` is now stable
* `Any::downcast_mut` is now stable
* `BoxAny` remains unstable. While a direct impl on `Box<Any>` is allowed today
  it does not allow downcasting of trait objects like `Box<Any + Send>` (those
  returned from `Thread::join`). This is covered by #18737.
* `BoxAny::downcast` is now stable.
@nikomatsakis
Copy link
Contributor Author

OK, so I have a fix for this issue, but there is a bit of a complication. In particular, removing the subtyping relation means that methods implemented on (say) &Any are not available when the receiver is Box<Any+Send>. The reason that this worked before is because, when we autoderef Box<Any+Send> to &(Any+Send), we found that &(Any+Send) was a subtype of &Any, but that is no longer true. This was expected and I figured I would fix it as part of this patch (it was, indeed, the anticipated "hard part"). However, I am now having second thoughts about fixing this right now, and instead am contemplating landing a patch that breaks it a bit further. Read on for the explanation. :) Or, skip to the end to see what I propose to do.

Here are people I think might have an opinion on this that are not already familiar with the details: @reem @alexcrichton

Detailed explanation

So, after this patch, &(Any+Send) can indeed be coerced to &Any through a new kind of unsize coercion-- an upcast. Right now, upcasts are no-ops at runtime, but this will not always be true, because we'd eventually like to extend upcasts to cover the cases where vtables have to be adjusted. This is indeed one of the primary reasons to remove this subtyping relation and replace it with coercion.

Method dispatch already makes some effort to accommodate unsizing. The idea is that it begins by creating a series of autoderef steps based on the type of the receiver (e.g., &Box<[T;N]> => Box<[T;N]> => [T;N]). If the final step is a fixed-length vector like [T;N], then we add one additional step to [T]. We will then walk those steps in order, searching for a method. At each step, we first try the type U itself, then we try &U and &mut U ("autoref"). So that means that &Box<[T;N]> could match to an expected receiver type of &[T], for example, by autoderefing 2 times to [T;N], unsizing once to [T], and then autorefing to &[T]. However, this process would never match something like Box<[T]> -- because at that step, the unsizing hasn't happened yet, and so we only search for Box<[T;N]>, &Box[T;N], and &mut Box<[T;N]>. This is unfortunate -- and in particular, the current API for the Any type includes a method downcast that is implemented for Box<Any> but frequently invoked on Box<Any+Send>, and a method invocation like that wouldn't fit into this model.

Besides the limitation around matching Box<[T]> There is another problem with the existing unsizing -- something which makes it somewhat unapplicable to trait objects. In particular, while unsizing a fixed-length vector type like [T;N] has a deterministic result ([T]), upcasting a trait object does not. If you have a type like A+B+C+D (thinking forward to the future), it could be upcast to A+B+C, A+B+D, A+C+D, etc. And in fact there are even more combinations, since any one of those traits may have supertraits as well.

OK, I've laid the groundwork. Now let me tell you about two things. First, what I tried so far, and second, what I think we probably eventually want.

What I implemented so far is a simple hack. Basically I extended the unsize support that we have today ([T;N] => [T]) to trait objects. This relies on the current limitations that are in place for trait objects. In particular, trait objects today must have a primary trait and some number of builtin traits (Copy, Send, Sync, Sized). In practice, these are always Send and Sync. Which is unfortunate, because those are the two traits we want to remove from the compiler entirely. :) Anyway, I tried a hack where if the final type is an object type with builtin traits, we will drop all the builtin traits. Which means we will unsize Any+Send to Any. If we had Any+Send+Sync, we'd also unsize that to Any. This scheme is almost sufficient to type-check the existing Any API, except for the fact that downcast is implemented on Box<Any>. As I described above when talking about unsizing [T;N], even with this additional unsize step, we only wind up searching for &Any, and not Box<Any>. This hack is also not very future proof. It relies on the notion of a primary trait, which is going away, and on a distinction around builtin traits that is going away. Bad.

What I think we probably want to do is extend method dispatching beyond subtyping to also consider (at least some) implicit coercions. I'm not entirely sure how we want to do this, I see a lot of options. For example, we might consider all implicit coercions. We might try only fat pointer coercions. We might not even want to try all fat pointer coercions. There are lots of potential interactions and more than a few unknowns here. For example, will we have widening coercions in the future? How exactly will DST coercions work and what will they cover? Will people even care if these coercions are missing from method dispatch? (Oh, and also, these more general solutions are kind of a pain to implement right now, though feasible.)

Let me give you one example of a potentially bad interaction I was concerned about. Imagine that at each step, where we now consider subtyping, we also considered fat pointer coercion. One of the fat pointer coercions is that Box<T> can be coerced to Box<Trait> where T:Trait. Now, imagine that I have a receiver x of some type Box<Foo> where Foo:Bar, and I call x.bar() (here, bar() is an &self method defined in the Bar trait). This would normally autoderef by one step to Foo and then autoref to &Foo and we would statically dispatch. OK, now imagine I added an impl like impl Bar for Box<Bar>. This seems harmless enough. But we'll see that in fact it's a kind of footgun -- because now with our receiver x we could do a fat pointer coercion from Box<Foo> to Box<Bar> and dispatch that way.

This led me to two options:

  1. Don't consider the full set of fat-pointer coercions at each step, just upcasts and array unsizing (from [T;N] to [T]). But that's weird because now method dispatch uses a distinct set of coercions from other things.
  2. First do the full walk based on subtyping, as we do today, and then do a second walk based on coercions. This has the advantage of (a) being backwards compatible; (b) preferring tighter types whenever possible; and (c) not costing anything in compile time for programs that compile today (and for most method calls in the future too, I anticipate).

So option 2 seems good, but it also seems like an extension we can toy with in the future, and in particular we can experiment around the precise set of coercions we want to consider.

What I propose to do

Remove the subtyping relation and add upcast coercions (implemented).

Leave method dispatch as it is (implemented). This includes the [T;N] to [T] unsizing but no others. That could prove to be a slight asymmetry but removing it seems like it would break a lot of code. There is a slight risk in keeping this coercion, because if we add more general coercion support in the future, it could induce backwards incompatibilities around receiver types like Box<[T;N]>. In particular, if there were impls available for Box<[T]> and &[T], we would dispatch to the latter today, but in the future with a more general solution we'd probably prefer the former. I expect we'd be able to change this without getting complaints (in fact, it seems like a bug fix), but it's a risk.

This does affect the Any API. In particular, if you have a Any+Send object, you can't use method form. You have two options then: upcast, or use free fns / UFCS form. The latter works because you get upcast coercions in that cast. Today that means writing Any::is::<T>(obj) instead of obj.is::<T>() and <Box<Any>>::downcast::<T>(obj) instead of obj.downcast::<T>(). Arguably we could/should just change those to be functions in the any module (any::is::<T>(obj), any::downcast::<T>(obj)), since the method form isn't especially convenient.

This may affect other APIs outside of std, but I am not sure. This is partly why I cc'd @reem, because I know he does various bits of magic, though I don't think the things I've seen would have their public APIs affected by these changes.

@nikomatsakis
Copy link
Contributor Author

I also posted the above comment to discuss: http://internals.rust-lang.org/t/impact-of-fixing-subtyping-for-object-types-18737/1709

@nikomatsakis
Copy link
Contributor Author

Hmm, I encountered a bit of a footgun. It's kind of related to all of the method dispatch above actually (and suggests, in fact, that more general fat pointer coercion would make the Any API harder to use). In particular, what would you expect this code to do:

struct S { .. }
fn foo(x: Box<Any+Send>) -> bool {
    Any::is::<S>(&x)
}

Presuming that x is, in fact, a Box<S>, I would expect this to return true. But it returns false. The reason is that Any expects a &Any. It is supplied a &Box<Any+Send>. Rather than deref coercions kicking in, as I expected, we get a trait coercion where Box<Any+Send> is coerced to the trait object Any, and thus the type test fails because S != Box<Any+Send>. The code works fine if you write &*x, in particular. Note that if we did the full fat-pointer-based method dispatch I described above, the same phenomenon would arise.

This problem is somewhat specific to the Any API, because it is implemented basically for all T, but it suggests that free-fns can also be easily misused. It'd be nice to have an API form that is downright hard to misuse. I'm not sure what's the best fix here.

@alexcrichton
Copy link
Member

If the Any+Send case is generalized to Foo+X, does this mean that methods defined via impl Foo will not be available on Box<Foo+X> (regardless of X)?

Hmm, I encountered a bit of a footgun.

Hm that is... unfortunate! I do agree though that it is Any-specific, however. I recall running across this footgun in the past (both myself and via questions on IRC), so I wouldn't be too too worried about it. I think it's fine to move to free functions or require UFCS (perhaps favoring UFCS as the methods are #[stable] currently)

@arielb1
Copy link
Contributor

arielb1 commented Mar 12, 2015

@nikomatsakis

How would moving to UFCS solve it? Isn't the problem that with <&Any as Any>::is::<S>(&ex) rustc coerces ex to Any by an autoref (to &Box<Any>) followed by an unsize rather than by an autoderef to Box<Any> followed by an autoref?

@nikomatsakis
Copy link
Contributor Author

@alexcrichton

If the Any+Send case is generalized to Foo+X, does this mean that methods defined via impl Foo will not be available on Box<Foo+X> (regardless of X)?

Yes.

I think it's fine to move to free functions or require UFCS (perhaps favoring UFCS as the methods are #[stable] currently)

I think I favor free fns only because I find it somewhat more ergonomic. For example, you can do use any::is and then write is::<T>(x) vs Any::is(x). Similarly, <Box<Any>>::downcast() is fairly annoying.

Hm that is... unfortunate! I do agree though that it is Any-specific, however. I recall running across this footgun in the past (both myself and via questions on IRC), so I wouldn't be too too worried about it.

I don't see any clear sol'n -- but I did find forward porting some tests to be fairly annoying due to this. This was mostly a problem in various "torture tests" that seemed to have particularly confusing type casts going on though.

@nikomatsakis
Copy link
Contributor Author

@arielb1

How would moving to UFCS solve it? Isn't the problem that with <&Any as Any>::is::<S>(&ex) rustc coerces ex to Any by an autoref (to &Box<Any>) followed by an unsize rather than by an autoderef to Box<Any> followed by an autoref?

Sorry, I'm not sure what comment of mine you are referring to, but:

  1. UFCS solves the problem where I supply a &(Any+Send) and a &Any is expected. This is because we can't make an Any object from an Any+Send object except by upcasting (you can only make an object from a sized type).
  2. UFCS does not solve the footgun I mentioned where you supply an &Box<Any+Send> and an &Any is expected, because Box<Any+Send> can be "objectified" to Any.

@reem
Copy link
Contributor

reem commented Mar 16, 2015

@nikomatsakis sorry I didn't get to this sooner. Your approach sounds reasonable, but the footguns surrounding the use of Any and other traits which are implemented for most/all T make me less enthused about this. I can work around anything in a downstream library, but it would make me a lot more apprehensive about using unsafe-any, since the chance of something going horribly wrong is much higher.

@nikomatsakis
Copy link
Contributor Author

@reem yes, I was wondering about what's the best solution here. Definitely I found working with Any to be... precarious when I was forward-porting tests. One possibility is of course a custom lint to watch out for things like coercing a &Box<Any+Send> into a &Any as an argument of any::is and friends, but that won't help for unsafe_any (and feels...very tailored!).

Another possibility is that I could keep the simple hack that dropped all builtin bounds from method dispatch (i.e., step from Trait+Send to Trait). But that feels unfortunate because it is involving a lot of concepts I would like to go away. It's not a very general mechanism so I'd hate to be relying on it forever.

@nikomatsakis
Copy link
Contributor Author

Well, something else just occurred to me. One other simple fix here might be to implement the relevant methods for Any on Any+Send as well. Kind of a hack, but at least we'd only have to get the code that does the upcasting correct once, and this is a very common case due to the thread error propagation APIs.

@nikomatsakis
Copy link
Contributor Author

I implemented the approach of adding extra impls for Any. This causes all existing code in libstd to work and seems to address the footgun problem, while still leaving us room in the future to expand method dispatch to handle more such cases (this basically just builds in one particular upcast).

bors added a commit that referenced this issue Mar 17, 2015
This upcast coercion currently never requires vtable changes. It should be generalized. 

This is a [breaking-change] -- if you have an impl on an object type like `impl SomeTrait`, then this will no longer be applicable to object types like `SomeTrait+Send`. In the standard library, this primarily affected `Any`, and this PR adds impls for `Any+Send` as to keep the API the same in practice. An alternate workaround is to use UFCS form or standalone fns. For more details, see <#18737 (comment)>.

r? @nrc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants