Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supertrait coercion and multiple trait pointers #2157

Closed
parkovski opened this issue Sep 18, 2017 · 3 comments
Closed

Supertrait coercion and multiple trait pointers #2157

parkovski opened this issue Sep 18, 2017 · 3 comments
Labels
T-compiler Relevant to the compiler team, which will review and decide on the RFC.

Comments

@parkovski
Copy link

I think I may have a solution for this, but it'd be awesome if it could be improved on.

For standalone traits (no supertrait), nothing changes. For traits with supertraits, I'm adding two types of pointer offset to the vtable. The first, Oa is the offset to trait A's vtable. The second, Ωa is the offset from trait A's vtable pointer to its offset table, where the O values for all its supertraits are listed in order. Here's what it looks like with these example traits:

Trait bounds
------------
A             (requires only A)
B             (requires only B)
C: A          (requires C, A)
D: A + B      (requires D, A, B)
E: C + B      (requires E, C, A, B)
F: C + D      (requires F, C, A, D, B)
G: C + E + F  (requires G, C, A, E, B, F, D)

Vtable layout
* indicates the actual location of the vtable pointer.
Ω values are negative offsets.
-------------
A: *A

B: *B

C: Oa Ωc *C A
  |__|
   Ωc

D: Oa Ob Ωd *D A B
  |_____|
     Ωd

E: Oc Oa Ob Ωe *E Ωc C A B
  |  |__|  |
  |___Ωc___|  
      Ωe

F: Oc Oa Od Ob Oa Ob Ωf *F Ωc C A Ωd D B
  |  |__|     |_____|
  |___Ωc______|  Ωd
        Ωf

  * this is a naive implementation, I'll talk about condensing it in a minute.

G: Oc Oa Oe Ob Of Od Oc Oa Ob Oc Oa Od Ob Ωg *G Ωc C A Ωe E B Ωf F Ωd D
  |  |__|           |  |_____|___________|
  |___Ωc____________|_____Ωd_|    Ωf
          Ωg           Ωe

  * some unnecessary repetitions here too.

Ok, so now that the basics of this are out of the way, the only overlaps I showed so far were the trivial ones - where if you scan the existing set of offsets you find a subset in the same order as the one you're trying to add. We definitely want to rearrange the order of the bounds though to condense the offset tables as much as possible. In this case, we can reorder them and get this:

F: Oc Oa Ob Od Ωf *F Ωc C A Ωd D B
  |  |__|  |  |
  |  |Ωc___|  |
  |____ Ωd____|
        Ωf

G: Oc Oa Ob Od Oe Of Ωg *G Ωc C A Ωe E B Ωf F Ωd D
  |  |__|  |  |     |
  |  |Ωc___|  |     |
  |_____Ωd_|  |     |
  |___Ωe______|     |
  |_____Ωf__________|
          Ωg

I probably could have picked a better example, because there will be cases where repetition is unavoidable, and I imagine there might even be cases where the optimal solution for the whole program involves leaving a hole in the offset table for some trait that basically acts as a "wildcard" where combined bounds can fill in whatever they need to optimize the size of their offset tables. I don't have an algorithm for this yet.

So, how casting works should be fairly obvious, but consider a cast from D to B, given these criteria which this system satisfies:

  • We have a pointer to trait object D, where the vtable pointer points to D's method table (indicated by D in the example).
  • We know the order of the O values for D, whatever and wherever they are, will be the same for every instance of D.
  • We know that Ωd is located at (C syntax) ((size_t *)D - 1).
  • We know the vtable for B is at Ob + *Ob.
  • So the cast/coercion (d: Box<D>) as Box<B> becomes (D + *Ωd + 1 /* index of Ob for D */) + *(D + Ωd + 1) == (D + *(D - 1) + 1) + *(D + *(D - 1) + 1).

Two pointer dereferences, I know. At least they're close together in cache, and there are some optimizations I'll get to. So why not just list the offsets inline and only do one dereference? That's the first optimization - for a trait like C with only one supertrait, it's really overkill to list two separate offsets, so for that case we should just replace Ωc with Oa. And that may end up happening in a lot of cases, because while you may have a trait that requires several other traits, you probably only use one or two of them for virtual dispatch, and there's no sense generating offset tables when they're never going to be used. But the main reason why I picked this implementation is that it supports Box<A + B> really easily.

All you do is create a pseudo trait A + B that doesn't implement any of its own methods, but requires A + B (obviously), and add it as a requirement to everything that requires both those traits. Now they end up with an Ωa+b in their vtable that takes three dereferences to get a method out of it, so we want to do one of those with the cast. Since the offsets are listed in a known order, the vtable for a combination type like this is *Ωa+b == the first of Oa, Ob. Now calling a method on one of these types takes at most two adds and two dereferences, and hopefully LLVM will cache a lot of these accesses if we're say, calling mostly methods from A on that type.

Other potential optimizations include:

  • If the compiler knows what concrete type it most likely is, it can emit an Objective-C style "is this what I think it is" hot path and avoid any extra lookups at all most of the time.
  • If it works out that all implementations of A + B list their vtables in the same order with the same distance between them, then casts between them are trivial. The algorithm that tries to achieve this may not be quite as trivial.

Thoughts?

@glaebhoerl
Copy link
Contributor

I think it might be worth posting this on internals.rust-lang.org, which gets more attention from compiler devs.

@parkovski
Copy link
Author

Thanks, I'm going to clarify a couple things and then I'll post it there.

@Centril Centril added the T-compiler Relevant to the compiler team, which will review and decide on the RFC. label Dec 6, 2017
@Centril
Copy link
Contributor

Centril commented Oct 8, 2018

Closing in favor of #2035.

@Centril Centril closed this as completed Oct 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-compiler Relevant to the compiler team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests

3 participants