This repository was archived by the owner on Apr 25, 2025. It is now read-only.

Polymorphic devirtualization and funcref not being a subtype of eqref? #239

Closed
@kripken

As mentioned in the last GC CG meeting, devirtualization helps quite a lot on j2cl, something like a 41% speedup. That covers the case where only a single call target is possible: we load from a vtable, binaryen infers that the vtable must contain a particular function reference, and so we replace the load with that reference, which then allows the call to become direct and even inlined.

Looking into the polymorphic case, that is, where there is a small number of possible function references but more than one, I was hoping to do something like this:

(struct.get vtable)

=>

(select
  (first possible function)
  (second possible function)
  (ref.eq (struct.get vtable) (first possible function))
)

Later optimizations can then turn a call_ref of a select of two function references into an if over two possible direct calls, etc. However, the condition of the select hits a problem: funcref is not a subtype of eqref, so function references cannot be compared for equality and the ref.eq does not validate.
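To make the intended end result concrete, here is a minimal sketch of the guarded dispatch in Python rather than Wasm. It is only a model: the functions `speak_cat` and `speak_dog` and the `"speak"` slot are hypothetical, and Python's `is` stands in for the `ref.eq` that funcref currently does not allow.

```python
def speak_cat():
    return "meow"


def speak_dog():
    return "woof"


def call_indirect(vtable):
    # Models call_ref of (struct.get vtable): the target is unknown
    # at the call site, so the call cannot be inlined.
    return vtable["speak"]()


def call_devirtualized(vtable):
    # Models the rewritten form, assuming a whole-program analysis has
    # shown that the slot can only hold speak_cat or speak_dog.
    if vtable["speak"] is speak_cat:  # stands in for ref.eq on funcrefs
        return speak_cat()            # direct call, can be inlined
    return speak_dog()                # the only other possible target


cat_vtable = {"speak": speak_cat}
dog_vtable = {"speak": speak_dog}
print(call_devirtualized(cat_vtable), call_devirtualized(dog_vtable))
```

Both forms return the same result for every vtable the analysis considers possible; the rewritten form just trades an indirect call for one identity comparison and two direct calls.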

I couldn't find a detailed discussion of that, but IIRC the motivation was to allow VMs to optimize things like folding two identical functions into a single one, etc. That sounds reasonable, but the devirtualization case suggests there is an optimization tradeoff here whose right answer is not obvious?

Gathering some data: if I disable validation in binaryen, then allowing 2 functions instead of 1 leads to 14K more places where we can turn a get from a vtable into a constant (well, a select over constants). Allowing 3 raises that to 17K, and 4 to 19K. (At some point this becomes less useful, though, of course.) For comparison, the total number of call_refs is 42K, so even with 2 functions we are talking about potentially optimizing away a third of indirect call sites, which sounds like it could be very significant.
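The arithmetic behind "a third" can be checked with a few lines of Python. This is a rough sketch that uses only the rounded "K" figures quoted above and, like the comparison above, assumes each optimized vtable get corresponds to one call_ref; the variable names are mine.

```python
TOTAL_CALL_REFS = 42_000

# Additional vtable gets that become constants, keyed by the maximum
# number of possible targets allowed (rounded figures from the text).
newly_constant = {2: 14_000, 3: 17_000, 4: 19_000}


def share(max_targets):
    """Fraction of all call_refs potentially covered at this limit."""
    return newly_constant[max_targets] / TOTAL_CALL_REFS


for n in sorted(newly_constant):
    print(f"up to {n} targets: {share(n):.1%} of call_refs")

# Gain from raising the limit by one: diminishing returns.
gain_3 = newly_constant[3] - newly_constant[2]
gain_4 = newly_constant[4] - newly_constant[3]
print(gain_3, gain_4)
```

With these inputs the shares come out at roughly 33%, 40%, and 45%, and each extra allowed target adds less than the previous one.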

Alternatives:

  • Rewrite the types, replacing the funcref with an i32 index that we can select on. The problem is that we'd need to replace the vtable field in all relevant subtypes and supertypes, which may not be practical in general. Adding an additional field is another option, but would add memory and runtime overhead.
  • Use ref.test on the vtable. That might work if the different functions come from different types, which I believe is the general case (but I'd need to check). How fast is ref.test expected to be?
  • Perform such devirtualization in the VM and not the toolchain. Doing it statically is probably not reasonable (as a large LTO-style optimization), but using runtime profiling data it might be.
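As an illustration of the first alternative, here is a minimal Python model of replacing the function reference with an integer index. Everything in it is hypothetical (the `VTable` class, the `describe_*` functions, and the index assignment); it only shows that an integer comparison can play the role that `ref.eq` cannot.

```python
def describe_int(value):
    return f"int:{value}"


def describe_str(value):
    return f"str:{value}"


# Hypothetical indices the toolchain would assign to the two targets.
DESCRIBE_INT, DESCRIBE_STR = 0, 1


class VTable:
    def __init__(self, describe_index):
        # An integer index replaces the funcref field in the vtable.
        self.describe_index = describe_index


def call_describe(vtable, value):
    # Models selecting on an i32: integer equality is always allowed.
    if vtable.describe_index == DESCRIBE_INT:
        return describe_int(value)
    return describe_str(value)


print(call_describe(VTable(DESCRIBE_INT), 7))
print(call_describe(VTable(DESCRIBE_STR), "x"))
```

The model hides the cost noted in the first bullet: every subtype and supertype that shares the vtable field would need the same change.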
