[RyuJIT] lack of escape analysis makes high GC overhead in SoA SIMD programs #10760
Labels: area-CodeGen-coreclr, enhancement, optimization
According to the VTune characterization dotnet/coreclr#18839 (comment), SoA SIMD programs have higher GC overhead than AoS and scalar programs because of temp object allocation.
SoA SIMD programs use `VectorPacket` as the primitive data type (note that here `VectorPacket` is a reference type, a `class`), and each `VectorPacket` operation is immutable: it returns a new `VectorPacket` as the result.

These semantics cause many temporary object allocations. For example, there are two `VectorPacket` operations in the code segment below:

These two lines will be compiled by RyuJIT to:

However, the two commented blocks are unnecessary, and the ideal codegen could be:
So introducing escape analysis (https://github.com/dotnet/coreclr/issues/1784) and unwrapping the local `VectorPacket` objects would significantly reduce the GC overhead of SIMD programs.

Additionally, the current struct promotion does not work with `VectorPacket` either, so changing `VectorPacket` from a `class` to a `struct` would generate many memory copies and perform worse.

category:cq
theme:vector-codegen
skill-level:expert
cost:large
impact:medium
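
To make the allocation pattern described above concrete, here is a minimal sketch. It is not the code from the issue. The `VectorPacket` below is a hypothetical stand-in with assumed fields `Xs`, `Ys`, `Zs`, built on `System.Numerics.Vector<float>`, and the `Demo`, `MulAdd` and `MulAddUnwrapped` names are invented for illustration. `MulAdd` performs two immutable operations and so creates a temporary that never leaves the method. `MulAddUnwrapped` is a hand-written approximation of what unwrapping the local object could look like.

```csharp
using System;
using System.Numerics;

// Hypothetical stand-in for the VectorPacket type discussed in the issue:
// a reference type holding one SIMD vector per component (SoA layout).
public sealed class VectorPacket
{
    public readonly Vector<float> Xs, Ys, Zs;

    public VectorPacket(Vector<float> xs, Vector<float> ys, Vector<float> zs)
    {
        Xs = xs; Ys = ys; Zs = zs;
    }

    // Each operator returns a fresh object; the operands are never mutated.
    public static VectorPacket operator +(VectorPacket a, VectorPacket b)
        => new VectorPacket(a.Xs + b.Xs, a.Ys + b.Ys, a.Zs + b.Zs);

    public static VectorPacket operator *(VectorPacket a, VectorPacket b)
        => new VectorPacket(a.Xs * b.Xs, a.Ys * b.Ys, a.Zs * b.Zs);
}

public static class Demo
{
    // Two VectorPacket operations. The product `t` is a temporary that does
    // not escape this method, yet without escape analysis it lives on the heap.
    public static VectorPacket MulAdd(VectorPacket a, VectorPacket b, VectorPacket c)
    {
        VectorPacket t = a * b;   // temporary object
        return t + c;             // result object
    }

    // Hand-written approximation of unwrapping the temporary: the intermediate
    // product stays in vector values and only the result is allocated.
    public static VectorPacket MulAddUnwrapped(VectorPacket a, VectorPacket b, VectorPacket c)
        => new VectorPacket(a.Xs * b.Xs + c.Xs, a.Ys * b.Ys + c.Ys, a.Zs * b.Zs + c.Zs);

    public static void Main()
    {
        var a = new VectorPacket(new Vector<float>(1f), new Vector<float>(2f), new Vector<float>(3f));
        var b = new VectorPacket(new Vector<float>(2f), new Vector<float>(2f), new Vector<float>(2f));
        var c = new VectorPacket(new Vector<float>(1f), new Vector<float>(1f), new Vector<float>(1f));

        // Measure bytes allocated on this thread around each variant.
        long m0 = GC.GetAllocatedBytesForCurrentThread();
        VectorPacket r1 = MulAdd(a, b, c);
        long m1 = GC.GetAllocatedBytesForCurrentThread();
        VectorPacket r2 = MulAddUnwrapped(a, b, c);
        long m2 = GC.GetAllocatedBytesForCurrentThread();

        Console.WriteLine($"MulAdd:          Xs[0] = {r1.Xs[0]}, allocated = {m1 - m0} bytes");
        Console.WriteLine($"MulAddUnwrapped: Xs[0] = {r2.Xs[0]}, allocated = {m2 - m1} bytes");
    }
}
```

Both methods compute the same values. The byte counts depend on the runtime version, the vector width and the JIT's optimizations, so treat them as something to measure rather than a fixed result.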