JIT: inefficient codegen for calls returning 16-byte structs on Linux x64 / arm64 #8571
Comments
This also applies to ARM64. When profiling this method on Windows ARM64, we generate the following. The ctor call is inlined.
This issue seems to pertain to an older version of binarytrees (perhaps https://github.com/dotnet/coreclr/blob/414ab4ee1a6f31ae63f166de2b9d4d0af640574f/tests/src/JIT/Performance/CodeQuality/BenchmarksGame/binarytrees/binarytrees.csharp.cs). For that version, after #36862, we are largely keeping the return values in registers.
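For readers without the benchmark at hand, here is a minimal sketch of the pattern being discussed, written in C rather than the benchmark's C#. It is only loosely modelled on binarytrees: the names `TreeNode`, `Next`, `bottom_up_tree`, `count_nodes` and `free_tree` are hypothetical, and the struct layout (one pointer plus one int, 16 bytes on a 64-bit target) is an assumption. Under the System V x64 and AArch64 calling conventions, a struct like this is returned in a register pair (RAX:RDX or x0:x1), so a recursive caller can in principle consume the result without a round trip through a stack temp.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the benchmark's node type: a pointer plus
   an int, which occupies 16 bytes on a 64-bit target. */
typedef struct Next Next;
typedef struct {
    Next *next;   /* NULL for a leaf */
    int   item;
} TreeNode;

struct Next {
    TreeNode left, right;
};

/* Returns a small struct by value; every recursive call site receives
   such a struct back from the callee. */
static TreeNode bottom_up_tree(int item, int depth) {
    assert(depth >= 0);
    TreeNode node = { NULL, item };
    if (depth > 0) {
        node.next = malloc(sizeof(Next));
        if (node.next == NULL) abort();
        node.next->left  = bottom_up_tree(2 * item - 1, depth - 1);
        node.next->right = bottom_up_tree(2 * item, depth - 1);
    }
    return node;
}

/* Counts the nodes in the tree, leaves included. */
static int count_nodes(TreeNode t) {
    if (t.next == NULL) return 1;
    return 1 + count_nodes(t.next->left) + count_nodes(t.next->right);
}

/* Releases every heap-allocated child pair. */
static void free_tree(TreeNode t) {
    if (t.next == NULL) return;
    free_tree(t.next->left);
    free_tree(t.next->right);
    free(t.next);
}
```

Compiling this with an optimizing C compiler for Linux x64 or arm64 and inspecting the code around the recursive calls gives a rough baseline to compare JIT output against. It says nothing about what the JIT itself emits.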
From the binarytrees performance benchmark, initial call to bottomUpTree from Bench (other calls to this method have similar issues):
bottomUpTree has similar issues at its recursive call sites, and also does some redundant zeroing of temp structs that were zeroed in the prolog:
Note this latter bit of code could simply be something like
category:cq
theme:structs
skill-level:expert
cost:large