[NativeAOT] Link-time-optimize unmanaged portions of the runtime on Linux #86083
Comments
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
Issue Details
We currently compile unmanaged portions of the runtime without LTO or PGO because the runtime is placed in an .a file that gets linked using an unknown linker that exists on the user machine. LTO requires a linker that knows how to interpret the bitcode in non-ELF object files.
We can however apply LTO on .a files. See David's prototype at https://gist.github.com/davidwrighton/385035ffd24b88c39c2e7d5cf0274907.
How to use from David:
So the theory is that we could:
We'd need to measure if this is indeed profitable and worth the engineering costs. Success not guaranteed. Might be better to first just enable LTO locally and use a linker that can handle it E2E (i.e. turn on LTO and compile with ILC as usual, expecting the linker step to do the LTO) and get some measurements. GC perf would be the most interesting to measure, so do something that stresses the GC and measure with/without LTO.
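The with/without-LTO comparison suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes two already-built executables of the same GC-stressing app (one linked against an LTO-built runtime, one not) and uses median wall-clock time as the metric. The binary paths are hypothetical placeholders passed on the command line, and the helper names are made up for illustration.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run `cmd` `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the benchmark binary exits with a non-zero code.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def percent_change(baseline, candidate):
    """Relative change of `candidate` versus `baseline`, in percent.

    For durations, a negative value means the candidate was faster.
    """
    return (candidate - baseline) / baseline * 100.0


def compare(baseline_cmd, lto_cmd, runs=5):
    """Return (baseline median, LTO median, percent change) for two commands."""
    base = statistics.median(time_command(baseline_cmd, runs))
    lto = statistics.median(time_command(lto_cmd, runs))
    return base, lto, percent_change(base, lto)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        # Hypothetical usage: compare.py ./gcstress-nolto ./gcstress-lto
        base, lto, delta = compare([sys.argv[1]], [sys.argv[2]])
        print(f"baseline median: {base:.3f}s")
        print(f"LTO median:      {lto:.3f}s")
        print(f"change:          {delta:+.2f}%")
    else:
        print("usage: compare.py <baseline-binary> <lto-binary>")
```

Differences on the order of 1% are easily lost in run-to-run noise, so a real measurement would likely need more runs and a quiet machine than this sketch assumes.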
I did a local experiment where I simply passed [...]. For TodosApi, the improvement would be about 1% in RPS, plus a small improvement in latency. In theory, this could also be addressed with #83611.