Ensure loops optimize right at the LLVM level #26902
Comments
@pcwalton has said they're willing to mentor anyone who wants to take a crack at this.
Also check the benchmarks game benchmarks. I've heard things about LoopIdiomRecognize badness causing problems there.
A lot of the things we want to guarantee optimize right:
This is an old gist I have of some test things from when I was looking at solving this with TrustedLen (which is a bad idea IMO). Not all of these benches can be expected to optimize well (e.g. some aren't a memcpy/memset, but rather create (0..n) as a Vec).
I'll take a crack at this over the weekend. I'll start with Gankro's benches and add anything from the benchmarks game that isn't already there.
I've made a repo for minimally experimenting with Vec, RawVec, and some basic benches: https://github.com/Gankro/vec-perf-test Note that not all of these benches are expected to optimize well; in particular, skip-take and take-skip make [1, 2, 3, ..], which can't be turned into a memcpy or memset to my knowledge.
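For readers without the repo at hand, here is a minimal sketch of the loop shapes being discussed. The function names are hypothetical and are not taken from vec-perf-test; the assumption is only that the benches resemble these patterns. The first two are the kind of loop one would hope LLVM turns into a memset or memcpy, while the last two build [1, 2, 3, ..] and have no such equivalent.

```rust
/// Memset-shaped: every element pushed is the same byte.
pub fn fill_zero(n: usize) -> Vec<u8> {
    let mut v = Vec::with_capacity(n);
    for _ in 0..n {
        v.push(0u8);
    }
    v
}

/// Memcpy-shaped: elements are copied one at a time from a slice.
pub fn copy_all(src: &[u8]) -> Vec<u8> {
    let mut v = Vec::with_capacity(src.len());
    for &b in src {
        v.push(b);
    }
    v
}

/// Skip-take: yields 1, 2, 3, .. so each element differs from the last.
pub fn skip_take(n: usize) -> Vec<u32> {
    (0u32..).skip(1).take(n).collect()
}

/// Take-skip: same values as `skip_take`, with the adaptors swapped.
pub fn take_skip(n: usize) -> Vec<u32> {
    (0u32..).take(n + 1).skip(1).collect()
}
```

Whether any particular compiler version actually lowers the first two to a single memset or memcpy call is exactly what this issue wants checked; the sketch makes no claim either way.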
I'm sending the scripts and command lines I use to generate the LLVM IR to pcwalton this weekend, along with the output of a run, so he can make sure I'm on the right path. One of the problems I encountered was the deep inlining required when Rust monomorphized a static lambda. It made it very difficult to pull out a few things, and it took considerable time to hunt down all the generated functions. If anybody knows of a way to make llvm-extract also pull out the functions that the target depends on, that would be really helpful; -func="target" will only give you exactly what you ask for, and running over the entire output just gives too much. So I've taken to putting a single function in main(), dumping that, then running opt with -print-before-all and looking at the sections generated before loop idiom recognition.
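To illustrate why a single target can drag in several generated functions, here is a small sketch. `fill_with` and `target` are hypothetical names, not from the scripts mentioned above. Because `fill_with` is generic over the closure type, rustc emits a separate instantiation of it for the closure passed in `target`, plus a function for the closure body itself, so unoptimized IR for `target` alone is not enough to see the whole loop.

```rust
/// Generic over the closure type: each distinct closure passed in
/// gets its own monomorphized copy of this function.
#[inline(never)]
pub fn fill_with<F: FnMut() -> u32>(n: usize, mut f: F) -> Vec<u32> {
    let mut v = Vec::with_capacity(n);
    for _ in 0..n {
        v.push(f());
    }
    v
}

/// The function one would want to inspect. Its unoptimized IR calls
/// the `fill_with` instantiation, which in turn calls the closure body.
pub fn target(n: usize) -> Vec<u32> {
    fill_with(n, || 7)
}
```

Keeping the file this small is one way to make the dump manageable, in the same spirit as putting a single function in main().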
Last note on this for now: I sent pcwalton the LLVM IR.
This is quite a speculative wishlist task, not updated in nearly a year. Closing. Performance improvements always welcome.
This involves 4 steps:
rustc -C opt-level=0 --emit=llvm-ir
opt -O2 -print-before-all
and grab the IR from before "Recognize loop idioms"
CC @pcwalton