Add type-system-chess to stress trait resolution. #1680
Wow, pretty cool project I gotta say :) We'll see what others think about it being a benchmark. CC @lcnr - does this look like something that could e.g. usefully stress test the new trait solver? |
if you have the capacity to add it as a secondary benchmark then I think this is somewhat valuable. It could be a good indicator preventing us from accidentally introducing some non-linear complexity somewhere |
Ok, thank you. We will discuss this at the performance team meeting next week. In any case, 40s seems a bit too much. We run each benchmark three times, and adding (at least) 120s to the suite is a bit much for my taste. Could you (eventually) reduce this down to ~10-15s? |
Changing the board configuration to just the two kings in the center of the board brings the time down to around that range; however, it then uses fewer of the types involved. Could we instead run it just once? I noticed some of the other benchmarks do that. |
That's also possible, but the fewer iterations, the larger the noise could be. It would only make sense if the benchmark couldn't be easily reduced to a smaller time (as with e.g. cargo) and if the additional time was actually relevant. If the performed work is the same, and just expanded from 15s to 45s, then I'm not sure it's worth it. Exponential behaviour and other problems will probably also show up in a 15s compilation, which is already quite long. |
Okay. If the performance team is interested in this we can see how long it takes to run on the CI pipeline. |
I took a look with Cachegrind. The profile is attached. It certainly looks different to any other profile I've ever seen! In general, hash map insertion is extremely hot. Some specific code snippets follow, with instruction counts. In
In
In
In
And with Callgrind I was able to identify that this code is the heart of the problem. I haven't investigated what kind of data is in there, e.g. whether the obligations list gets extremely long. I have profiles of 1000 of the top crates on crates.io on my machine, and none of them look anything like this profile. So my inclination is to not create a benchmark for this, because it's so unusual. Having said that, there's a good chance that some small changes will speed up compilation of this code drastically. |
Because I am unsure where else to put it: I spent an hour or so playing with https://github.com/Dragon-Hatcher/type-system-chess and |
@nnethercote I'm not that knowledgeable in this area, but is it possible that even though the workload doesn't look like that of real-world crates it could still be a useful benchmark? Real crates don't spend 100% of their time doing trait resolution, but lots of crates still need to do heavy trait resolution. If we improve this benchmark then those areas could get faster. My thinking when I submitted this was that it would give a pure signal on that front. OTOH maybe the workloads here are too extreme to be a good signal and optimizing for this benchmark could be harmful to normal crates. |
Like I said above:
This is a unique program 😄 Equally importantly, there are constraints on the benchmark suite size. More benchmarks means it takes longer to run, and there are more results to interpret on every run. We currently have 19 primary and 23 secondary benchmarks. I already think there are too many secondary benchmarks. We just don't have space for every stress test, unfortunately. I just tried changing the |
This gives massive (~7x) compile time and memory usage reductions for the trait system stress test in rust-lang/rustc-perf#1680.
FWIW, I got about a 7x reduction in compile time and memory usage for this program in rust-lang/rust#114611. There are a couple of secondary benchmarks that also get wins of a couple of percent. I also looked more closely at the top 1000 crates, and I see a handful there that might also benefit by 1 or 2%. |
Really cool! |
…piler-errors Speed up compilation of `type-system-chess` [`type-system-chess`](rust-lang/rustc-perf#1680) is an unusual program that implements a compile-time chess position solver in the trait system(!) This PR is about making it compile faster. r? `@ghost`
rust-lang/rust#114611 has landed so we can close this PR. |
Thanks for your work @nnethercote! |
This adds the Rust version of my Dragon-Hatcher/type-system-chess in order to stress trait resolution. There is basically no code generation involved, only type checking.
On my laptop
CARGO_INCREMENTAL=0 cargo check ; cargo clean
takes 38.68 seconds. This time can easily be adjusted by adding or removing positions to analyze.
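For readers unfamiliar with this style of program, here is a minimal sketch of what "only type checking" can mean: a computation carried out entirely by trait resolution. It is not taken from type-system-chess; the names `Zero`, `Succ`, `Nat` and `Plus` are hypothetical and chosen only for illustration. The compiler has to resolve a chain of trait obligations to work out the associated type `Sum`, and a chess solver built this way presumably multiplies that kind of work many times over.

```rust
use std::marker::PhantomData;

// Type-level natural numbers (Peano encoding).
// All names here are illustrative assumptions, not from type-system-chess.
struct Zero;
struct Succ<N>(PhantomData<N>);

// Lets us read a type-level number back as a value, to check the result.
trait Nat {
    const VALUE: u32;
}
impl Nat for Zero {
    const VALUE: u32 = 0;
}
impl<N: Nat> Nat for Succ<N> {
    const VALUE: u32 = N::VALUE + 1;
}

// Type-level addition: the "result" is an associated type that the
// trait solver computes by recursively resolving obligations.
trait Plus<Rhs> {
    type Sum: Nat;
}
// 0 + Rhs = Rhs
impl<Rhs: Nat> Plus<Rhs> for Zero {
    type Sum = Rhs;
}
// (N + 1) + Rhs = (N + Rhs) + 1
impl<N, Rhs> Plus<Rhs> for Succ<N>
where
    N: Plus<Rhs>,
    Rhs: Nat,
{
    type Sum = Succ<<N as Plus<Rhs>>::Sum>;
}

type Two = Succ<Succ<Zero>>;
type Three = Succ<Two>;
type Five = <Two as Plus<Three>>::Sum;

fn main() {
    // The addition was done during type checking; this only prints the result.
    println!("{}", <Five as Nat>::VALUE); // prints 5
}
```

Each extra `Succ` layer adds another obligation for the solver to process, which is why a program like this spends its compile time in trait resolution rather than in code generation.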