Add asv benchmarks for "utility scale" compilation #12148
This commit adds new benchmarks that parse and compile "utility scale" circuits to the asv suite. Circuits at this scale are increasingly a user workload of interest, so having nightly benchmarks that cover them is important. This adds a few benchmarks to time circuit parsing and compilation so we can track this over time. Additionally, to better optimize our output binaries on release, a variant of the same benchmarks is added to the PGO profiling to ensure we have coverage of problems at this scale as part of our profiling data.
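For readers unfamiliar with asv, the general shape of such a benchmark can be sketched as below. This is a minimal illustration and not the code added by this PR: the class name, `parse_program`, and `compile_program` are hypothetical stand-ins written in plain Python so the sketch runs without Qiskit. What it does rely on are asv's conventions: methods prefixed with `time_` are timed, `setup` runs before each of them, and `params` / `param_names` / `timeout` are class attributes asv reads.

```python
def parse_program(text):
    """Hypothetical stand-in for circuit parsing: split lines into (name, operands)."""
    ops = []
    for line in text.splitlines():
        name, _, operands = line.rstrip(";").partition(" ")
        ops.append((name, tuple(operands.split(","))))
    return ops


def compile_program(ops, two_qubit_gate):
    """Hypothetical stand-in for compilation: count ops already in the target gate."""
    return sum(1 for name, _ in ops if name == two_qubit_gate)


class UtilityScaleSketch:
    """asv-style benchmark class; asv times every method whose name starts with time_."""

    # Each time_* method runs once per parameter value.
    params = ["cx", "cz", "ecr"]
    param_names = ["two_qubit_gate"]
    # Per-benchmark timeout in seconds (assumed value, chosen for illustration).
    timeout = 300

    def setup(self, two_qubit_gate):
        # Build the input outside the timed region: a 100-qubit chain of 2q gates.
        lines = [f"{two_qubit_gate} q[{i}],q[{i + 1}];" for i in range(99)]
        self.program = "\n".join(lines)
        self.parsed = parse_program(self.program)

    def time_parse(self, two_qubit_gate):
        parse_program(self.program)

    def time_compile(self, two_qubit_gate):
        compile_program(self.parsed, two_qubit_gate)
```

In a real suite the two stand-in functions would be replaced by the actual parser and pass manager calls; keeping input construction in `setup` means only the parse or compile step is timed.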
This looks like an obviously good idea to me to start getting in ASAP. We potentially ought to be adding parsing of OpenQASM 3 files to the benchmarks and PGO as well, but could do that in a follow-up if you want.
    # Uncomment when this is fast enough to run during release builds
    # qv_circ = QuantumVolume(100, seed=123456789)
    # qv_circ.measure_all()
    # qv_circ.name = "QV1267650600228229401496703205376"
    for pm in [cz_pm, ecr_pm, cx_pm]:
        for circ in [qft_circ, square_heisenberg_circ, qaoa_circ]:
            print(f"Compiling: {circ.name}")
            pm.run(circ)
Out of curiosity, how long does it take right now?
Running this script with current main had a wall time of 1min 29sec on my workstation. It'll definitely be slower in CI but hopefully not that much slower. But even if it's 10x, an extra 10min seems reasonable to me to ensure we have the coverage at this scale.
I really wanted to do the 100 qubit full depth QV too, but we're still too slow for that. When we get closer to the final 1.1 release I'll run it again with QV and see how it looks then and maybe re-add it.
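As a rough aid for reasoning about these numbers, wall time can be measured and scaled with a few lines of standard-library Python. This is only a sketch: the helper names are made up for illustration, and the 10x factor is the hypothetical CI slowdown mentioned above, not a measurement.

```python
import time


def wall_time(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def format_duration(seconds):
    """Format a duration in the 'Xmin Ysec' style used in this thread."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}min {secs}sec"


# Workstation measurement quoted above: 1min 29sec.
workstation_seconds = 1 * 60 + 29

# Assumed slowdown factor for CI (hypothetical, taken from the "even if it's 10x" remark).
ci_estimate = workstation_seconds * 10

print(format_duration(workstation_seconds))
print(format_duration(ci_estimate))
```

Under that assumption the estimate comes out closer to 15 minutes than 10, which is still the same order of magnitude as the figure discussed here.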
I ran it with QV in the circuit list again just now with main and it took 9 min 6 sec on my workstation.
We only get two (ish) processes on CI, so that's probably too long right now, but it certainly feels like a very achievable goal to get that down enough.
This commit adds a default value to the generate_preset_pass_manager's optimization_level argument. If it's not specified, optimization level 2 will be used. After Qiskit#12148, optimization level 2 offers a better tradeoff between heuristic effort and runtime, which makes it well suited as a default optimization level.
Let's do this in a follow-up. I think it only really makes sense to use the Rust parser for this since that's the one we're actively developing (and it's the only one that makes sense for PGO). I only say this because I tried using
I thought the OQ2 files you had would parse fine with `qasm2.load` (no extra bits needed), but if there's any `sx` hiding in them, I guess that's the problem - it's a nuisance that `sx` wasn't in the core `qelib1.inc`. I really want to get the exporter fixed so that there's not that discrepancy - it was a mistake to merge my separate OQ2 parser without the legacy support by default while the exporter was still doing that.
At any rate, this looks fine to me for now, and we can always add more benchmarks later.
* Add a default optimization level to generate_preset_pass_manager

  This commit adds a default value to the generate_preset_pass_manager's optimization_level argument. If it's not specified, optimization level 2 will be used. After #12148, optimization level 2 offers a better tradeoff between heuristic effort and runtime, which makes it well suited as a default optimization level.

* Update transpile()'s default opt level to match

  This commit updates the transpile() function's optimization_level argument default value to match generate_preset_pass_manager's new default of 2 instead of 1. This is arguably a breaking API change, but the semantics are equivalent apart from two minor edge cases with implicit behavior that were a side effect of the level 1 preset pass manager's construction (these are documented in the release notes), so we're ok making it in this case. Some tests which were relying on the implicit behavior of optimization level 1 are updated to explicitly set the optimization level argument, which retains this behavior.

* Update more tests expecting optimization level 1
* Set optimization level to 1 in test_approximation_degree.
* Replace use of transpile with specific pass in HLS tests.
* Set optimization_level=1 in layout-dependent tests.
* Expand upgrade note explanation on benefits of level 2
* Apply Elena's reno suggestions

Co-authored-by: Elena Peña Tapia <epenatap@gmail.com>
Co-authored-by: Elena Peña Tapia <57907331+ElePT@users.noreply.github.com>