Remove `_legacy_format_cache` slot from `CircuitInstruction` #10417
This was originally added as a mitigation for performance loss in consumers of `QuantumCircuit.data` that were still using the legacy interface. It is now becoming more critical to minimise memory usage in the `QuantumCircuit` object, so the extra slot is a luxury we can no longer afford. This represents a 12.5% reduction in the inherent memory footprint of `CircuitInstruction`, though in practice the impact will be smaller, since the `operation` and `qubits` fields will near-universally have associated weight themselves (the `clbits` field is _usually_ the empty tuple, which is a natural singleton untracked by the GC in CPython). Other ongoing work on making more objects singletons will reduce those weights, however.
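To make the "inherent footprint" claim concrete, here is a minimal sketch that compares two slotted classes differing by one slot. The class names `InstructionWithCache` and `InstructionWithoutCache` are hypothetical stand-ins for illustration, not the real `CircuitInstruction`, and the exact byte counts depend on the CPython build. On a standard 64-bit CPython I would expect the difference to be one pointer (8 bytes), which matches the 12.5% figure if the four-slot object is 64 bytes.

```python
import gc
import struct
import sys


class InstructionWithCache:
    """Hypothetical stand-in: three data slots plus the legacy cache slot."""

    __slots__ = ("operation", "qubits", "clbits", "_legacy_format_cache")

    def __init__(self, operation, qubits, clbits=()):
        self.operation = operation
        self.qubits = qubits
        self.clbits = clbits
        self._legacy_format_cache = None


class InstructionWithoutCache:
    """Hypothetical stand-in: the same object with the cache slot removed."""

    __slots__ = ("operation", "qubits", "clbits")

    def __init__(self, operation, qubits, clbits=()):
        self.operation = operation
        self.qubits = qubits
        self.clbits = clbits


with_cache = InstructionWithCache("h", (0,))
without_cache = InstructionWithoutCache("h", (0,))

# sys.getsizeof reports only the object itself, not what its slots point to.
size_with = sys.getsizeof(with_cache)
size_without = sys.getsizeof(without_cache)
saved = size_with - size_without
pointer_size = struct.calcsize("P")  # size of one object pointer on this build

print(f"with cache:    {size_with} bytes")
print(f"without cache: {size_without} bytes")
print(f"saved:         {saved} bytes ({100 * saved / size_with:.1f}%)")

# The empty tuple used for `clbits` is a shared object the GC does not track.
print("empty tuple tracked by GC:", gc.is_tracked(without_cache.clbits))
```

This only measures the instruction object itself. The objects referenced by `operation` and `qubits` are not included, which is why the real-world relative saving is smaller.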
I ran a small test on PEC-style circuits including the singleton PR, and it reduced the memory requirements from 2.5 GB to 1.7 GB, a 32% reduction 🙂
LGTM, it might be good to keep an eye on the asv benchmarks to see if this adds any unexpected overhead though 🙂
If it's a 32% reduction, there's something going wrong with the code that we're benchmarking. It can't be that large unless something in that benchmarking code is causing the legacy cache to get populated, which means that this PR would cause a runtime issue for that code. Edit: or the benchmark is not measuring the memory accurately, and this PR is somehow getting credit for a reclamation from the GC that the other form of the code should be able to attain as well.
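For anyone wanting to sanity-check numbers like this, one way to avoid crediting a change with garbage that the GC would reclaim anyway is to force a collection before each reading. The following is only a sketch under assumptions: `WithCache`, `WithoutCache`, `build` and `bytes_per_instance` are hypothetical names, the classes are stand-ins rather than the real `CircuitInstruction`, and `tracemalloc` counts Python-level allocations only.

```python
import gc
import tracemalloc


class WithCache:
    """Hypothetical stand-in with the legacy cache slot left unpopulated."""

    __slots__ = ("operation", "qubits", "clbits", "_legacy_format_cache")


class WithoutCache:
    """Hypothetical stand-in without the legacy cache slot."""

    __slots__ = ("operation", "qubits", "clbits")


def build(cls, operation, qubits):
    """Create one instance, sharing `operation` and `qubits` between instances."""
    inst = cls()
    inst.operation = operation
    inst.qubits = qubits
    inst.clbits = ()
    return inst


def bytes_per_instance(cls, count=100_000):
    """Return traced bytes per live instance, collecting garbage before each reading."""
    operation, qubits = object(), (0,)
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    instances = [build(cls, operation, qubits) for _ in range(count)]
    gc.collect()  # only memory that is still reachable should remain
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del instances
    return (after - before) / count


per_with = bytes_per_instance(WithCache)
per_without = bytes_per_instance(WithoutCache)
print(f"with cache slot:    {per_with:.1f} bytes per instance")
print(f"without cache slot: {per_without:.1f} bytes per instance")
print(f"difference:         {per_with - per_without:.1f} bytes per instance")
```

Because the referenced objects are shared here, the difference isolates the cost of the empty slot. A much larger gap in a real benchmark would point at something else, such as the cache being populated or the measurement including unreclaimed garbage.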
I should say, I still think that this is good to merge, but the exact memory profiling is confusing me, and I think it's strongly indicative that however we're measuring the memory usage of a
Details and comments
All of the performance impact of this change within Terra should be mitigated by #10416; equally, those places are unlikely to be highly performance-critical anyway.
The memory-usage improvement from this PR will have more relative impact when measured along with #10314: the absolute improvement in memory usage won't change, but making more operations singletons will shrink the rest of the footprint, so the same saving becomes a much larger fraction of the total.