Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dict-based SparsePauliOp.simplify #7656

Merged
merged 21 commits into from
Mar 17, 2022
Merged

Conversation

t-imamichi
Copy link
Member

@t-imamichi t-imamichi commented Feb 13, 2022

Summary

This PR replaces np.unique in SparsePauliOp.simplify with a dict-based operation.
This is faster than np.unique when the number of qubits is large and there are many duplicate operators.
But, if the number of qubits is small or there is no duplicate operator, this PR can be slower.
Moreover, the code gets more complicated than just applying np.unique.
So, I don't have a strong opinion to include this PR as Terra. I want feedback from Qiskit developers.

Update:
This PR contains three updates:

  1. I implemented a utility function unordered_unique in rust to replace np.unique in SparsePauliOp.simplify.

  2. I added SparsePauliOp.equiv to check the equivalence of two operators while SparsePauliOp.__eq__ checks elementwise equality. Based on Equality of SparsePauliOps with different orders #7657
    E.g.,

op = SparsePauliOp.from_list([("X", 1), ("Y", 1)])
op2 = SparsePauliOp.from_list([("X", 1), ("Y", 1), ("Z", 0)])
op3 = SparsePauliOp.from_list([("Y", 1), ("X", 1)])

print(op == op2)  # False
print(op == op3)  # False
print(op.equiv(op2))  # True
print(op.equiv(op3))  # True
  1. I found a bug that SparsePauliOp.__eq__ raises an error if two operators have different numbers of coefficients.
    I fixed it and add unit tests.
op = SparsePauliOp.from_list([("X", 1), ("Y", 1)])
op2 = SparsePauliOp.from_list([("X", 1), ("Y", 1), ("Z", 0)])
print(op == op2)
#  ValueError: operands could not be broadcast together with shapes (2,) (3,)

Closes #7657

Details and comments

I attach a microbenchmark. Because op4 = op3.simplify(), op4 does not contain any duplicate operator.
The current approach with np.unique is expected to be faster in the 2nd pass because it is fast when operators are already sorted.

import random
from timeit import timeit

from qiskit.quantum_info import SparsePauliOp

def bench(k, n):
    random.seed(123)
    op = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op2 = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op3 = op.compose(op2)
    op4 = op3.simplify()

    print(f'{k}, {n} (1st pass): {timeit(lambda: op3.simplify(), number=1)} sec')
    print(f'{k}, {n} (2nd pass): {timeit(lambda: op4.simplify(), number=1)} sec')
    print()

for i in range(8, 41, 8):
    bench(i, 1000)

main

8, 1000 (1st pass): 0.7635101599989866 sec
8, 1000 (2nd pass): 0.014977621998696122 sec

16, 1000 (1st pass): 1.1324572909979906 sec
16, 1000 (2nd pass): 0.2923723389976658 sec

24, 1000 (1st pass): 1.223762061999878 sec
24, 1000 (2nd pass): 0.3293825949986058 sec

32, 1000 (1st pass): 1.1833479929991881 sec
32, 1000 (2nd pass): 0.2818607930021244 sec

40, 1000 (1st pass): 1.3982988319985452 sec
40, 1000 (2nd pass): 0.40111826100110193 sec

this PR

8, 1000 (1st pass): 0.3265826780007046 sec
8, 1000 (2nd pass): 0.011909335997188464 sec

16, 1000 (1st pass): 0.4464764760014077 sec
16, 1000 (2nd pass): 0.29431921200011857 sec

24, 1000 (1st pass): 0.28692975499870954 sec
24, 1000 (2nd pass): 0.27714769699741737 sec

32, 1000 (1st pass): 0.2480848600025638 sec
32, 1000 (2nd pass): 0.24830579999979818 sec

40, 1000 (1st pass): 0.2818297639969387 sec
40, 1000 (2nd pass): 0.28098882400081493 sec

@t-imamichi t-imamichi marked this pull request as ready for review February 13, 2022 09:04
@t-imamichi t-imamichi requested review from a team and ikkoham as code owners February 13, 2022 09:04
@t-imamichi t-imamichi marked this pull request as draft February 13, 2022 09:56
@t-imamichi
Copy link
Member Author

t-imamichi commented Feb 13, 2022

Some unit tests failed due to #7657, so I apply sort in SparsePauliOp.__eq__ in 68de9a5
I updated __eq__ to use simplify 499dceb

@coveralls
Copy link

coveralls commented Feb 13, 2022

Pull Request Test Coverage Report for Build 1997872381

  • 19 of 19 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.003%) to 83.474%

Totals Coverage Status
Change from base Build 1994359867: 0.003%
Covered Lines: 52378
Relevant Lines: 62748

💛 - Coveralls

Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be interesting to get some more in-depth profiling on your microbenchmarks, especially on smaller numbers (perhaps even smaller than you took). It looks like constant factors might be having quite an effect at small numbers.

Dictionary-based is asymptotically better, but we should make sure that we don't have too big an impact on the (more common) small-scale objects too.

Comment on lines 129 to 134
def __eq__(self, other):
"""Check if two SparsePauliOp operators are equal"""
return (
super().__eq__(other)
and np.allclose(self.coeffs, other.coeffs)
and self.paulis == other.paulis
)
if not super().__eq__(other):
return False
return np.allclose((self - other).simplify().coeffs, [0])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to check the performance characteristics of this. The dict-based simplify has better asymptotic scaling than it used to, so this becomes more feasible, but this makes a relatively simple operation involve probably 3 SparsePauliOp allocations (in -other, self._add(-other) and simplify).

@t-imamichi
Copy link
Member Author

t-imamichi commented Feb 18, 2022

Thank you for your comments. I tried my microbenchmarks with smaller qubits. I see the current code runs faster up to 16 qubits. I think np.packbits does really good job.

import random
from timeit import timeit

from qiskit.quantum_info import SparsePauliOp

def bench(k, n):
    random.seed(123)
    op = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op2 = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op3 = op.compose(op2)
    op4 = op3.simplify()

    print(f'{k}, {n} (1st pass): {timeit(lambda: op3.simplify(), number=5)} sec')
    print(f'{k}, {n} (2nd pass): {timeit(lambda: op4.simplify(), number=5)} sec')
    print()

for i in range(8, 33, 8):
    bench(i, 1000)

main

8, 1000 (1st pass): 3.351907108 sec
8, 1000 (2nd pass): 0.07600556600000008 sec

16, 1000 (1st pass): 4.943881548999999 sec
16, 1000 (2nd pass): 1.383466663 sec

24, 1000 (1st pass): 5.464609770000001 sec
24, 1000 (2nd pass): 1.619675787000002 sec

32, 1000 (1st pass): 5.373269798999999 sec
32, 1000 (2nd pass): 1.440173430999998 sec

this PR

8, 1000 (1st pass): 4.607136525 sec
8, 1000 (2nd pass): 0.23311107700000022 sec

16, 1000 (1st pass): 5.703168041 sec
16, 1000 (2nd pass): 4.6930094539999985 sec

24, 1000 (1st pass): 4.611978835999999 sec
24, 1000 (2nd pass): 4.573024002 sec

32, 1000 (1st pass): 4.659426916999998 sec
32, 1000 (2nd pass): 4.521127856 sec

@jakelishman
Copy link
Member

Hmm, that poses some interesting questions, especially because the current version is significantly better on close-to-simplified forms. I suspect that it's to do with being close-to-sorted as well, so shuffling the order of the operators in the output of a simplify probably returns main to the "1st pass" speeds, despite already being simplified.

The speed-boost is significant at high numbers of qubits (of course, because of the asymptotic scaling removing a factor of ln(n)), so it may well be worth having both, and have an option to select.

That said, actually the best solution might be to move the backing of SparsePauliOp down into Rust or Cython, and enforce that the Paulis are always sorted. That would let us implement the fused _add -> simplify in linear time, because the algorithm becomes effectively a slight variant on the "merge" primitive of mergesort, since we can guarantee that the two arrays are sorted. compose might be slightly tricky to code up in that style, but if we could manage it, then simplify would become unnecessary and the == check could be done in guaranteed linear time as well. Creation of SparsePauliOp would become O(n * ln(n)) instead, but that may well be a price worth paying - we'd be able to add a sort=True keyword argument to the constructor to skip the check if the input was guaranteed in the right format, and possibly for quite a lot of uses, the initial creation of the objects from Python objects isn't the bottleneck.

@t-imamichi
Copy link
Member Author

t-imamichi commented Feb 18, 2022

Thank you for your feedback. I don't think the current use cases require >50 qubits mostly. So, I agree to leave the current sort-based simplify and think of more sophisticated approach such as rust and cython.

@t-imamichi
Copy link
Member Author

t-imamichi commented Mar 3, 2022

I implemented a rust version of unique in SparsePauliOp.simplify. I'm not familiar with rust, so the code might not be optimized well. But, the rust version is faster than sort-based np.unique. If anyone has an idea to optimize it, feel free to let me know.

comparison with main

import random
from timeit import timeit

from qiskit.quantum_info import SparsePauliOp

def bench(k, n):
    random.seed(123)
    op = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op2 = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op3 = op.compose(op2)
    op4 = op3.simplify()

    print(f'{k}, {n} (1st pass): {timeit(lambda: op3.simplify(), number=1)} sec')
    print(f'{k}, {n} (2nd pass): {timeit(lambda: op4.simplify(), number=1)} sec')
    print()

for i in range(8, 41, 8):
    bench(i, 1000)

main

8, 1000 (1st pass): 0.630255923 sec
8, 1000 (2nd pass): 0.01311622400000001 sec

16, 1000 (1st pass): 0.9540651409999996 sec
16, 1000 (2nd pass): 0.2669846600000003 sec

24, 1000 (1st pass): 1.052437779 sec
24, 1000 (2nd pass): 0.3130550200000002 sec

32, 1000 (1st pass): 1.0365809030000008 sec
32, 1000 (2nd pass): 0.2829057749999997 sec

40, 1000 (1st pass): 1.219256982000001 sec
40, 1000 (2nd pass): 0.4384409550000008 sec

this PR

8, 1000 (1st pass): 0.3164088220000001 sec
8, 1000 (2nd pass): 0.021032002999999966 sec

16, 1000 (1st pass): 0.890976802 sec
16, 1000 (2nd pass): 0.665050409 sec

24, 1000 (1st pass): 0.6271690339999996 sec
24, 1000 (2nd pass): 0.580206091 sec

32, 1000 (1st pass): 0.6370045559999999 sec
32, 1000 (2nd pass): 0.5719273319999996 sec

40, 1000 (1st pass): 0.6434421850000014 sec
40, 1000 (2nd pass): 0.625335218 sec

Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm excited to see more Rust usage, I just left some quick inline suggestions on how to potentially improve the performance of the rust code. Although, I didn't benchmark any of the suggestions so not sure how big a difference it will make in practice.

@t-imamichi
Copy link
Member Author

Thank you for your feedback, @mtreinish! I will update my rust code based on your comments. I'm really impressed with the rust framework for terra. I feel it is easier than cython. Kudos to you.

@t-imamichi
Copy link
Member Author

Thanks to @mtreinish's comment, simplify gets much faster.

8, 1000 (1st pass): 0.16840985499999994 sec
8, 1000 (2nd pass): 0.008250524000000148 sec

16, 1000 (1st pass): 0.2780222299999999 sec
16, 1000 (2nd pass): 0.12144982699999995 sec

24, 1000 (1st pass): 0.13483429299999994 sec
24, 1000 (2nd pass): 0.14022761000000017 sec

32, 1000 (1st pass): 0.1404055569999998 sec
32, 1000 (2nd pass): 0.11580450200000003 sec

40, 1000 (1st pass): 0.15199184499999951 sec
40, 1000 (2nd pass): 0.15105953399999983 sec

@mtreinish mtreinish added Rust This PR or issue is related to Rust code in the repository mod: quantum info Related to the Quantum Info module (States & Operators) performance labels Mar 3, 2022
@t-imamichi
Copy link
Member Author

t-imamichi commented Mar 4, 2022

Thanks. The rust version looks good, so I will finalize this PR.
This new version of simplify does not guarantee that the output is sorted.
So, the current unit tests based on __eq__ do not pass due to the order #7656 (comment).
The current __eq__ checks the equality of object level (order matters). E.g., I + X != X + I
We perhaps need to make another method, e.g., equiv, that checks the equality of numerical level (order ignored) #7657 (comment). E.g., I + X == X + I

TODOs

@t-imamichi
Copy link
Member Author

FYI: I updated the summary of this PR.
#7656 (comment)

Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this LGTM. The rust code looks fine and I'm fine along with the other changes. I think the distinction between __eq__ and equiv() also makes sense here and is consistent with other classes. Just a couple inline comments/questions but nothing major

self.assertNotEqual(spp_op1 + spp_op2, spp_op2 + spp_op1)

@combine(num_qubits=[1, 2, 3, 4])
def test_equiv(self, num_qubits):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to add a test here with a shuffled equiv pauli string? Like:

a = SparsePauliOp.from_list([('X', 1), ('Y', 2)])
b = SparsePauliOp.from_list([('Y', 2), ('X', 1)])

which was the example you used from the issue

Copy link
Member Author

@t-imamichi t-imamichi Mar 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I added two cases including your suggestion. a16a229

@mtreinish mtreinish added this to the 0.20 milestone Mar 15, 2022
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for updating

@mtreinish mtreinish added Changelog: New Feature Include in the "Added" section of the changelog Changelog: Bugfix Include in the "Fixed" section of the changelog automerge labels Mar 17, 2022
@mergify mergify bot merged commit d255de1 into Qiskit:main Mar 17, 2022
@t-imamichi t-imamichi deleted the dict-simplify2 branch March 18, 2022 04:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: Bugfix Include in the "Fixed" section of the changelog Changelog: New Feature Include in the "Added" section of the changelog mod: quantum info Related to the Quantum Info module (States & Operators) performance Rust This PR or issue is related to Rust code in the repository
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Equality of SparsePauliOps with different orders
5 participants