dict-based `SparsePauliOp.simplify` #7656

t-imamichi · 2022-02-13T08:47:11Z

Summary

This PR replaces np.unique in SparsePauliOp.simplify with a dict-based operation.
~~This is faster than np.unique when the number of qubits is large and there are many duplicate operators.~~
~~But, if the number of qubits is small or there is no duplicate operator, this PR can be slower.~~
~~Moreover, the code gets more complicated than just applying np.unique.~~
~~So, I don't have a strong opinion to include this PR as Terra. I want feedback from Qiskit developers.~~
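
For readers unfamiliar with the idea, a dict-based simplification can be sketched in pure Python as follows. This is only a toy model under stated assumptions: terms are hypothetical `(label, coeff)` tuples, and `simplify_terms` is an illustrative name, not the code in this PR or a Qiskit API.

```python
def simplify_terms(terms, atol=1e-8):
    """Combine duplicate labels by summing coefficients (toy model, not Qiskit code)."""
    acc = {}  # dicts keep insertion order, so the output follows first appearance
    for label, coeff in terms:
        acc[label] = acc.get(label, 0) + coeff
    # Drop terms whose combined coefficient is numerically zero.
    return [(label, c) for label, c in acc.items() if abs(c) > atol]


terms = [("XZ", 1.0), ("IY", 2.0), ("XZ", 0.5), ("IY", -2.0)]
result = simplify_terms(terms)
print(result)
```

Each term is visited once and no sorting is needed, which is where the expected advantage over a sort-based unique comes from when there are many duplicates.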

Update:
This PR contains three updates:

I implemented a utility function unordered_unique in Rust to replace np.unique in SparsePauliOp.simplify.
I added SparsePauliOp.equiv to check whether two operators are equivalent, whereas SparsePauliOp.__eq__ checks elementwise equality. This is based on "Equality of SparsePauliOps with different orders" #7657.
E.g.,

op = SparsePauliOp.from_list([("X", 1), ("Y", 1)])
op2 = SparsePauliOp.from_list([("X", 1), ("Y", 1), ("Z", 0)])
op3 = SparsePauliOp.from_list([("Y", 1), ("X", 1)])

print(op == op2)  # False
print(op == op3)  # False
print(op.equiv(op2))  # True
print(op.equiv(op3))  # True

I found a bug where SparsePauliOp.__eq__ raises an error if the two operators have different numbers of coefficients.
I fixed it and added unit tests. Before the fix:

op = SparsePauliOp.from_list([("X", 1), ("Y", 1)])
op2 = SparsePauliOp.from_list([("X", 1), ("Y", 1), ("Z", 0)])
print(op == op2)
#  ValueError: operands could not be broadcast together with shapes (2,) (3,)
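
The error comes from NumPy broadcasting inside `np.allclose` when the coefficient arrays differ in length. One way to guard against it is to compare shapes first; the sketch below is an assumption about how such a guard could look, with `coeffs_equal` as a hypothetical helper name rather than the actual fix in this PR.

```python
import numpy as np


def coeffs_equal(a, b):
    """Return False for mismatched shapes instead of letting np.allclose raise."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        return False  # different numbers of coefficients can never be equal
    return bool(np.allclose(a, b))


# Without the guard, shapes (2,) and (3,) cannot be broadcast together.
try:
    np.allclose([1, 1], [1, 1, 0])
    raised = False
except ValueError:
    raised = True

print(raised, coeffs_equal([1, 1], [1, 1, 0]))
```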

Closes #7657

Details and comments

I attach a microbenchmark. Because op4 = op3.simplify(), op4 does not contain any duplicate operators.
The current np.unique approach is expected to be faster in the 2nd pass because it is fast when the operators are already sorted.

import random
from timeit import timeit

from qiskit.quantum_info import SparsePauliOp

def bench(k, n):
    random.seed(123)
    op = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op2 = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op3 = op.compose(op2)
    op4 = op3.simplify()

    print(f'{k}, {n} (1st pass): {timeit(lambda: op3.simplify(), number=1)} sec')
    print(f'{k}, {n} (2nd pass): {timeit(lambda: op4.simplify(), number=1)} sec')
    print()

for i in range(8, 41, 8):
    bench(i, 1000)

main

8, 1000 (1st pass): 0.7635101599989866 sec
8, 1000 (2nd pass): 0.014977621998696122 sec

16, 1000 (1st pass): 1.1324572909979906 sec
16, 1000 (2nd pass): 0.2923723389976658 sec

24, 1000 (1st pass): 1.223762061999878 sec
24, 1000 (2nd pass): 0.3293825949986058 sec

32, 1000 (1st pass): 1.1833479929991881 sec
32, 1000 (2nd pass): 0.2818607930021244 sec

40, 1000 (1st pass): 1.3982988319985452 sec
40, 1000 (2nd pass): 0.40111826100110193 sec

this PR

8, 1000 (1st pass): 0.3265826780007046 sec
8, 1000 (2nd pass): 0.011909335997188464 sec

16, 1000 (1st pass): 0.4464764760014077 sec
16, 1000 (2nd pass): 0.29431921200011857 sec

24, 1000 (1st pass): 0.28692975499870954 sec
24, 1000 (2nd pass): 0.27714769699741737 sec

32, 1000 (1st pass): 0.2480848600025638 sec
32, 1000 (2nd pass): 0.24830579999979818 sec

40, 1000 (1st pass): 0.2818297639969387 sec
40, 1000 (2nd pass): 0.28098882400081493 sec
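
For comparison with the dict-based idea, the sort-based approach mentioned above can be sketched with plain NumPy. The boolean table and coefficients here are hypothetical toy data, not the arrays SparsePauliOp builds internally.

```python
import numpy as np

# Toy boolean table: each row stands for one term (hypothetical data).
rows = np.array([[1, 0], [0, 1], [1, 0], [0, 0]], dtype=bool)
coeffs = np.array([1.0, 2.0, 0.5, 3.0])

# np.unique sorts the rows; `inverse` maps each input row to its unique row.
unique_rows, inverse = np.unique(rows, axis=0, return_inverse=True)
inverse = inverse.reshape(-1)  # keep it 1-D across NumPy versions

summed = np.zeros(len(unique_rows))
np.add.at(summed, inverse, coeffs)  # accumulate coefficients of duplicate rows

print(unique_rows)
print(summed)
```

Because the result is always sorted, a second call on already simplified data works on sorted input, which is consistent with the faster 2nd-pass numbers on main.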

t-imamichi · 2022-02-13T10:21:30Z

Some unit tests failed due to #7657, ~~so I apply sort in SparsePauliOp.__eq__ in 68de9a5~~
I updated __eq__ to use simplify in 499dceb.

coveralls · 2022-02-13T10:55:11Z

Pull Request Test Coverage Report for Build 1997872381

19 of 19 (100.0%) changed or added relevant lines in 3 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.003%) to 83.474%

Totals
Change from base Build 1994359867:	0.003%
Covered Lines:	52378
Relevant Lines:	62748

💛 - Coveralls

qiskit/quantum_info/operators/symplectic/sparse_pauli_op.py

jakelishman

It would be interesting to get some more in-depth profiling on your microbenchmarks, especially on smaller numbers (perhaps even smaller than you took). It looks like constant factors might be having quite an effect at small numbers.

Dictionary-based is asymptotically better, but we should make sure that we don't have too big an impact on the (more common) small-scale objects too.

jakelishman · 2022-02-16T15:36:41Z

qiskit/quantum_info/operators/symplectic/sparse_pauli_op.py

    def __eq__(self, other):
        """Check if two SparsePauliOp operators are equal"""
-        return (
-            super().__eq__(other)
-            and np.allclose(self.coeffs, other.coeffs)
-            and self.paulis == other.paulis
-        )
+        if not super().__eq__(other):
+            return False
+        return np.allclose((self - other).simplify().coeffs, [0])


It would be good to check the performance characteristics of this. The dict-based simplify has better asymptotic scaling than it used to, so this becomes more feasible, but this makes a relatively simple operation involve probably 3 SparsePauliOp allocations (in -other, self._add(-other) and simplify).

t-imamichi · 2022-02-18T07:51:21Z

Thank you for your comments. I tried my microbenchmarks with smaller numbers of qubits. The current code (main) runs faster up to 16 qubits. I think np.packbits does a really good job.

import random
from timeit import timeit

from qiskit.quantum_info import SparsePauliOp

def bench(k, n):
    random.seed(123)
    op = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op2 = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op3 = op.compose(op2)
    op4 = op3.simplify()

    print(f'{k}, {n} (1st pass): {timeit(lambda: op3.simplify(), number=5)} sec')
    print(f'{k}, {n} (2nd pass): {timeit(lambda: op4.simplify(), number=5)} sec')
    print()

for i in range(8, 33, 8):
    bench(i, 1000)

main

8, 1000 (1st pass): 3.351907108 sec
8, 1000 (2nd pass): 0.07600556600000008 sec

16, 1000 (1st pass): 4.943881548999999 sec
16, 1000 (2nd pass): 1.383466663 sec

24, 1000 (1st pass): 5.464609770000001 sec
24, 1000 (2nd pass): 1.619675787000002 sec

32, 1000 (1st pass): 5.373269798999999 sec
32, 1000 (2nd pass): 1.440173430999998 sec

this PR

8, 1000 (1st pass): 4.607136525 sec
8, 1000 (2nd pass): 0.23311107700000022 sec

16, 1000 (1st pass): 5.703168041 sec
16, 1000 (2nd pass): 4.6930094539999985 sec

24, 1000 (1st pass): 4.611978835999999 sec
24, 1000 (2nd pass): 4.573024002 sec

32, 1000 (1st pass): 4.659426916999998 sec
32, 1000 (2nd pass): 4.521127856 sec
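
As background on the np.packbits remark above, the sketch below shows how packing shrinks each boolean row before a unique pass. The random table is hypothetical and only illustrates the packing step, not the exact arrays used in simplify.

```python
import numpy as np

rng = np.random.default_rng(123)
bits = rng.integers(0, 2, size=(1000, 16)).astype(bool)  # 1000 rows of 16 bits

# Pack along each row: 16 booleans become 2 bytes.
packed = np.packbits(bits, axis=1)

# Packing is one-to-one for a fixed row width, so the unique counts agree.
n_raw = len(np.unique(bits, axis=0))
n_packed = len(np.unique(packed, axis=0))

print(packed.shape, packed.dtype, n_raw == n_packed)
```

With up to 16 bits the packed row fits in two bytes, so the sort behind np.unique compares far less data per row, which may explain part of the advantage at small qubit counts.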

jakelishman · 2022-02-18T13:45:59Z

Hmm, that poses some interesting questions, especially because the current version is significantly better on close-to-simplified forms. I suspect that it's to do with being close-to-sorted as well, so shuffling the order of the operators in the output of a simplify probably returns main to the "1st pass" speeds, despite already being simplified.

The speed-boost is significant at high numbers of qubits (of course, because of the asymptotic scaling removing a factor of ln(n)), so it may well be worth having both, and have an option to select.

That said, actually the best solution might be to move the backing of SparsePauliOp down into Rust or Cython, and enforce that the Paulis are always sorted. That would let us implement the fused _add -> simplify in linear time, because the algorithm becomes effectively a slight variant on the "merge" primitive of mergesort, since we can guarantee that the two arrays are sorted. compose might be slightly tricky to code up in that style, but if we could manage it, then simplify would become unnecessary and the == check could be done in guaranteed linear time as well. Creation of SparsePauliOp would become O(n * ln(n)) instead, but that may well be a price worth paying - we'd be able to add a sort=True keyword argument to the constructor to skip the check if the input was guaranteed in the right format, and possibly for quite a lot of uses, the initial creation of the objects from Python objects isn't the bottleneck.
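
The merge-style fused add and simplify described above can be sketched in pure Python. The assumptions are loud ones: each operand is a hypothetical list of `(key, coeff)` pairs already sorted by key with no repeated keys, and `merge_add` is an illustrative name, not an existing Qiskit function.

```python
def merge_add(a, b, atol=1e-8):
    """Add two sorted, duplicate-free term lists in one linear pass."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        key_a, coeff_a = a[i]
        key_b, coeff_b = b[j]
        if key_a < key_b:
            out.append((key_a, coeff_a))
            i += 1
        elif key_b < key_a:
            out.append((key_b, coeff_b))
            j += 1
        else:
            # Same key in both inputs: combine and drop if it cancels.
            total = coeff_a + coeff_b
            if abs(total) > atol:
                out.append((key_a, total))
            i += 1
            j += 1
    # At most one of these tails is non-empty, and it is already sorted.
    out.extend(a[i:])
    out.extend(b[j:])
    return out


merged = merge_add([("I", 1), ("X", 2)], [("X", -2), ("Z", 3)])
print(merged)
```

The output is again sorted and duplicate-free, so no separate simplify step is needed, and an equality check on two such lists reduces to a single linear comparison.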

t-imamichi · 2022-02-18T14:31:56Z

Thank you for your feedback. I don't think most current use cases require more than 50 qubits. So, I agree to keep the current sort-based simplify and to think about a more sophisticated approach, such as Rust or Cython.

t-imamichi · 2022-03-03T13:29:21Z

I implemented a Rust version of unique for SparsePauliOp.simplify. I'm not familiar with Rust, so the code might not be well optimized. Still, the Rust version is faster than the sort-based np.unique. If anyone has ideas for optimizing it, feel free to let me know.

comparison with main

import random
from timeit import timeit

from qiskit.quantum_info import SparsePauliOp

def bench(k, n):
    random.seed(123)
    op = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op2 = SparsePauliOp.from_list([(''.join(random.choices('IXYZ', k=k)), 1) for _ in range(n)])
    op3 = op.compose(op2)
    op4 = op3.simplify()

    print(f'{k}, {n} (1st pass): {timeit(lambda: op3.simplify(), number=1)} sec')
    print(f'{k}, {n} (2nd pass): {timeit(lambda: op4.simplify(), number=1)} sec')
    print()

for i in range(8, 41, 8):
    bench(i, 1000)

main

8, 1000 (1st pass): 0.630255923 sec
8, 1000 (2nd pass): 0.01311622400000001 sec

16, 1000 (1st pass): 0.9540651409999996 sec
16, 1000 (2nd pass): 0.2669846600000003 sec

24, 1000 (1st pass): 1.052437779 sec
24, 1000 (2nd pass): 0.3130550200000002 sec

32, 1000 (1st pass): 1.0365809030000008 sec
32, 1000 (2nd pass): 0.2829057749999997 sec

40, 1000 (1st pass): 1.219256982000001 sec
40, 1000 (2nd pass): 0.4384409550000008 sec

this PR

8, 1000 (1st pass): 0.3164088220000001 sec
8, 1000 (2nd pass): 0.021032002999999966 sec

16, 1000 (1st pass): 0.890976802 sec
16, 1000 (2nd pass): 0.665050409 sec

24, 1000 (1st pass): 0.6271690339999996 sec
24, 1000 (2nd pass): 0.580206091 sec

32, 1000 (1st pass): 0.6370045559999999 sec
32, 1000 (2nd pass): 0.5719273319999996 sec

40, 1000 (1st pass): 0.6434421850000014 sec
40, 1000 (2nd pass): 0.625335218 sec

mtreinish

I'm excited to see more Rust usage. I just left some quick inline suggestions on how to potentially improve the performance of the Rust code, although I didn't benchmark any of them, so I'm not sure how big a difference they will make in practice.

src/array_unique.rs

t-imamichi · 2022-03-03T15:03:41Z

Thank you for your feedback, @mtreinish! I will update my Rust code based on your comments. I'm really impressed with the Rust framework for Terra. I feel it is easier than Cython. Kudos to you.

t-imamichi · 2022-03-03T15:56:00Z

Thanks to @mtreinish's comment, simplify gets much faster.

8, 1000 (1st pass): 0.16840985499999994 sec
8, 1000 (2nd pass): 0.008250524000000148 sec

16, 1000 (1st pass): 0.2780222299999999 sec
16, 1000 (2nd pass): 0.12144982699999995 sec

24, 1000 (1st pass): 0.13483429299999994 sec
24, 1000 (2nd pass): 0.14022761000000017 sec

32, 1000 (1st pass): 0.1404055569999998 sec
32, 1000 (2nd pass): 0.11580450200000003 sec

40, 1000 (1st pass): 0.15199184499999951 sec
40, 1000 (2nd pass): 0.15105953399999983 sec

t-imamichi · 2022-03-04T15:31:34Z

Thanks. The Rust version looks good, so I will finalize this PR.
This new version of simplify does not guarantee that the output is sorted.
So, the current unit tests based on __eq__ do not pass because of the term order #7656 (comment).
The current __eq__ checks equality at the object level (order matters). E.g., I + X != X + I
We may need another method, e.g., equiv, that checks equality at the numerical level (order ignored) #7657 (comment). E.g., I + X == X + I
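
The two notions of equality can be illustrated with a toy model in pure Python. Terms are hypothetical `(label, coeff)` tuples, and `strict_eq` and `equiv` below are illustrative stand-ins, not the SparsePauliOp methods themselves.

```python
def strict_eq(a, b):
    """Object-level equality: same terms in the same positions."""
    return a == b


def equiv(a, b, atol=1e-8):
    """Numerical equality: every coefficient of a - b vanishes after combining."""
    diff = {}
    for label, coeff in a:
        diff[label] = diff.get(label, 0) + coeff
    for label, coeff in b:
        diff[label] = diff.get(label, 0) - coeff
    return all(abs(c) <= atol for c in diff.values())


i_plus_x = [("I", 1), ("X", 1)]
x_plus_i = [("X", 1), ("I", 1)]

print(strict_eq(i_plus_x, x_plus_i), equiv(i_plus_x, x_plus_i))
```

In this model a term with a zero coefficient also leaves the result unchanged, so `[("X", 1), ("Y", 1)]` and `[("X", 1), ("Y", 1), ("Z", 0)]` are equivalent but not strictly equal.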

TODOs

discussion Equality of SparsePauliOps with different orders #7657
code cleanup
docstrings

t-imamichi · 2022-03-14T16:08:05Z

FYI: I updated the summary of this PR.
#7656 (comment)

mtreinish

Overall this LGTM. The Rust code looks fine and I'm fine with the other changes. I think the distinction between __eq__ and equiv() also makes sense here and is consistent with other classes. Just a couple of inline comments/questions, but nothing major.

mtreinish · 2022-03-15T13:25:30Z

test/python/quantum_info/operators/symplectic/test_sparse_pauli_op.py

+            self.assertNotEqual(spp_op1 + spp_op2, spp_op2 + spp_op1)
+
+    @combine(num_qubits=[1, 2, 3, 4])
+    def test_equiv(self, num_qubits):


Do you want to add a test here with a shuffled equiv pauli string? Like:

a = SparsePauliOp.from_list([('X', 1), ('Y', 2)])
b = SparsePauliOp.from_list([('Y', 2), ('X', 1)])

which was the example you used from the issue

Thanks. I added two cases including your suggestion. a16a229

mtreinish

LGTM, thanks for updating

t-imamichi marked this pull request as ready for review February 13, 2022 09:04

t-imamichi requested review from a team and ikkoham as code owners February 13, 2022 09:04

t-imamichi marked this pull request as draft February 13, 2022 09:56

t-imamichi marked this pull request as ready for review February 14, 2022 02:37

t-imamichi mentioned this pull request Feb 14, 2022

Apply simplify more often in QubitMapper.mode_based_mapping qiskit-community/qiskit-nature#549

Closed

Cryoris reviewed Feb 14, 2022

t-imamichi force-pushed the dict-simplify2 branch from e7dab10 to fc55a61 Compare February 15, 2022 07:32

Cryoris reviewed Feb 15, 2022

jakelishman reviewed Feb 16, 2022

t-imamichi force-pushed the dict-simplify2 branch from f34fca0 to f96b71f Compare February 17, 2022 03:08

t-imamichi force-pushed the dict-simplify2 branch from f96b71f to 441c364 Compare March 3, 2022 13:26

mtreinish reviewed Mar 3, 2022

mtreinish reviewed Mar 3, 2022

mtreinish added the Rust, mod: quantum info, and performance labels Mar 3, 2022

mtreinish reviewed Mar 4, 2022

t-imamichi added 3 commits March 7, 2022 15:48

dict-based SparsePauliOp.simplify

76e0e83

tune

4bd082d

apply sort in SparsePauliOp.__eq__

397d88a

t-imamichi and others added 4 commits March 8, 2022 12:11

Merge branch 'main' into dict-simplify2

fae48ce

# Conflicts: # src/lib.rs

add pylint tag

65e3ca6

Merge branch 'main' into dict-simplify2

244b94f

Merge branch 'main' into dict-simplify2

7a24bea

# Conflicts: # qiskit/__init__.py # src/lib.rs

t-imamichi requested review from manoelmarques and woodsp-ibm as code owners March 11, 2022 10:11

t-imamichi requested review from mtreinish, jakelishman and Cryoris March 11, 2022 10:13

update

8ce0a23

t-imamichi force-pushed the dict-simplify2 branch from aa136a6 to 8ce0a23 Compare March 11, 2022 10:15

fix a sphinx warning

8df6468

t-imamichi mentioned this pull request Mar 14, 2022

Equality of SparsePauliOps with different orders #7657

Closed

mtreinish reviewed Mar 15, 2022

mtreinish added this to the 0.20 milestone Mar 15, 2022

t-imamichi and others added 3 commits March 17, 2022 18:11

Update releasenotes/notes/add-sparsepauliop-equiv-7a8a1420117dba21.yaml

ca534da

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

Merge branch 'main' into dict-simplify2

8af9735

add two test cases

a16a229

t-imamichi force-pushed the dict-simplify2 branch from 1685aeb to a16a229 Compare March 17, 2022 09:27

simplify tests with zero

1909b9f

mtreinish approved these changes Mar 17, 2022

mtreinish added the Changelog: New Feature, Changelog: Bugfix, and automerge labels Mar 17, 2022

mergify bot merged commit d255de1 into Qiskit:main Mar 17, 2022

t-imamichi deleted the dict-simplify2 branch March 18, 2022 04:25

t-imamichi mentioned this pull request Mar 18, 2022

Fix test_direct_mapper (fix CI breakage) qiskit-community/qiskit-nature#603

Merged

kevinsung mentioned this pull request Apr 6, 2022

Vectorize SpinOp.simplify and VibrationalOp.simplify qiskit-community/qiskit-nature#622

Merged
