[WIP] Improve speed of sample_counts from O(N) to O(1) #8547

Closed
wants to merge 6 commits into from


jlapeyre
Contributor

@jlapeyre jlapeyre commented Aug 15, 2022

Use numpy.random.multinomial with parameter N rather than actually generating N counts in QuantumState.sample_counts. This is an $O(1)$ method, whereas the current one is $O(N)$.

  • I have added the tests to cover my changes.
  • (NA, no API change) I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • reno release note

Summary

Use a more efficient algorithm for QuantumState.sample_counts.
Fixes #8535.

Details and comments

In #8535 a solution using numpy.random.multinomial was proposed. This PR implements it.
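
A minimal sketch of the idea, not the code in this PR: it assumes the newer `numpy.random.Generator.multinomial` interface rather than `numpy.random.multinomial`, uses a plain `dict` in place of Qiskit's `Counts`, and the function name `sample_counts_multinomial` is hypothetical.

```python
import numpy as np


def sample_counts_multinomial(probs, labels, shots, seed=None):
    """Draw the per-outcome counts for `shots` trials in a single call."""
    rng = np.random.default_rng(seed)
    # One multinomial draw returns all counts at once; no individual
    # shots are generated.
    counts_array = rng.multinomial(shots, probs)
    # Keep only the outcomes that were observed at least once.
    return {
        str(label): int(count)
        for label, count in zip(labels, counts_array)
        if count > 0
    }


probs = np.array([0.5, 0.0, 0.25, 0.25])
labels = np.array(["00", "01", "10", "11"])
counts = sample_counts_multinomial(probs, labels, shots=10_000, seed=1234)
print(counts)
```

The counts always sum to the number of shots, and an outcome with probability zero never appears in the result.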

  • The test has been updated. It now uses $10^7$ shots rather than $2000$, and a tolerance of $0.001$.
    Furthermore, we run the tests for 7 different seeds, rather than just one. The time to run all tests has
    decreased from $>6$ ms to $<6$ ms (on one machine).
  • In the following, I use a plain-Python generator:
        return Counts(
            (labels[i], counts_array[i]) for i in range(len(counts_array)) if counts_array[i] > 0
        )
    Both labels and counts_array are numpy arrays, so it makes sense to instead filter the indices and then build new arrays by
    indexing into the two arrays. The latter is slower if the vector of probabilities is not too long, but it would probably be faster for
    large arrays. I suppose optimizing for the larger arrays is best. We could do both with a length cutoff, but that adds
    complexity, and I think it's premature optimization at this point.
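
The two filtering strategies discussed in the last bullet can be compared side by side. This is a sketch under assumptions: both helper names are hypothetical, plain dicts stand in for `Counts`, and no claim is made here about which variant is faster for a given array length.

```python
import numpy as np


def nonzero_counts_generator(labels, counts_array):
    """Plain-Python loop over every index, as in the snippet above."""
    return {
        str(labels[i]): int(counts_array[i])
        for i in range(len(counts_array))
        if counts_array[i] > 0
    }


def nonzero_counts_indexed(labels, counts_array):
    """Filter the indices once, then index into both arrays."""
    idx = np.flatnonzero(counts_array > 0)
    return dict(zip(labels[idx].tolist(), counts_array[idx].tolist()))


labels = np.array(["00", "01", "10", "11"])
counts_array = np.array([3, 0, 5, 0])
print(nonzero_counts_generator(labels, counts_array))
print(nonzero_counts_indexed(labels, counts_array))
```

Both functions return the same mapping; they differ only in whether the filtering happens in a Python loop or inside numpy.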

Use multinomial with parameter N rather than actually generating N counts.
@jlapeyre jlapeyre requested review from a team and ikkoham as code owners August 15, 2022 20:12
@qiskit-bot

This comment was marked as duplicate.

@jlapeyre jlapeyre changed the title from "[WIP] Use more efficient method for sample_counts" to "Use more efficient method for sample_counts" Aug 15, 2022
Sampling is now faster. We run the test with more seeds. We still
save about 1ms in test time. Furthermore, we increase the number
of samples greatly.
@coveralls

coveralls commented Aug 15, 2022

Pull Request Test Coverage Report for Build 2864795941

  • 4 of 4 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.003%) to 84.053%

Totals Coverage Status
Change from base Build 2862656180: 0.003%
Covered Lines: 56318
Relevant Lines: 67003

💛 - Coveralls

@jlapeyre
Contributor Author

For more guidance on how to evaluate this, the introduction of the Wikipedia page for the multinomial distribution says:

The Bernoulli distribution models the outcome of a single Bernoulli trial. In other words, it models whether flipping a (possibly biased) coin one time will result in either a success (obtaining a head) or failure (obtaining a tail). The binomial distribution generalizes this to the number of heads from performing n independent flips (Bernoulli trials) of the same coin. The multinomial distribution models the outcome of n experiments, where the outcome of each trial has a categorical distribution, such as rolling a k-sided die n times.

Note that distributions over basis states are "categorical distribution"s.

This is the same thing I said in #8535. But, here it is stated succinctly with the full authority of Wikipedia(!)

@jlapeyre
Contributor Author

Another thing: We should check other places in Qiskit (maybe Aer?) where we are effectively sampling from a multinomial distribution. For example, noise modeled by perturbing the circuit randomly with each shot could not be handled as in this PR.

@jlapeyre jlapeyre changed the title from "Use more efficient method for sample_counts" to "Improve speed of sample_counts from O(N) to O(1) Use more efficient method for sample_counts" Aug 18, 2022
@jlapeyre jlapeyre changed the title from "Improve speed of sample_counts from O(N) to O(1) Use more efficient method for sample_counts" to "Improve speed of sample_counts from $O(N)$ to $O(1)$" Aug 18, 2022
@jlapeyre jlapeyre changed the title from "Improve speed of sample_counts from $O(N)$ to $O(1)$" to "Improve speed of sample_counts from O(N) to O(1)" Aug 18, 2022
@yaelbh
Contributor

yaelbh commented Aug 18, 2022

We've had extensive discussions in the past about it in Aer, in both state vector and MPS simulators. Maybe @chriseclectic or Merav remember the conclusions. This came up also recently with the mock backends of qiskit-experiments (Itamar is the contact point).

I'll try to recall the discussion myself and find relevant issues and pull requests. For now I only remember that multinomial was not so magical, but I'll have to recall why. In the case of Aer, part of the story was that Aer is already written in C++, so the question became a comparison between different algorithms, without the aspect of using numpy to speed up a program.

@yaelbh
Contributor

yaelbh commented Aug 18, 2022

I'm sorry that I can't help much more than referring to a search of the word "multinomial" in the Aer repository... https://github.com/Qiskit/qiskit-aer/issues?q=multinomial
I doubt if it's of much help, but check out at least Qiskit/qiskit-aer#831, which reminds me that @hhorii was very much involved in this, and is probably the first person to talk to.

Similarly, in qiskit-experiments: https://github.com/Qiskit/qiskit-experiments/issues?q=multinomial

@merav-aharoni

Referring to @yaelbh's comment above, there was some work in MPS to determine the fastest algorithm for sample_measure. We discussed three algorithms, two of which are specific to MPS, so they are not relevant here. The only relevant one is Algorithm 1 in Qiskit/qiskit-aer#1377 (comment), where we create the accumulated probability vector, generate all the random numbers once, sort the random numbers, and then move up in the probability vector, generating a count for every probability hit.
Since Aer is in C++, it might be worth understanding what is done in numpy.random.multinomial, and implementing it in Aer.
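
The steps described for Algorithm 1 can be sketched as follows. This is an illustrative Python version written from the description above, not the Aer implementation; the function name `counts_by_sorted_uniforms` is hypothetical.

```python
import numpy as np


def counts_by_sorted_uniforms(probs, shots, seed=None):
    """Count outcomes by walking sorted uniform draws up the cumulative vector."""
    rng = np.random.default_rng(seed)
    cumulative = np.cumsum(probs)          # accumulated probability vector
    draws = np.sort(rng.random(shots))     # all random numbers, generated once and sorted
    counts = np.zeros(len(cumulative), dtype=np.int64)
    outcome = 0
    last = len(cumulative) - 1
    for r in draws:
        # Because the draws are sorted, the outcome index only ever moves up.
        # The `outcome < last` guard protects against a rounded total below 1.
        while outcome < last and r >= cumulative[outcome]:
            outcome += 1
        counts[outcome] += 1
    return counts


probs = np.array([0.5, 0.0, 0.25, 0.25])
counts = counts_by_sorted_uniforms(probs, shots=1000, seed=7)
print(counts)
```

Sorting lets a single pass over the cumulative vector serve every draw, at the price of an $O(N \log N)$ sort in the number of shots $N$.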

@jlapeyre
Contributor Author

Thanks @yaelbh and @merav-aharoni for pointing me to those issues.

After reading these, it's clear that this PR needs a bit more work.

To be clear, there are two related tasks.

  • The sample task. Draw and return a list of $N$ samples from the categorical distribution with $n$ probabilities $\mathbf{p}=(p_1,\ldots,p_n)$.

  • The counts task. Draw $N$ samples from $\mathbf{p}$. Then make a count map. That is, return a list $(c_1,\ldots, c_n)$ where $c_i$ is the number of samples equal to category $i$.

You can perform the counts task by actually generating $N$ samples and binning them. How best to sample depends on the problem parameters, that is $N$, $n$, etc., and also on the programming language.
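
A sketch of this sample-then-bin route, assuming numpy's `Generator.choice` for the sampling step and `numpy.bincount` for the binning; the function name `counts_by_binning` is hypothetical, and categories are indexed from 0 rather than 1.

```python
import numpy as np


def counts_by_binning(probs, shots, seed=None):
    """Draw `shots` category indices, then count how often each occurs."""
    rng = np.random.default_rng(seed)
    # The sample task: N draws from the categorical distribution.
    samples = rng.choice(len(probs), size=shots, p=probs)
    # The counts task: bin the samples; minlength keeps unobserved categories.
    return np.bincount(samples, minlength=len(probs))


probs = [0.5, 0.0, 0.25, 0.25]
counts = counts_by_binning(probs, shots=2000, seed=11)
print(counts)
```

The work here grows with the number of shots, since every sample is materialized before it is counted.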

You can also perform the counts task by generating the counts directly from the multinomial distribution, without drawing $N$ samples. This is what is done in numpy.random.multinomial. But, again, whether this is best depends on details of the task.

You can also perform the sample task by first doing the counts task using numpy.random.multinomial and then generating the samples in accordance with the results. That is, after generating the counts, make a list of length $N$ whose first $c_1$ elements are $1$, the next $c_2$ elements are $2$, etc. You could also randomly permute the results. I had not thought of this before reading the investigation that @lbishop related. It seems that for some parameter regimes, it outperforms naive sampling.
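
The counts-first construction can be sketched as below. The function name `samples_from_counts` is hypothetical, categories are indexed from 0 rather than 1, and `Generator.multinomial` is assumed in place of `numpy.random.multinomial`.

```python
import numpy as np


def samples_from_counts(probs, shots, seed=None, shuffle=True):
    """Produce `shots` samples by expanding a single multinomial draw."""
    rng = np.random.default_rng(seed)
    counts = rng.multinomial(shots, probs)
    # Category i is repeated counts[i] times, giving a sorted sample list.
    samples = np.repeat(np.arange(len(probs)), counts)
    if shuffle:
        # Optional random permutation so the order looks like independent draws.
        rng.shuffle(samples)
    return samples


probs = [0.5, 0.25, 0.25]
ordered = samples_from_counts(probs, shots=12, seed=3, shuffle=False)
shuffled = samples_from_counts(probs, shots=12, seed=3, shuffle=True)
print(ordered)
print(shuffled)
```

With the same seed, both calls draw the same counts; only the order of the returned samples differs.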

@jlapeyre jlapeyre changed the title from "Improve speed of sample_counts from O(N) to O(1)" to "[WIP] Improve speed of sample_counts from O(N) to O(1)" Aug 19, 2022
@ikkoham
Contributor

ikkoham commented Aug 24, 2022

LGTM, but my only concern is it breaks the API, that is it returns different counts even if the seed is the same.

@jlapeyre
Contributor Author

jlapeyre commented Aug 24, 2022

Thanks for looking at this @ikkoham !

LGTM, but my only concern is it breaks the API, that is it returns different counts even if the seed is the same.

Oh, yes, this needs to be fixed.
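
To illustrate the concern, here is a small sketch assuming the earlier behavior amounts to drawing individual samples and binning them. Each method is reproducible for a fixed seed, but the two consume the random stream differently, so their counts are not expected to agree in general.

```python
import numpy as np

probs = [0.5, 0.25, 0.25]
shots = 1000
seed = 42


def old_style(seed):
    # Draw every shot, then bin (assumed stand-in for the previous method).
    rng = np.random.default_rng(seed)
    return np.bincount(rng.choice(len(probs), size=shots, p=probs), minlength=len(probs))


def new_style(seed):
    # Draw the counts directly from the multinomial distribution.
    rng = np.random.default_rng(seed)
    return rng.multinomial(shots, probs)


old_a, old_b = old_style(seed), old_style(seed)
new_a, new_b = new_style(seed), new_style(seed)
print("sample and bin:", old_a)
print("multinomial:   ", new_a)
```

Any caller that relies on a fixed seed to reproduce earlier counts would therefore see different values after the switch, even though both results follow the same distribution.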

However, I realized after reading comments from @yaelbh and @merav-aharoni that the method in this PR is in practice not always better. For example, if I have a vector of $10^6$ probabilities and I ask for one sample, using the previous implementation may be faster.

EDIT: A plot of these experiments is given in #8618
I am running experiments now, to present probably as an issue rather than a PR. I will close this PR in favor of the other issue when it is ready.

EDIT: I noticed that sampling from multinomial is done in #8137. That could possibly also be done conditionally. Although, I suppose the safest thing is to use multinomial.
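
One way the conditional choice could look is sketched below. Everything here is an assumption for illustration: the function name, the `ratio_cutoff` parameter, and its default value are hypothetical placeholders, not measured thresholds and not taken from #8137 or #8618.

```python
import numpy as np


def sample_counts_conditional(probs, shots, seed=None, ratio_cutoff=1.0):
    """Pick a counting strategy from the ratio of shots to outcomes."""
    rng = np.random.default_rng(seed)
    probs = np.asarray(probs, dtype=float)
    if shots < ratio_cutoff * len(probs):
        # Few shots relative to the number of outcomes: sample and bin.
        samples = rng.choice(len(probs), size=shots, p=probs)
        return np.bincount(samples, minlength=len(probs))
    # Many shots: draw the counts directly.
    return rng.multinomial(shots, probs)


probs = np.full(8, 0.125)
few = sample_counts_conditional(probs, shots=1, seed=0)
many = sample_counts_conditional(probs, shots=100, seed=0)
print(few, many)
```

A real cutoff would have to come from benchmarks, and switching methods at a threshold also changes which counts a given seed produces on either side of it.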

@jlapeyre jlapeyre marked this pull request as draft August 24, 2022 18:25
@jlapeyre
Contributor Author

With the move to Rust, this is obsolete.

@jlapeyre jlapeyre closed this Sep 13, 2024
Successfully merging this pull request may close these issues.

Make QuantumState.sample_counts faster, O(1) rather than O(N)
6 participants