Add dedicated functions for memory marginalization #8051
Conversation
Thank you for opening a new pull request. Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient. While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone. One or more of the following people are requested to review this:
Pull Request Test Coverage Report for Build 2543893476
💛 - Coveralls
qiskit/result/utils.py
Outdated
    to 4 threads.

    Args:
        memory: The input memory list, this is a list of hexadecimal strings to be marginalized
Does this not cover IQ data or is that outside the scope of this PR?
I wasn't factoring that in when I wrote this, but we should try to support that in this function. I can add it to this PR if you have an example for what the input and output look like with IQ data.
Sure, here is what the input looks like for a single three-qubit circuit under meas level 1 and with 5 single shots:
memory=[
# qubit 0 qubit 1 qubit 2
[[-12974255.0, -28106672.0], [ 15848939.0, -53271096.0], [-18731048.0, -56490604.0]], #shot 1
[[-18346508.0, -26587824.0], [-12065728.0, -44948360.0], [14035275.0, -65373000.0]], # shot 2
[[ 12802274.0, -20436864.0], [-15967512.0, -37575556.0], [15201290.0, -65182832.0]], # ...
[[ -9187660.0, -22197716.0], [-17028016.0, -49578552.0], [13526576.0, -61017756.0]],
[[ 7006214.0, -32555228.0], [ 16144743.0, -33563124.0], [-23524160.0, -66919196.0]]
]
You can see something like this by running job_1ts = backend.run(circ, meas_level=1, memory=True, meas_return="single", shots=5). If I want to marginalize over some of the qubits then I need to remove their slots. For instance, keeping qubits 0 and 2 would result in
memory=[
[[-12974255.0, -28106672.0], [-18731048.0, -56490604.0]], #shot 1
[[-18346508.0, -26587824.0], [14035275.0, -65373000.0]], # shot 2
[[ 12802274.0, -20436864.0], [15201290.0, -65182832.0]], # ...
[[ -9187660.0, -22197716.0], [13526576.0, -61017756.0]],
[[ 7006214.0, -32555228.0], [-23524160.0, -66919196.0]]
]
If we are dealing with averaged IQ data then the input memory looks like this (again three qubits, one circuit, five shots, but now they are averaged):
memory=[[-1059254.375, -26266612.0], [-9012669.0, -41877468.0], [6027076.0, -54875060.0]]
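To make the slot selection concrete, here is a minimal NumPy sketch of that marginalization (an illustration only, not the PR's implementation; the (shots, qubits, 2) and (qubits, 2) layouts are taken from the examples above):
import numpy as np

# Single-shot level 1 memory: shape (shots, qubits, 2); first two shots from above.
single_shot = np.array([
    [[-12974255.0, -28106672.0], [15848939.0, -53271096.0], [-18731048.0, -56490604.0]],
    [[-18346508.0, -26587824.0], [-12065728.0, -44948360.0], [14035275.0, -65373000.0]],
])
# Averaged level 1 memory: shape (qubits, 2).
averaged = np.array([[-1059254.375, -26266612.0], [-9012669.0, -41877468.0], [6027076.0, -54875060.0]])

keep = [0, 2]                              # qubit slots to retain
marginal_single = single_shot[:, keep, :]  # drop the other qubits' slots in every shot
marginal_avg = averaged[keep, :]           # same selection on the averaged data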
I added support for this in e709ce8. I still need to add testing to cover all the paths, but let me know if that interface works for you.
I really don't like how it is implemented without explicit typing.
That table is what I based e709ce8 on, but yeah, the lack of explicit typing was annoying and is why I needed an avg_data kwarg to differentiate between single level 1 and avg level 0, because I couldn't figure out a way to reliably detect the difference without an explicit input type. I was trying to make this function work independently of the Results object, which I think is the only place that metadata would be stored.
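As a usage sketch of that interface (hedged: the exact signature, kwarg name, and return types are whatever the PR settles on), the avg_data flag would let the caller disambiguate averaged level 1 memory:
from qiskit.result import marginal_memory

single_iq = [
    [[-12974255.0, -28106672.0], [15848939.0, -53271096.0], [-18731048.0, -56490604.0]],
]
avg_iq = [[-1059254.375, -26266612.0], [-9012669.0, -41877468.0], [6027076.0, -54875060.0]]

# Single-shot level 1 data: the nesting depth makes the format clear.
marginal_memory(single_iq, indices=[0, 2])
# Averaged level 1 data: flagged explicitly, since the shape alone is ambiguous.
marginal_memory(avg_iq, indices=[0, 2], avg_data=True)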
src/results/converters.rs
Outdated
#[inline]
pub fn hex_char_to_bin(c: char) -> &'static str {
    match c {
        '0' => "0000",
        '1' => "0001",
        '2' => "0010",
Can we cover cases where we are, e.g., working with the third level of the transmon, i.e., we have states 0, 1, and 2?
I think this is outside the scope of this PR (though I like the idea). Because the current result object doesn't define a basis in its metadata, i.e. it always assumes binary, it is difficult to select the proper output basis (binary or ternary) from the input hex numbers alone. Having a basis argument in the marginal function seems like overkill to me. But we can implement such ternary memory in the experiment data processor, where we always need marginalization of IQ numbers to run a custom ternary discriminator.
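A small illustration of that ambiguity (not from the PR): the same hex value decodes differently depending on whether the register is assumed to be binary or ternary, so the basis has to come from somewhere other than the data itself.
# 0x5 read as a qubit register vs. as a qutrit register.
value = int("0x5", 16)

binary_outcome = format(value, "b")  # '101'  -> e.g. three qubits

digits = []
v = value
while v:
    digits.append(str(v % 3))
    v //= 3
ternary_outcome = "".join(reversed(digits)) or "0"  # '12' -> e.g. two qutrits

print(binary_outcome, ternary_outcome)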
I think we can save this for a follow-on. Having the marginalization functions work with this would be useful, but there is probably enough in this PR just working with binary to start.
This commit adds dedicated functions for memory marginalization. Previously, the marginal_counts() function had support for marginalizing memory in a Results object, but this can be inefficient, especially if your memory list is outside a Results object. The new functions added in this commit are implemented in Rust and multithreaded. Additionally, the marginal_counts() function is updated to use the same inner Rust functions.
a2a3004 to c4924f3
def marginal_memory(
    memory: List[str],
    indices: Optional[List[int]] = None,
    int_return: bool = False,
This is just a curiosity: is a list of integers more efficient in memory footprint than a binary ndarray? Given that we use memory information to run restless analysis in Qiskit Experiments, it should use a memory-efficient representation to run a parallel experiment on a 100Q+ device.
Do you mean like storing the shot memory as a 2D array where each row has n elements, one for each bit, or something else? The list of ints here will be more memory efficient than that on the Rust side because I'm using a Vec<BigUint> (which is just a Vec of digits internally) and it will not be fixed width for each shot. The Python side I expect would be similar, since the Python integer class is very similar to BigUint (a byte array of digits), although a list isn't necessarily as contiguous as a Vec<T>/ndarray. I think it would be best to test this though to be sure, and settle on a common way to represent large result values in a non-string type.
As an aside, I only used a list here because numpy doesn't have support for arbitrarily large integers (outside of using an object dtype, which ends up just being a pointer to the Python heap, for Python ints) and I was worried about the
Thanks. Sounds like the current implementation is reasonable (I was just worried about storing 2**100 for "10000...0"; in a binary array it's just 100 binary elements).
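For a rough sense of the trade-off being discussed (illustrative only; exact sizes depend on the CPython and NumPy versions):
import sys
import numpy as np

outcome_as_int = 1 << 100                        # the "10000...0" case as an arbitrary-precision int
outcome_as_bits = np.zeros(100, dtype=np.uint8)  # one array entry per bit

print(sys.getsizeof(outcome_as_int))  # a few tens of bytes per shot in CPython
print(outcome_as_bits.nbytes)         # 100 bytes of payload per shot, plus ndarray overhead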
LGTM overall. Will approve once comments are addressed.
src/results/converters.rs
Outdated
#[inline]
pub fn hex_char_to_bin(c: char) -> &'static str {
    match c {
This is fine, but it might be a really fun opportunity to write a constant expression LUT generator function :)
I leveraged lazy_static to generate a static lookup table in 5c81510. I need to benchmark it and look into expanding it to support larger chunks, but this might be a good enough start as it should eliminate most of the runtime branching.
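The idea, expressed as a Python sketch rather than the actual Rust code in src/results/converters.rs, is just a per-character table lookup in place of a match on every hex digit:
# Python sketch of the single-character lookup-table approach.
HEX_TO_BIN = {c: format(int(c, 16), "04b") for c in "0123456789abcdef"}

def hex_to_bin(hex_str: str) -> str:
    """Expand a hex string like '0x1f' into its bit string ('00011111')."""
    return "".join(HEX_TO_BIN[c] for c in hex_str.lower().removeprefix("0x"))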
I tried playing with adding chunking in groups of 4 and it was slower than doing this. I think this comes down to needing to do intermediate allocations in how I could get it to work. At this point I think just doing a single-element lookup table is probably sufficient. If we end up hitting bottlenecks down the road I feel like we can revisit this easily enough, as it's not a big deal to improve the internal implementation later.
For benchmarking I compared the code prior to 5c81510, 5c81510 itself, and my local chunked implementation using:
import time
import random
from qiskit.result import marginal_memory
random.seed(42)
memory = [hex(random.randint(0, 4096)) for _ in range(500000)]
start = time.perf_counter()
res = marginal_memory(memory, indices=[0, 3, 5, 9])
stop = time.perf_counter()
print(stop - start)
The geometric mean of each implementation over 10 trials was:
match: 0.08678476060453334
LUT: 0.08359493472968436
Chunked LUT: 0.10288708564573844
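For completeness, the geometric mean over repeated trials can be computed like this (a sketch of the assumed methodology, not code from the PR):
import math

def geometric_mean(samples):
    # Geometric mean of positive timing samples, e.g. the 10 runtimes per implementation.
    return math.exp(sum(math.log(s) for s in samples) / len(samples))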
Ok(out_mem
    .iter()
    .map(|x| BigUint::parse_bytes(x.as_bytes(), 2).unwrap())
    .collect::<Vec<BigUint>>()
Yay turbo fish! ::<<>>
Co-authored-by: Kevin Hartman <kevin@hart.mn>
In the recently merged Qiskit#8051 we create a lookup table in Rust to speed up the hex->bin conversion used internally as part of the marginal_memory() function. This was previously done using the lazy_static crate, which is used to lazily evaluate dynamic code to create a static at runtime on the first access. The typical use case for this is to create a static Vec or HashMap. However, for the marginal_counts() usage we didn't need to do this because we were creating a fixed size array, so the static can be evaluated at compile time assuming the array is constructed with a const function. This commit removes the lazy_static usage and switches to a true static to further improve the performance of the lookup table by avoiding the construction overhead.
Refactor marginal_memory() hex to bin lookup table to be a true static (#8223)
* Reduce number of empty entries in LUT
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Summary
This commit adds dedicated functions for memory marginalization.
Previously, the marginal_counts() function had support for marginalizing
memory in a Results object, but this can be inefficient, especially if
your memory list is outside a Results object. The new functions added in
this commit are implemented in Rust and multithreaded. Additionally, the
marginal_counts() function is updated to use the same inner Rust
functions.
Details and comments
TODO: