Should the keys in result.Counts be ordered ? #4985

jlapeyre · 2020-08-26T16:07:31Z

What is the expected enhancement?

result.Counts would be more convenient if it were a subclass of OrderedDict. The keys are bit strings, which have a natural order, so it would be useful to preserve the order. For example, currently, when displaying the counts interactively, the keys (bit strings) are not ordered.
Is there a reason this choice was made ?
If ordering the counts is not desirable (too expensive or not necessary for most uses, etc.) an alternative is a function that returns the an ordered dict.

The text was updated successfully, but these errors were encountered:

mtreinish · 2020-08-26T16:40:51Z

It was mostly done this way for compatibility with the dict that was there before see #4501 I didn't really think about it except to make the new Counts class subclass the same type as what was there before. That being said I think preserving insertion ordering is fine. But, I don't think we'll need to do anything here. After #4926 merges python will be preserving insertion ordering. In python3.6 that was added because of performance optimizations in that release. but it wasn't explicitly guaranteed yet. In 3.7 it was added as a guarantee moving forward, so we really only need to use OrderedDict while we still support Python3.5.

jlapeyre · 2020-08-26T17:37:31Z

But, I don't think we'll need to do anything here

This makes sense since python dicts are now ordered.

But, in applications I am playing with now, under python 3.8, the strings are not ordered. Since insertion order is preserved, this tells me that the strings are not ordered when the source dict is created. They also don't appear to be inserted in the order in which the results are obtained (which should be random) because many strings with very low counts come before strings with very high counts.

mtreinish · 2020-08-26T17:58:56Z

Well we can look at sorting them before they're added to the Counts object here:

https://github.com/Qiskit/qiskit-terra/blob/master/qiskit/result/result.py#L250

change that to be something like

dict_list.append(Counts(sorted(self.data(key)['counts']), **counts_header))

but before we open a PR doing this we should benchmark it to see what the practical overhead of sorting is. If it's minimal lets go for it, but if it adds a noticeable amount of overhead I'd say lets just add a helper method (or kwargs to existing ones) to counts() to get a sorted iter of the counts dict and leave it up to the user.

jlapeyre · 2020-08-26T19:08:55Z

result = ...
raw_dict = result.data(0)['counts']
counts = result.get_counts()
ms = counts.memory_slots

def makeCounts(d, memory_slots):
    sk = sorted(d)  # returns sorted keys
    sd = {k: d[k] for k in sk}
    return Counts(sd, memory_slots=memory_slots)

Then

In [1]: len(raw_dict)                                                                                                                              
Out[1]: 25

In [2]: %timeit Counts(raw_dict, memory_slots=ms)                                                                                                  
38.1 µs ± 230 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [3]: %timeit makeCounts(raw_dict, memory_slots=ms)                                                                                              
40.4 µs ± 94.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [2]: len(raw_dict)                                                                                                                              
Out[2]: 100

In [3]: %timeit Counts(raw_dict, memory_slots=ms)                                                                                                  
118 µs ± 408 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [4]: %timeit makeCounts(raw_dict, memory_slots=ms)                                                                                              
123 µs ± 422 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

almost all the extra time is due not to sorting, but to building the intermediate dict.
adding a helper method would be the least disruptive in the sense that we are sure it would not affect existing code. But, it adds to the users' cognitive/documentation-digging burden. I can't imagine a situation in which a user would rather have the bitstrings in a random order than in numerical order.

1ucian0 · 2020-10-24T20:25:02Z

I can imagine situation where a user wants the counts ordered by value instead, as she might be interested in the most statistically significant results only...

Avhijit-Nair · 2021-12-16T03:41:45Z

This issue was recently added in the "To Do" list of Backends, Qobj, and Result . Requires more information on the status.

1ucian0 · 2021-12-17T20:59:18Z

I can imagine situation where a user wants the counts ordered by value instead, as she might be interested in the most statistically significant results only...

this use cases is covered with the Count.most_frequent() method.

1ucian0 · 2021-12-17T21:02:21Z

@mtreinish, is the performance analysis done by @jlapeyre enough to make a decision about this issue?

1ucian0 · 2021-12-17T21:08:44Z

To reproduce the issue, notice that the output of the following code result in an unsorted key:

from qiskit import Aer, QuantumCircuit

circuit = QuantumCircuit(2)
circuit.h(0)
circuit.cx(0, 1)
circuit.measure_all()

sim = Aer.get_backend('aer_simulator')

result = sim.run(circuit, seed_simulator=10).result()
counts = result.get_counts()
print(counts)

{'11': 505, '00': 519}

jakelishman · 2023-04-20T00:39:23Z

In the interests of trimming old issues, I'll close this as stale now. There is more than one ordering that might be useful to users, and "arbitrary" is the cheapest to construct, which is convenient for performance-focussed users who want to do their own post-processing. If there's more to discuss, feel free to re-open.

fwiw, sorting this particular object in Python in the order requested is as simple as Counts(sorted(counts.items())), so imo it's something we should reasonably expect a user to do themselves if they want to choose the performance hit.

jlapeyre added the type: enhancement It's working, but needs polishing label Aug 26, 2020

jlapeyre changed the title ~~Should result.Counts be an OrderedDict ?~~ Should the keys in result.Counts be ordered ? Aug 26, 2020

jlapeyre mentioned this issue Nov 4, 2021

[RFC] Sorted histogram data #6684

Closed

1ucian0 added the status: needs information Further information is requested label Dec 17, 2021

jakelishman closed this as completed Apr 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should the keys in result.Counts be ordered ? #4985

Should the keys in result.Counts be ordered ? #4985

jlapeyre commented Aug 26, 2020

mtreinish commented Aug 26, 2020 •

edited

Loading

jlapeyre commented Aug 26, 2020

mtreinish commented Aug 26, 2020

jlapeyre commented Aug 26, 2020

1ucian0 commented Oct 24, 2020

Avhijit-Nair commented Dec 16, 2021

1ucian0 commented Dec 17, 2021

1ucian0 commented Dec 17, 2021

1ucian0 commented Dec 17, 2021

jakelishman commented Apr 20, 2023

Should the keys in result.Counts be ordered ? #4985

Should the keys in result.Counts be ordered ? #4985

Comments

jlapeyre commented Aug 26, 2020

What is the expected enhancement?

mtreinish commented Aug 26, 2020 • edited Loading

jlapeyre commented Aug 26, 2020

mtreinish commented Aug 26, 2020

jlapeyre commented Aug 26, 2020

1ucian0 commented Oct 24, 2020

Avhijit-Nair commented Dec 16, 2021

1ucian0 commented Dec 17, 2021

1ucian0 commented Dec 17, 2021

1ucian0 commented Dec 17, 2021

jakelishman commented Apr 20, 2023

mtreinish commented Aug 26, 2020 •

edited

Loading