Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

qpy roundtrip produces different floating point rounding for parameter values in different locations #10166

Closed
wshanks opened this issue May 25, 2023 · 10 comments
Labels
bug Something isn't working mod: pulse Related to the Pulse module mod: qpy Related to QPY serialization priority: high

Comments

@wshanks
Copy link
Contributor

wshanks commented May 25, 2023

Environment

  • Qiskit Terra version: 0.24.0
  • Python version: 3.10
  • Operating system: Fedora Linux 38

What is happening?

At a high level, some jobs using parameterized pulse gates are failing when using parameters with many digits of precision. The error for the jobs is "unsupported instruction" for the parameterized pulse gate.

The cause of the error is that the pulse gate instruction is becoming decoupled from the circuit calibration after the circuit is submitted to be run, in particular during the qpy serialization / deserialization process.

How can we reproduce the issue?

Here is code to build a problematic circuit:

from qiskit import QuantumCircuit, pulse
from qiskit.circuit import Gate, Parameter
from qiskit.pulse.library import Gaussian


amp = Parameter("amp")

circ = QuantumCircuit(1, 1)
custom_gate = Gate("my_custom_gate", 1, [amp])
circ.append(custom_gate, [0])

with pulse.build() as my_schedule:
    pulse.play(Gaussian(duration=64, amp=amp, sigma=8), pulse.DriveChannel(0))

circ.add_calibration(custom_gate, [0], my_schedule)

circ.assign_parameters([0.3333333333333333], inplace=True)

Here is code to show the discrepancy in parameter value:

from io import BytesIO

from qiskit.qpy import dump, load


buff = BytesIO()

dump(circ, buff)
buff.seek(0)
qpy_circ = load(buff)[0]

instr_param_value = float(qpy_circ.data[0].operation.params[0])
cal_param_value = next(iter(next(iter(qpy_circ.calibrations.values()))))[1][0]
instr_param_value == cal_param_value

What should happen?

The above code should give True for the final expression, but it gives False.

Any suggestions?

The way the current system works is that a circuit instruction can have a name, a tuple of qubits it operates on, and a list of parameters (which can be floats). Then in the calibrations attribute of the circuit there is a nested dictionary structure with keys <instruction_name>:<qubit_tuple>:<parameter_tuple> mapping to schedule definitions. When the backend encounters a custom instruction, it tries to look up a schedule definition in the calibrations attribute using these keys, so there must be an exact match. Floats as dictionary keys are not great, and I could see an argument for a different system, but maybe for now we should just make the current system work.

The cause of the discrepancy is that QuantumCircuit.assign_parameters handles the circuit instruction and calibration substitutions differently. For the circuit instruction, assign_parameters binds the float value into a parameter expression:

https://github.com/Qiskit/qiskit-terra/blob/5013fe2239290414f2cfaafae13c6a9c09ddbbda/qiskit/circuit/quantumcircuit.py#L2836-L2842

as that is what assign() does:

https://github.com/Qiskit/qiskit-terra/blob/5013fe2239290414f2cfaafae13c6a9c09ddbbda/qiskit/circuit/parameterexpression.py#L79-L149

For the calibration parameter, assign_parameters calls _assign_calibration_parameters which substitutes scalar parameters with floats:

https://github.com/Qiskit/qiskit-terra/blob/5013fe2239290414f2cfaafae13c6a9c09ddbbda/qiskit/circuit/quantumcircuit.py#L2896-L2900

The reason this discrepancy matters for high precision floats is that ParameterExpression gets serialized using sympy:

https://github.com/Qiskit/qiskit-terra/blob/5013fe2239290414f2cfaafae13c6a9c09ddbbda/qiskit/qpy/binary_io/value.py#L48-L56

while the float in the calibrations key gets written directly as a float (struct.pack("!d", val)) by write_value():

https://github.com/Qiskit/qiskit-terra/blob/5013fe2239290414f2cfaafae13c6a9c09ddbbda/qiskit/qpy/binary_io/circuits.py#L727

One solution could be for the assignment of a float to a circuit instruction parameter to also directly replace it with the float as happens now with the calibration parameter. Other options could include improving the precision of the parameter so that it deserializes to something that matches the deserialized float or doing a more significant rework of how we keep track of parameterized calibrations.

@wshanks wshanks added bug Something isn't working mod: pulse Related to the Pulse module mod: qpy Related to QPY serialization labels May 25, 2023
@nkanazawa1989
Copy link
Contributor

nkanazawa1989 commented May 26, 2023

Thanks Will for detailed investigation. I also noticed that we don't have any QPY test requiring high-precision floats.
https://github.com/Qiskit/qiskit-terra/blob/main/test/python/qpy/test_block_load_from_qpy.py#L279

I agree adding something like

    try:
        bound_symbol_expr = float(bound_symbol_expr) 
    except TypeError: 
        pass

to the circuit parameter binding is the minimum bugfix. This bug is critical because some jobs (such as calibration job) fail in execution with new IBM Provider.

@wshanks
Copy link
Contributor Author

wshanks commented May 26, 2023

One thing I am not sure about is if there is any benefit to keeping the bound parameter on the circuit instruction as a ParameterExpression bound to a float value versus just replacing it with the float (what you suggest). If not, that seems like the simplest fix, I think. It might be enough to check if the number of free parameters is zero (like the code does for the calibrations key) rather than use try/except.

Otherwise, one thing to note is that the parameter with a bound scalar mechanism has been working with qiskit-ibmq-provider. I did not see any recent changes with git blame for the code sections I referenced above. I think this issue might just be surfacing now due to increased usage of qiskit-ibm-provider (though it is still a little surprising it was not noticed before if it really is not new, so I might be missing something). So one option might be to special case the serialization of ParameterExpressions that contain a single parameter bound to a float -- instead of using srepr from sympy which produces a string containing the float in a symbolic expression, somehow serialize the float directly like plain floats are serialized and the deserialize it back into a ParameterExpression.

(This is expanding on the first two suggestions I had at the bottom of the issue description).

@jakelishman
Copy link
Member

Fwiw, a lot of this is duplicated in #9764, and while "cast fully bound ParameterExpressions to floats" would cause the issue to be sidestepped in this particular instance, it isn't a general solution, because a ParameterExpression can contain floats that will lose accuracy through Symengine's slightly bugged srepr and have unbound parameters.

It's non-trivial to remove the usage of srepr from QPY, because we'd have to replace it with some other way of serialising the Sympy/Symengine expressions in a way that's interoperable between the two libraries. It also doesn't feel great to have special-casing in QPY that introduces casts of one type to another except where completely unavoidable, since it's meant to be as close to a transparent serialisation/deserialisation for built-in Qiskit objects as possible.

If there's a simple catch specifically in the assign_calibrations parts of the parameter assignment that can be done, I think that'll be the least disruptive for a bugfix. I think you'll need to be careful to handle cases where the ParameterExpression is representing a complex as well - that's allowed too (though maybe not used in calibrations).

@wshanks
Copy link
Contributor Author

wshanks commented May 26, 2023

Oops, I would say that is the same issue! (I am a little surprised @nkanazawa1989 didn't point me to that one since he pointed out this context to me 🙂 ). It's a little depressing that it has been known for six months now without a solution (still makes me wonder if something else changed that made it start happening more frequently).

Since it's the same issue, I guess we should close this in favor of #9764? The argument against closing it is that we have some discussion of possible fixes here and all the discussion there is about one path involving symengine that seems to have been abandoned (the reproducer code here is a bit simpler too).

While data not roundtripping unmodified with qpy is an issue, the immediate problem here is that parameter assignment gets handled differently for the instruction parameters and the calibrations parameter keys. If the values were modified but modified in the same way, that would at least avoid the most pressing problem. Another option that could be considered is leaving the calibrations parameter keys as assigned parameters as they are in the instruction parameters rather than replacing them with floats. I don't see an immediate problem with that...we could see if any tests fail.

@jakelishman
Copy link
Member

jakelishman commented May 26, 2023

I don't really remember the context, but at the time I think whatever/whoever was triggering the bug just found a work-around, so it wasn't high priority and neither Naoki nor I had any great ideas for fixing it, so it just got forgotten.

That's a good point about there being a discrepancy - embarrassingly, despite having looked into the issue, I hadn't even thought about it. We could definitely try changing it, but I suspect that calibration lookups will then explode because of #9299. It's not clear to me how we might go about fixing that (though again, it's really something we should). The relevant failure in this case is something like

from qiskit.circuit import ParameterExpression

a = ParameterExpression({}, "0.125")
b = ParameterExpression({}, "1 / 8.")
assert a == b  # success
assert hash(a) == hash(b)  # failure
assert a in {b}  # failure

Another fixing strategy that also seems unpalatable is a breaking API change to ParameterExpression.assign, changing its return value to ParameterExpression | float | complex. Autoconverting to a floating-point type on complete binding is unpleasant because a) it's an API break so not eligible for a patch release even if it fixes the issue and b) the ParameterExpression technically might be representing an infinite precision real number (like sqrt(2)) and not a float. I could easily be sold that we should formalise the semantics of ParameterExpression to "ParameterExpression is a stand-in for float or complex, and fully-bound expressions are not at infinite precision", given that I think people have somewhat assumed that in several places, and it's compatible with what the type was meant to be. I think that the API break in the return value is probably fairly safe - I think all cases where ParameterExpression is a valid input type would also need to handle the case of the input being a float, so having .assign return a float instead hopefully shouldn't break much.

Will, would you be able to drive finding a backport-able solution (if at all possible) and fixing the bug? I know that it's high priority, but I don't have bandwidth to take it on.

@jakelishman
Copy link
Member

Oh, my example about #9299 is actually mistaken here - I didn't construct the ParameterExpressions correctly, which is why the hashes aren't equal. I should have wrapped both the strings in a sympy.sympify/symengine.sympify as appropriate. If I had, then the hashes would actually have been equal, since sympy and symengine aren't broken in that most simple of cases. So maybe that attempt won't immediately explode. I'd worry about all the extant code that already attempts to work around this mistake in the hash-table lookups, though - there's plenty of places in Qiskit that do params = (float(p) for p in params) before using that in the calibration key.

@wshanks
Copy link
Contributor Author

wshanks commented May 27, 2023

I suspect that calibration lookups will then explode because of #9299. It's not clear to me how we might go about fixing that (though again, it's really something we should).

Yes, changing the key in the calibrations dict would mean that you could not look up a calibration with a bare float any more since that would give a different hash. You would need to do something like has_calibration_for to turn a float into a parameter before doing the look up.

Separate from the immediate fix, I wonder if we want to add a get_calibration(gate, qubits, params) method and to discourage accessing the calibrations dict directly to allow more freedom to change how we deal with this in the future.

Will, would you be able to drive finding a backport-able solution (if at all possible) and fixing the bug? I know that it's high priority, but I don't have bandwidth to take it on.

Yes, I can keep trying, but I think we have to come up with something that you don't rule out 😛

Options that are somewhat viable:

  1. Do not replace bound parameters with floats in the calibrations dict parameters key when assigning a value to a parameter with QuantumCircuit.assign_parameter. Doing only this would be an API break like we noted so could not be backported. An option to soften the API break would be to replace the calibrations (sub)dictionary with a MutableMapping that could map floats to bound parameters. This might still be a bit of an API break if we consider it valid for a user to interact with QuantumCircuit.calibrations outside of add_calibration and dictionary lookups (so doing some other direct manipulation of the dictionary).
  2. Special case the writing of parameters bound to floats to serialize and deserialize the float outside of sympy. One problem with this is that it makes the code inelegant. It also might open up other complications with larger parameter expressions that serialize floats differently from when they are serialized in individual parameters (like you noted before). Additionally, this special casing would require an increment of the qpy version and I am not sure if that is allowed in a backport (is it okay for 0.24.1 to generate qpy that 0.24 can not read?).
  3. Modify QuantumCircuit.assign_parameters() so that instruction parameters that are bound to floats get replaced by those floats like they do in the calibrations dictionary keys. This option does not seem so bad to me. I do not have a good sense of how a parameter bound to a float is used in the code, so I am not totally sure. Once bound, a parameter can not be rebound, I think, so a parameter bound to a float is already very similar to a float, I think.

Less desirable options:

  1. Make the calibrations dictionary lookup a fuzzy match. Doing the fuzzy matching opens up difficult edge cases. Also, this option leaves the discrepancy between instruction and calibration parameter values and we know there are downstream consumers (like the IBM provider) that use these values and will still have trouble. They could implement fuzzy matching, but they could do that now with no change to terra.
  2. Change ParameterExpression.assign to replace parameters bound to floats with just the floats. This might be okay, but it is a more impactful version of the option to change QuantumCircuit.assign_parameters above. It's probably best to limit the scope of the change.
  3. Change parameter serialization more fully. One example of this is Native symengine serialization for QPY generation #8232 which was abandoned. This is probably too big a change for a backport.

@yaelbh
Copy link
Contributor

yaelbh commented May 28, 2023

Regarding the old provider versus the new provider: the old provider runs these circuits without crashing, and returns results (probably incorrect results because of this issue's bug - I haven't checked). Whereas the new provider crashes. This explains why the complaints began or became more frequent with the new provider.

Code that demonstrates it:

from qiskit import QuantumCircuit, pulse, transpile, IBMQ
from qiskit.circuit import Gate, Parameter
from qiskit.pulse.library import Gaussian

from qiskit_ibm_provider import IBMProvider


new_provider = IBMProvider()
new_backend = new_provider.get_backend("backend name")

IBMQ.load_account()
old_provider = IBMQ.get_provider(
    hub="hub name", group="group name", project="project name"
)
old_backend = old_provider.backend.same_backend_name


for backend, old_or_new in zip([old_backend, new_backend], ["old, new"]):
    amp = Parameter("amp")

    circ = QuantumCircuit(1, 1)
    custom_gate = Gate("my_custom_gate", 1, [amp])
    circ.append(custom_gate, [0])
    circ.measure(0, 0)

    with pulse.build(backend) as my_schedule:
        pulse.play(Gaussian(duration=64, amp=amp, sigma=8), pulse.drive_channel(0))

    circ.add_calibration(custom_gate, [0], my_schedule)
    circ.assign_parameters([0.3333333333333333], inplace=True)
    circ = transpile(circ, backend)
    
    res = backend.run(circ).result()
    print(old_or_new, " provider:", res.get_counts(0), "\n")

Note that many backends don't work with the old provider anymore. In the case, the result method is stuck (this is a separate bug, not related to parametrized pulses), but you can see in the dashboard that the execution terminated successfully and results have been created (although the "Unknown instruction" alert is still there!).
image

@wshanks
Copy link
Contributor Author

wshanks commented May 28, 2023

probably incorrect results because of this issue's bug

I don't expect the results to be incorrect with the old provider because it serializes directly to json without going through qpy. In that case, the instruction parameter and the calibrations dict key for the parameter should both be cast to strings in the same way.

@nkanazawa1989
Copy link
Contributor

Manually closing this issue because likely github doesn't recognize

Closes #9764 and #10166

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working mod: pulse Related to the Pulse module mod: qpy Related to QPY serialization priority: high
Projects
None yet
Development

No branches or pull requests

4 participants