Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve mappers, add ModeBasedMapper #1301

Merged

Conversation

grossardt
Copy link
Contributor

Summary

Adds a new class ModeBasedMapper which implements the basic functionality of the mode based mappers (JW, BK, parity, direct mapper) through a Pauli table. Reverts pauli_table to instance method and removes all caching from the parent class. As discussed in #1289.

Performance testing

I have performed the tests as in #545 and #644 with code and more plots available here. The results look like this:
Parity_operator
Number_operators_of_length_15

Essentially, the effect of caching the Pauli table methods seems marginal. As there appear to be some fringe cases, where caching is somewhat useful (e.g. performances almost doubles for the BK and Parity mappers when mapping large numbers of length 10 number operators but not for length 5 or 15) and no real harm is being done, I implement the caching on the level of the individual pauli_table implementations. For this, the instance method pauli_table simply calls the private static method _pauli_table which is cached. This way, one can keep the caching for the existing mappers, but can also implement new mappers that actually use that pauli_table is an instance method.

@coveralls
Copy link

coveralls commented Dec 12, 2023

Pull Request Test Coverage Report for Build 8719469367

Details

  • 74 of 76 (97.37%) changed or added relevant lines in 8 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.03%) to 86.908%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit_nature/second_q/mappers/mode_based_mapper.py 44 46 95.65%
Totals Coverage Status
Change from base Build 8719460949: 0.03%
Covered Lines: 8968
Relevant Lines: 10319

💛 - Coveralls

@woodsp-ibm
Copy link
Member

In looking at the number op timing in your notebook, it looks like #644, in that the timing includes the operator build out. Which is not the case in the parity op one where the operator is built before the call to map() rather than being built as the operator parameter.

I do not know how closely things were looked at before. What might be interesting to know is if you put timing in the code around the pauli table buildout how long does that take in comparison to the overall mapping. I imagine that varies a lot though right since the table is built based on the operator width (register size) but the mapping depends on how many terms the operator has.

#644 seems to talk about auxiliary operators - if these are normally relatively few terms then the overhead of building the table for each may end up being more significant. And I guess the table being cached should save some memory too.

@grossardt
Copy link
Contributor Author

In looking at the number op timing in your notebook, it looks like #644, in that the timing includes the operator build out. Which is not the case in the parity op one where the operator is built before the call to map() rather than being built as the operator parameter.

I do not know how closely things were looked at before. What might be interesting to know is if you put timing in the code around the pauli table buildout how long does that take in comparison to the overall mapping. I imagine that varies a lot though right since the table is built based on the operator width (register size) but the mapping depends on how many terms the operator has.

Yes, right. Based on the result my suspicion is that building the Pauli table is actually not the most expensive part, but it's a good idea to look at this explicitly. I will try to find some time to do that (but possibly only after Christmas).

#644 seems to talk about auxiliary operators - if these are normally relatively few terms then the overhead of building the table for each may end up being more significant. And I guess the table being cached should save some memory too.

That's also a good point. Generally I just took these two examples from the previous issues without much questioning, but I think it would be a good idea to put some thought into coming up with some realistic and meaningful benchmark in some sense.

That being said: as I wrote above, I think there is no harm done by doing the caching (on the level of the individual mapper's pauli_table method).

My guess is that the compose and simplify calls in mode_based_mapping are quite expensive. I see if I can time that as well. Possibly, improving the efficiency of simplify would do more for the efficiency of the mappers as well, as compared to caching the pauli_table.

The main difference regarding the caching is that sparse_pauli_operators is not cached any longer, which probably explains the somewhat better performance of the old version even when compared to the new version with caching. However, the composition of the creation/annihilation operators in sparse_pauli_operators seems to not have a large effect overall.

On a different but related note:

I was wondering what the point is of having these two separate methods, pauli_table and sparse_pauli_operators? As I understand it, mode based mapping is supposed to capture all those cases in which the single particle Fock space operators $a_j$, $a_j^\dagger$ are each mapped to some SparsePauliOp. It is obvious that these Pauli operators are then $(P_j \pm Q_j)/2$ where for the common mappers (JW, Parity, BK) $P_j$ and $Q_j$ are single Pauli string operators. I don't see, however, why ModeBasedMapper shouldn't be more general and accept any such mapping.

Now, pauli_table determines these single Pauli strings $P_j$ and $Q_j$, whereas sparse_pauli_operators takes over the combination to $a_j$ and $a_j^\dagger$, or $(P_j \pm Q_j)/2$, respectively. Does it really make sense to keep these two steps as separate methods? I can't think of use cases that would need $P_j$ and $Q_j$ as intermediate results in this process. If anything, it is a mere convenience to prevent repetition of these ~4-6 lines of code in the implementation for the child classes. On the other hand, combining the two methods into a single one that just returns the sparse_pauli_operators matching the creation/annihilation operators may allow for more efficient implementations that do not need to first build the operators $P_j$ and $Q_j$.

My personal sentiment is that I would expect pauli_table to return what sparse_pauli_operators returns, namely a list of Pauli operators corresponding to all the modes in the system, and it should for every mapper separately implement the most efficient way to obtain this list, with or without the detour via calculating $P_j$ and $Q_j$.

Neither pauli_table nor sparse_pauli_operators is currently called anywhere outside of mode_based_mapping (at least within qiskit-nature), so it would probably still be possible without too much harm to change this, but maybe there is a reason for the splitting that I am missing?

@grossardt
Copy link
Contributor Author

grossardt commented Jan 12, 2024

Happy New Year everyone! To continue where we left off, I changed the timing of the number operator to not include building the operators. That doesn't change much, however, even for the total time. So building the operators seems to take negligible time compared to the mapper, but caching seems to not have much benefit.
In an attempt to come up with some more practical test, I also built versions of the Fermi-Hubbard Hamiltonian with creation/annihilation operators transformed by a random unitary. I built n operators of size n for n = 1...9, and the result again shows only marginal benefit of the caching. (But there is this marginal advantage, so I would leave the caching in as is.)

Fermi-Hubbard_Hamiltonian

For me, the only open question is the "different note" above about whether the pauli_table and sparse_pauli_operators methods should be merged.

mrossinek
mrossinek previously approved these changes Feb 23, 2024
Copy link
Member

@mrossinek mrossinek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor details, but overall this LGTM.


Regarding your question about pauli_table and sparse_pauli_operators: the separation is likely historical. I don't have a strong opinion on whether this is kept as is or not. However, changing it will require proper deprecation.

Maybe the approach suggested in #1340 could inform a potential new implementation.

@@ -143,7 +143,7 @@ def _logarithmic_encoding(
op.chop()
spin_op_encoding.append(op)

return tuple(spin_op_encoding)
return (spin_op_encoding[0], spin_op_encoding[1], spin_op_encoding[2], spin_op_encoding[3])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this is necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this because otherwise mypy fails due to this issue: python/mypy#7509

qiskit_nature/second_q/mappers/mode_based_mapper.py Outdated Show resolved Hide resolved
releasenotes/notes/improve-mappers-b55cb0ca5fd656e4.yaml Outdated Show resolved Hide resolved
robertodr
robertodr previously approved these changes Mar 7, 2024
- |
The class :class:`.second_q.mappers.ModeBasedMapper` has been added to implement mode based
mapping via a Pauli table (previously part of :class:`.second_q.mappers.QubitMapper`).
upgrade:
Copy link
Member

@woodsp-ibm woodsp-ibm Mar 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

upgrade is supposed to be things an end user needs to do to upgrade (alter) their code for the release e.g. things have been removed and users code needs to be adjusted. How much is this really just change and part of the features statement - though honestly this is supposed to be more an end user facing statement around the changes where I am not sure how much, if anything, this internal change affects them or surfaces to them in any way. I imagine code they had before using the mappers works just as it did before.

I see Max's comment saying some methods have been removed - that is a case where the user would have to change things so that makes sense, so that would be upgrade, but that just seems to be the last paragraph there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That explanation makes sense. I have removed the point about changes in caching, and the inheritance which I believe has no end user facing consequences. The removal of public methods as pointed out by @mrossinek should obviously stay. Based on your reasoning, the info that pauli_table is now an instance method instead of a class method should stay, I guess, in case someone is calling it on a class in their code.

@grossardt grossardt requested a review from woodsp-ibm March 11, 2024 11:59
@grossardt grossardt requested a review from robertodr March 11, 2024 11:59
Copy link
Member

@mrossinek mrossinek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience and hard work on this! You substantially improved the design of our mappers with this PR 👍

@mrossinek mrossinek merged commit 295788f into qiskit-community:main Apr 17, 2024
16 checks passed
@grossardt grossardt deleted the fix-improve-mappers-issue1289 branch April 19, 2024 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants