
Better batch size default setting for QuantumKernel #270

Closed
cnktysz opened this issue Dec 8, 2021 · 12 comments
Labels
on hold 🛑 Can not fix yet · priority: low · QAMP 🎓 Qiskit Advocate Mentorship Program · type: enhancement ✨ Features or aspects to improve

Comments

@cnktysz
Contributor

cnktysz commented Dec 8, 2021

What is the expected enhancement?

The default batch size of QuantumKernel (currently 900) can create unnecessary jobs when the user is not careful. The main reason is that IBMQ backends have different maximum circuits-per-job limits (e.g. 100, 300). To avoid this issue, setting the default value to 0 and disabling batching would be better in the long run; advanced users would still have access to the feature.

To recreate the problem:
Let's say we use the values batch_size=400 and max_circuits=300 (defined by an IBMQ backend).
With the batching system: 1500 circuits → 400+400+400+300 = 7 jobs
Without the batching system: 1500 circuits → 300+300+300+300+300 = 5 jobs
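For concreteness, here is a minimal sketch of the job-count arithmetic above. `jobs_for` is a hypothetical helper written for this issue, not QuantumKernel API; it assumes the backend splits each kernel batch into jobs of at most `max_circuits` circuits.

```python
import math

def jobs_for(total_circuits: int, batch_size: int, max_circuits: int) -> int:
    """Count backend jobs when the kernel first chunks circuits into
    `batch_size` batches and the backend then splits each batch into
    jobs of at most `max_circuits` circuits."""
    if batch_size <= 0:  # batching disabled: the backend splits the whole set
        return math.ceil(total_circuits / max_circuits)
    full, remainder = divmod(total_circuits, batch_size)
    jobs = full * math.ceil(batch_size / max_circuits)
    if remainder:
        jobs += math.ceil(remainder / max_circuits)
    return jobs

print(jobs_for(1500, batch_size=400, max_circuits=300))  # 7
print(jobs_for(1500, batch_size=0, max_circuits=300))    # 5
```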

The fix is currently in progress within the QAMP project (Analyze and improve performance of Qiskit Machine Learning #14)

@cnktysz cnktysz added the type: enhancement ✨ Features or aspects to improve label Dec 8, 2021
@woodsp-ibm
Member

woodsp-ibm commented Dec 8, 2021

With the batching system: 1500 circuits → 400+400+400+300 = 7 jobs

7, or 4? I guess I do not fully follow your example.

The way things were done, the algorithm was in general independent of any backend, with the quantum instance responsible for any breakdown of the circuits into jobs based on backend limitations.

My recollection is that the problem in this instance is that with larger kernels the sheer number of circuits generated could cause out-of-memory errors. Hence a local limit (batch size) was chosen to avoid this, but this introduced an interplay between the chunks generated by the kernel and the breakdown needed by the backend, so there may be inefficiencies if the user simply stays with the defaults. The default value was 1000 in the past, but it was dropped to 900 because it was indeed inefficient for the limits known at the time: for each batch of 1000 it would do 2 jobs, one of 900 circuits and one of 100!

@cnktysz
Contributor Author

cnktysz commented Dec 8, 2021

It is 7: (300+100)*3 + 300. The IBMQ backend creates jobs that fit within max_circuits, which is why 2 jobs are created when you pass 400.

I think I didn't choose the best numbers for the example, but it is possible to get even more jobs with certain settings. Unwanted jobs occur whenever max_circuits is not a divisor of the batch_size value.

We (me and @adekusar-drl) thought turning it off would be the easiest fix.

@woodsp-ibm
Member

Ok, it was not clear to me that 400 was the batch size. 1500 circuits with a batch size of 400 and 300 max circuits on the backend would indeed create an inefficiency.

As to turning it off by default as the easiest fix, I think you need to investigate the behavior with larger kernels and ensure we don't run out of memory. Maybe the default could be informed via the quantum instance - though of course for local simulators we still need to take care, as there is no limit, if I recall correctly.

@cnktysz
Contributor Author

cnktysz commented Dec 8, 2021

Yes, you are right. The plan is to create a quantum instance setup that lets us test that we don't run out of memory and that we get the expected behaviour.

@woodsp-ibm
Member

I found the original issue from before we had any limit on the kernel: qiskit-community/qiskit-aqua#208. (Of course things have changed a bit since back then, but I think memory consumption is still an issue that needs care taken.)

@woodsp-ibm
Member

The plan is to create a quantum instance that can test if we don't run out of memory

I am not sure how doable that is - the Python process will often simply abend if it's out of memory.

@woodsp-ibm
Member

FYI, the change from 1000, which was set back then, to 900 for a better match/efficiency with backends is relatively recent: #150.

@adekusar-drl
Collaborator

@woodsp-ibm The idea here is to align batches in QK with batches on hardware as much as possible. In some cases this may lead to a reduced number of jobs submitted, and thus less time to train a model. There's no intention to remove the currently available batching features of QuantumInstance, but rather to add a new one. For instance, if we pass a zero batch size, then we fully rely on batching at the hardware level and don't optimize batches in any way. This behavior should lead to the minimal number of jobs required to execute all circuits.
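A rough sketch of what the proposed zero-batch-size semantics could look like; `evaluate_batches` is a hypothetical helper written for this discussion, not the actual kernel code, and it assumes circuit execution goes through `QuantumInstance.execute`:

```python
def evaluate_batches(circuits, quantum_instance, batch_size=0):
    """Execute kernel circuits, optionally pre-chunked into local batches."""
    if batch_size > 0:
        # current behavior: pre-chunk locally to bound memory use
        chunks = [circuits[i:i + batch_size]
                  for i in range(0, len(circuits), batch_size)]
    else:
        # proposed default: a single chunk; the QuantumInstance then
        # fragments it into backend-sized jobs (at most max_circuits each)
        chunks = [circuits]
    return [quantum_instance.execute(chunk) for chunk in chunks]
```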

@woodsp-ibm
Member

Aligning the batch size with the remote entity, based on some remote circuit-count limit per job, seems fine. Some auto mode - that is what you want some magic value to do, it seems - is better than having the user know about these things, I agree. I am just saying that when auto-configured with a local simulator, since there will be no job limit, the batch size still needs to ensure that too many circuits are not built, which would run out of memory. That was the only reason a limit was added there in the past: we ran out of memory before the QI could fragment the circuits into limit-sized chunks.
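Something along these lines could cover that auto mode. This is a sketch under the assumption that legacy backends expose an optional `max_experiments` attribute on their configuration; `LOCAL_CAP` and `auto_batch_size` are illustrative names, not anything in Qiskit:

```python
LOCAL_CAP = 900  # illustrative guard against building too many circuits in memory

def auto_batch_size(backend) -> int:
    """Pick a kernel batch size aligned with the backend's per-job limit."""
    config = backend.configuration()
    max_experiments = getattr(config, "max_experiments", None)
    if max_experiments:
        # align kernel batches with the backend's circuits-per-job limit
        return max_experiments
    # no job limit (e.g. a local simulator): keep a cap purely for memory
    return LOCAL_CAP
```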

@manoelmarques manoelmarques added the QAMP 🎓 Qiskit Advocate Mentorship Program label Apr 14, 2022
@adekusar-drl adekusar-drl added the on hold 🛑 Can not fix yet label Nov 4, 2022
@GayatriVadaparty

GayatriVadaparty commented Mar 11, 2023

Maybe I am late to the discussion, but I can add something to it.
I believe the optimal batch size for a quantum kernel can vary, just as with classical kernels. It may also depend on the complexity of the quantum circuit, the size of the dataset, the number of qubits, and the available computational resources. However, as you have mentioned, as a general rule of thumb, a batch size of 300 or 400 is considered large for most quantum machine learning problems, especially if the quantum circuit is complex or the dataset is large.
For some simpler quantum machine learning problems or smaller datasets, a batch size of 300 or 400 may be appropriate.
I have tried adjusting hyperparameters, which actually improved accuracy for my machine learning models.

@woodsp-ibm
Member

Batching here was all about optimal job creation with respect to the remote backend, which had a limit on the number of circuits per job. Each job went into the queue for the device, which is shared. This was primarily about optimizing jobs, since the QuantumInstance, via which circuit execution took place, would split any request into multiple jobs to ensure the backend limit was met (if not, it raised an error about exceeding the limit, so it had to be enforced). If you re-read the discussion you will see it is all in this context.
Algorithms like the newer quantum kernel here are all based on primitives, and things are quite different since this issue was created. I see this was put on hold - judging by the date, this was at the time of introducing the new primitive-based kernels and deprecating the kernel that used QuantumInstance. Probably this issue should simply be closed now: while the kernel it applied to still exists, it is deprecated and will be removed in a future release, in favor of the new kernels.

@adekusar-drl
Collaborator

Closing the issue as QuantumKernel has been deprecated and removed. Batching is implemented at the primitives level.
