GPU inference not working #949
Hi @charrezde, the call `q_module.forward(images.detach().cpu().numpy(), fhe="disable")` runs the inference in the clear, without FHE. However, GPU acceleration applies to the FHE execution, so you need to run with `fhe="execute"`.
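For context, a minimal sketch of the two call patterns, assuming a small hypothetical model and the `device` argument to `compile_torch_model` available in recent Concrete ML releases:

```python
import torch
from concrete.ml.torch.compile import compile_torch_model

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())  # hypothetical model
images = torch.rand(4, 3, 32, 32)  # hypothetical calibration batch

# Compile for GPU-accelerated FHE execution (assumes the `device` kwarg).
q_module = compile_torch_model(model, images, n_bits=6, device="cuda")

# Inference takes numpy arrays, not torch tensors.
x = images.detach().cpu().numpy()

# Clear (non-FHE) inference: runs the quantized model only, no GPU involved.
q_module.forward(x, fhe="disable")

# FHE inference: this is where the GPU acceleration applies.
q_module.forward(x, fhe="execute")
```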
Hi @jfrery, this does not seem to be working for me: `q_module.forward(images.to("cuda"), fhe="execute")` fails because it converts the inputs to numpy by default, which does not work for CUDA tensors. If I run `q_module.forward(images.detach().numpy(), fhe="execute")` instead, my process gets killed.
Yes, that is expected. We don't handle torch inputs for the inference part, only for the compilation, so you need to convert them to numpy.
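A minimal sketch of the conversion, assuming `images` is a hypothetical CUDA tensor and `q_module` comes from a prior compile:

```python
import torch

images = torch.rand(1, 3, 32, 32, device="cuda")  # hypothetical GPU batch

# Inference works on numpy arrays: detach from the autograd graph,
# copy to host memory, then convert.
x = images.detach().cpu().numpy()
# out = q_module.forward(x, fhe="execute")  # q_module from the compile step
```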
Alright, I would need more information here:
How many images are you trying to pass in the compile? You should probably monitor the GPU memory, as I think this is the bottleneck. Maybe also try with a single image in the compile and a single image in the test; hopefully that fits within the 16 GB. We run ResNet-like models on an H100, and I am not sure you can do that on a T4.
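A sketch of the suggested single-image experiment with basic GPU memory monitoring; the model, shapes, and the `device` kwarg are assumptions:

```python
import torch
from concrete.ml.torch.compile import compile_torch_model

model = torch.nn.Sequential(torch.nn.Conv2d(3, 4, 3), torch.nn.ReLU())  # hypothetical model
one_image = torch.rand(1, 3, 32, 32)  # single calibration image

# Compile with a single image to keep the circuit and memory footprint small.
q_module = compile_torch_model(model, one_image, n_bits=6, device="cuda")

# This only covers torch's own allocations; `nvidia-smi` gives the full picture.
print(f"{torch.cuda.memory_allocated() / 2**30:.2f} GiB allocated by torch")

# Test with a single image as well.
out = q_module.forward(one_image.numpy(), fhe="execute")
```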
Summary
The compilation works with CUDA, but the FHE inference is not possible on CUDA: the input is converted to numpy by default, which only works on the CPU.
Will there be support for using GPUs for inference?
Description
Step by step procedure someone should follow to trigger the bug:
minimal POC to trigger the bug
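A hypothetical minimal POC consistent with the summary above; the model, input shapes, and the `device` compile option are assumptions:

```python
import torch
from concrete.ml.torch.compile import compile_torch_model

model = torch.nn.Sequential(torch.nn.Conv2d(3, 4, 3), torch.nn.ReLU())  # hypothetical model
images = torch.rand(1, 3, 32, 32)

# Compilation with CUDA succeeds (assumes the `device` kwarg).
q_module = compile_torch_model(model, images, n_bits=6, device="cuda")

# Passing a CUDA tensor to forward fails: inputs are converted to numpy
# by default, which only works for CPU tensors.
q_module.forward(images.to("cuda"), fhe="execute")
```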