Calculating FLOPs and training memory per GPU #16
Hi, we used fvcore from Facebook Research to automatically estimate FLOPs. For GPU memory usage, we reported the maximum GPU memory occupied by tensors, as given by torch.cuda.max_memory_allocated.
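To make concrete what such a counter reports: for linear layers, fvcore counts one multiply-accumulate per output element. Below is a minimal hand count on a hypothetical toy MLP (the layer sizes and names are illustrative, not from this project), with the actual fvcore/torch calls shown as reference comments since they need a GPU-enabled environment:

```python
# Hand count of multiply-accumulates (what fvcore reports as "FLOPs")
# for the linear layers of a toy MLP. Sizes are hypothetical, chosen
# only to illustrate the counting rule.

def linear_flops(batch_size, in_features, out_features):
    # A Linear layer computes out = x @ W.T + b: each of the
    # batch_size * out_features output elements costs in_features MACs.
    return batch_size * in_features * out_features

# Toy 784 -> 256 -> 10 MLP, batch size 1:
total = linear_flops(1, 784, 256) + linear_flops(1, 256, 10)
print(total)  # 203264

# With a real model and input, the automatic version is (requires
# torch + fvcore, shown for reference only):
#   from fvcore.nn import FlopCountAnalysis
#   flops = FlopCountAnalysis(model, example_input).total()
#   torch.cuda.reset_peak_memory_stats()
#   ... run a training step ...
#   peak_bytes = torch.cuda.max_memory_allocated()
```

Note that many tools (fvcore included) count MACs for matmul-style ops, so the number may be half of what a "2 FLOPs per MAC" convention would give.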
Thanks so much for your response. I tried fvcore but I keep getting this error message: "RuntimeError: Detected that you are using FX to torch.jit.trace a dynamo-optimized function. This is not supported at the moment." Could you please share the details of where you call it?
Just disable the dynamo optimization (torch.compile) before running the FLOP counter.
Hi,
Could you please give some instructions on how you calculated FLOPs for a single forward pass during training, and the training memory per GPU? Thanks.