-
Bug description

The TFLOPS value shown in this code for the H100 differs from NVIDIA's documentation, as illustrated by the following figure.

What operating system are you using? None

Where do you run your code? None

Environment
Replies: 4 comments
-
Thanks for the callout. You are absolutely right that the numbers differ. It's been a while, but according to the link I included there, I took the number for the H100 PCIe version from https://www.techpowerup.com/gpu-specs/h100-pcie-80-gb.c3899, which lists fewer TFLOPS than the H100 SXM and H100 NVL versions you referenced.
-
So which number is more correct? Do I need to update it if I want to calculate MFU?
-
I am not a hardware expert, so take this with a grain of salt, but I think nvidia-smi might tell you which card you have in your machine: the PCIe, NVL, or SXM version. I don't have machines with these cards, so I can't confirm. But yes, as you can see from the table above, even the SXM and NVL versions differ noticeably.
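To make the nvidia-smi suggestion concrete, here is a minimal sketch. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu=name --format=csv,noheader` options. The helper names `query_gpu_names` and `classify_h100` are hypothetical, and the substring rules (for example, treating a name containing "HBM3" as the SXM version) are an untested assumption about how the driver names these cards, so please check the output on your own machine.

```python
import subprocess


def query_gpu_names():
    """Return the GPU names reported by nvidia-smi, one per device.

    Returns an empty list if nvidia-smi is missing or exits with an error.
    """
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def classify_h100(name):
    """Guess the H100 variant from a device name.

    ASSUMPTION: the substring rules below are a heuristic, not an official mapping.
    """
    upper = name.upper()
    if "H100" not in upper:
        return "not an H100"
    if "NVL" in upper:
        return "NVL"
    if "PCIE" in upper:
        return "PCIe"
    if "SXM" in upper or "HBM3" in upper:
        return "SXM"
    return "unknown H100 variant"


if __name__ == "__main__":
    for gpu_name in query_gpu_names():
        print(gpu_name, "->", classify_h100(gpu_name))
```

If the heuristic returns "unknown H100 variant", the raw name printed next to it is still the thing to compare against NVIDIA's spec table.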
-
I did some more research on this, and I believe these are the correct numbers:
*Values marked with an asterisk indicate performance achieved with sparsity.

So to get the correct numbers, I suggest first determining which exact H100 card you are using.

To clean up the issue tracker, I'll be moving this to the Discussion page here on GitHub, but please let me know if you have any follow-up questions or concerns. You raised a good question; it's just a bit tricky to address.
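For the MFU question, here is a rough sketch of how the peak number enters the calculation. It rests on two assumptions that are not taken from the code in this repository: training FLOPs are approximated as 6 × parameters × tokens (a common estimate that ignores attention FLOPs), and a datasheet figure quoted with sparsity is twice the dense figure, so it is halved before use. The function names and all numbers in the usage lines are hypothetical placeholders, not H100 specifications.

```python
def dense_peak_tflops(datasheet_tflops, with_sparsity):
    """Convert a datasheet TFLOPS figure to a dense figure.

    ASSUMPTION: sparsity figures are quoted at 2x the dense rate.
    """
    return datasheet_tflops / 2 if with_sparsity else datasheet_tflops


def mfu(num_params, tokens_per_second, peak_tflops, num_gpus=1):
    """Model FLOPs utilization: achieved FLOPs/s divided by peak FLOPs/s.

    ASSUMPTION: achieved FLOPs/s is approximated as 6 * params * tokens/s.
    """
    achieved_flops_per_second = 6 * num_params * tokens_per_second
    peak_flops_per_second = peak_tflops * 1e12 * num_gpus
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical placeholder values; substitute the figure for your exact card.
peak = dense_peak_tflops(1000.0, with_sparsity=True)  # 500.0 dense TFLOPS
print(f"MFU: {mfu(num_params=1e9, tokens_per_second=20_000, peak_tflops=peak):.1%}")
# prints "MFU: 24.0%"
```

Because the peak value sits in the denominator, using the figure for the wrong H100 variant, or a sparsity figure for a dense training run, shifts the reported MFU by the same ratio.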