Other GPU utilization measures #181

lars-t-hansen · 2024-08-23T06:32:24Z

It's possible to have 100% GPU utilization as measured by nvidia-smi and still not be doing anything useful, because the available parallelism is not exploited; keeping a single SM busy is enough to report 100%. There is some discussion of that here: https://news.ycombinator.com/item?id=41312335. It might be interesting to investigate whether there are measurements we could extract to highlight this.

(Obviously this is not unique to GPUs, but it's a lot more critical on GPUs given the available parallelism. On CPUs, a somewhat-but-not-really comparable situation is when the program uses serial code to process data and lets the AVX-512 unit sit unused; we're not able to see this either.)
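
To make the gap concrete, here is a toy Python model of the two kinds of measure. It is a sketch under stated assumptions, not code that talks to the GPU: the `Sample` trace format, the function names, and the SM count of 108 are all made up for illustration. The coarse figure follows the nvidia-smi notion of "some kernel was running during the sample period"; the finer figure averages how many SMs actually had work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One sampling interval of a hypothetical trace (not an NVML structure)."""
    busy_sms: int  # number of SMs that had work during this interval


def coarse_utilization(samples: List[Sample]) -> float:
    """Percent of intervals in which at least one SM was busy.

    This mimics the coarse "a kernel was running" style of measure.
    """
    if not samples:
        return 0.0
    active = sum(1 for s in samples if s.busy_sms > 0)
    return 100.0 * active / len(samples)


def sm_activity(samples: List[Sample], total_sms: int) -> float:
    """Percent of (SM, interval) slots that had work."""
    if not samples or total_sms <= 0:
        return 0.0
    busy = sum(s.busy_sms for s in samples)
    return 100.0 * busy / (total_sms * len(samples))


# Assumed SM count, for illustration only.
TOTAL_SMS = 108

# A job that keeps exactly one SM busy in every interval.
trace = [Sample(busy_sms=1) for _ in range(10)]

print(f"coarse utilization: {coarse_utilization(trace):.1f}%")
print(f"SM activity:        {sm_activity(trace, TOTAL_SMS):.2f}%")
```

With this trace the coarse measure reports full utilization while the SM-level figure stays below 1%, which is the kind of discrepancy a second metric would need to expose.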

lars-t-hansen added the enhancement New feature or request label Aug 23, 2024

lars-t-hansen added the question Further information is requested label Nov 13, 2024
