
@spraveenio
Contributor

- The violation status in the metrics payload is not populated.
- A new API is now used to get the correct status.

```bash
#amd-smi metric --g 0 -T --json
-- snipped --
"throttle": {
    "accumulation_counter": 117761943,
    "prochot_accumulated": 0,
    "ppt_accumulated": 332554,
    "socket_thermal_accumulated": 0,
    "vr_thermal_accumulated": 0,
    "hbm_thermal_accumulated": 0,
-- snipped --

#gpuctl show gpu -y
-- snipped --
violationstats:
  currentaccumulatedcounter: 117761943
  processorhotresidencyaccumulated: 0
  pptresidencyaccumulated: 332554
  socketthermalresidencyaccumulated: 0
  vrthermalresidencyaccumulated: 0
  hbmthermalresidencyaccumulated: 0
```
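
For reviewers who want to cross-check the two outputs above, here is a minimal Python sketch that compares the `throttle` fields from `amd-smi` with the `violationstats` fields from `gpuctl`. The field pairing is an assumption inferred from the matching names and values in the listings; it is not taken from the changed code. `THROTTLE_TO_VIOLATION` and `find_mismatches` are hypothetical names, and the dictionaries contain only the values shown in this description.

```python
# Pairing of amd-smi "throttle" keys to gpuctl "violationstats" keys.
# ASSUMPTION: inferred from the two listings above, not from the source code.
THROTTLE_TO_VIOLATION = {
    "accumulation_counter": "currentaccumulatedcounter",
    "prochot_accumulated": "processorhotresidencyaccumulated",
    "ppt_accumulated": "pptresidencyaccumulated",
    "socket_thermal_accumulated": "socketthermalresidencyaccumulated",
    "vr_thermal_accumulated": "vrthermalresidencyaccumulated",
    "hbm_thermal_accumulated": "hbmthermalresidencyaccumulated",
}


def find_mismatches(throttle, violation_stats):
    """Return {amd_smi_key: (amd_smi_value, gpuctl_value)} for fields that differ."""
    mismatches = {}
    for smi_key, ctl_key in THROTTLE_TO_VIOLATION.items():
        smi_val = throttle.get(smi_key)
        ctl_val = violation_stats.get(ctl_key)
        if smi_val != ctl_val:
            mismatches[smi_key] = (smi_val, ctl_val)
    return mismatches


# Values copied from the amd-smi listing above.
amd_smi_throttle = {
    "accumulation_counter": 117761943,
    "prochot_accumulated": 0,
    "ppt_accumulated": 332554,
    "socket_thermal_accumulated": 0,
    "vr_thermal_accumulated": 0,
    "hbm_thermal_accumulated": 0,
}

# Values copied from the gpuctl listing above.
gpuctl_violation_stats = {
    "currentaccumulatedcounter": 117761943,
    "processorhotresidencyaccumulated": 0,
    "pptresidencyaccumulated": 332554,
    "socketthermalresidencyaccumulated": 0,
    "vrthermalresidencyaccumulated": 0,
    "hbmthermalresidencyaccumulated": 0,
}

print(find_mismatches(amd_smi_throttle, gpuctl_violation_stats))  # prints {}
```

An empty result means every paired counter agrees. Because the counters accumulate over time, a live comparison would need both commands to be sampled close together; the static values here avoid that issue.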
Contributor

@rsrikanth86 left a comment


lgtm

Collaborator

@sarat-k left a comment


LGTM

@spraveenio merged commit 4cd3409 into ROCm:main Sep 25, 2025
@spraveenio deleted the bugfix/violationstatsapichange branch September 25, 2025 02:50
3 participants