@spraveenio (Contributor) commented Dec 4, 2025

Author - Srikanth (@rsrikanth86)

Cherry-pick of pensando/sw#108321

  • Modify the GPU discovery logic to handle the changes in GPU partitions
  • Make the same changes for the gpuagent mock and for gpuagent with gim
  • Remove dummy parent GPUs from the GPUGet output, since they contain no useful information and can be retrieved using GPUComputePartitionGet
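
The last bullet can be pictured with a minimal sketch. This is not the gpuagent code: the `Gpu` record, its `is_partition_parent` flag, and the `visible_gpus` and `compute_partitions` helpers are all hypothetical names chosen for illustration, and the real discovery logic may identify parent entries differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Gpu:
    """Hypothetical stand-in for one discovered GPU object."""
    uuid: str
    # Assumption: a placeholder ("dummy parent") entry is marked explicitly.
    is_partition_parent: bool = False
    # Assumption: a parent entry knows the UUIDs of its partitions.
    partition_ids: List[str] = field(default_factory=list)


def visible_gpus(discovered: List[Gpu]) -> List[Gpu]:
    """Return the GPUs a GPUGet-style listing would show, skipping parents."""
    return [g for g in discovered if not g.is_partition_parent]


def compute_partitions(discovered: List[Gpu]) -> Dict[str, List[str]]:
    """Return parent UUID -> partition UUIDs, as a partition-style query would."""
    return {
        g.uuid: list(g.partition_ids)
        for g in discovered
        if g.is_partition_parent
    }
```

Under these assumptions the parent entry is not lost: it disappears from the flat GPU listing but remains reachable through the partition view.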

UT on MI300X:

root@33f55f1c4fc7:/# gpuctl show gpu compute-partition
------------------------------------------------------------------------------------------------
PhysicalGPU                             PartitionType   GPUPartitions
------------------------------------------------------------------------------------------------
66366662-0000-0010-0065-000000000000    CPX             f6ff74a1-0000-1000-80fb-627cf64d0590
                                                        4bff74a1-0000-1000-802d-b1d8e5ff3b57
                                                        5fff74a1-0000-1000-800a-018e4f389309
                                                        43ff74a1-0000-1000-80f4-99ed40112929
                                                        6aff74a1-0000-1000-80a7-4b1f085b433c
                                                        11ff74a1-0000-1000-8038-64c968ee50a4
                                                        a0ff74a1-0000-1000-80a7-87753a3b1d9b
                                                        79ff74a1-0000-1000-80f7-41c95aac2c0a

--- snipped -----

$ time gpuctl show gpu all
  DRM card id                            : 50
  Virtualization mode                    : none
  GPU handle                             : 0x34742720
  Card series                            : Aqua Vanjaram [Instinct MI300X]
  Card vendor                            : Advanced Micro Devices, Inc. [AMD/ATI]
  Card SKU                               : M3000100
  Driver version                         : 6.16.6
  VBIOS part number                      : 113-M3000100-102
  VBIOS version                          : 022.040.003.042.000001
  Partition Id                           : 1

------------------------------------------------------------------------------------------

No. of gpus : 63

real    0m8.177s
user    0m0.011s
sys     0m0.027s
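
For anyone scripting against this output, the compute-partition table above can be grouped per physical GPU with a small parser. This is a sketch that assumes the three-column layout shown in the test output stays as it is (a row with a physical GPU UUID, a partition type, and a first partition UUID, followed by rows holding one partition UUID each); `parse_compute_partitions` and `SAMPLE` are illustrative names, not part of gpuctl.

```python
import re

# Matches UUIDs in the 8-4-4-4-12 lowercase hex form used in the table.
UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")

# A shortened copy of the table from the test output above.
SAMPLE = """\
PhysicalGPU                             PartitionType   GPUPartitions
------------------------------------------------------------------------
66366662-0000-0010-0065-000000000000    CPX             f6ff74a1-0000-1000-80fb-627cf64d0590
                                                        4bff74a1-0000-1000-802d-b1d8e5ff3b57
                                                        5fff74a1-0000-1000-800a-018e4f389309
"""


def parse_compute_partitions(text):
    """Group partition UUIDs under the physical GPU row that precedes them."""
    result = {}
    current = None
    for line in text.splitlines():
        tokens = line.split()
        is_first_row = (
            len(tokens) == 3
            and UUID_RE.match(tokens[0])
            and UUID_RE.match(tokens[2])
        )
        if is_first_row:
            # New physical GPU: remember it and record its first partition.
            current = tokens[0]
            result[current] = {"type": tokens[1], "partitions": [tokens[2]]}
        elif len(tokens) == 1 and UUID_RE.match(tokens[0]) and current:
            # Continuation row: one more partition of the current GPU.
            result[current]["partitions"].append(tokens[0])
        # Headers, separators, prompts, and blank lines are ignored.
    return result
```

Calling `parse_compute_partitions(SAMPLE)` yields one entry keyed by the physical GPU UUID, with its partition type and the list of partition UUIDs in table order.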

@spraveenio spraveenio requested review from rsrikanth86 and sarat-k and removed request for sarat-k December 4, 2025 01:35
@spraveenio spraveenio changed the title from "cherry pick from pensando/sw#108321" to "update 7.1.1 gpu partioning logic" Dec 4, 2025
@spraveenio spraveenio force-pushed the bugfix/711partitionupdate branch from 9cf008d to 00661f7 Compare December 5, 2025 21:55
@sarat-k sarat-k merged commit 4a4ae04 into ROCm:main Dec 5, 2025
4 checks passed
@spraveenio spraveenio deleted the bugfix/711partitionupdate branch December 5, 2025 22:44