Add support for node labels to report GPU mode #768

visheshtanksale · 2024-06-14T06:59:47Z

Some GPUs support switching between graphics mode and compute mode.

The mode is switched by a utility called displaymodeselector.

This change identifies the mode of the GPUs and labels the node to help end users schedule workloads.

The assumption is that all the GPUs on the node have the same mode; if that is not the case, the value of the nvidia.com/gpu.mode label is unknown.

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

internal/resource/cuda-device.go

elezar

As a general comment, could we update the commit message to indicate that we are reporting the mode and not specifying it.

Then, should MIG devices have unknown reported? Should these not always be compute? What is the intended use of this label?

internal/resource/device_mock.go

internal/resource/nvml-device.go

elezar · 2024-06-14T13:43:26Z

internal/resource/nvml-device.go

+	return resolvePCIAddressToMode(pciID)
+}
+
+func resolvePCIAddressToMode(addr string) (string, error) {


This seems like a function that we should have in go-nvpci instead of reimplementing it here.

I changed the implementation here to just read the class based on the PCI address. Maybe we should still move the function that gets the class to go-nvml.

The place for this to live would always be go-nvlib. The package at go-nvml is a wrapper for NVML specifically and we don't really add additional functionality there.

internal/resource/sysfs-device.go

tests/expected-output-mig-none.txt

elezar · 2024-06-14T13:47:11Z

docs/gpu-feature-discovery/README.md

@@ -209,6 +209,7 @@ their meaning:
 | nvidia.com/gpu.machine         | String     | Machine type                                 | DGX-1          |
 | nvidia.com/gpu.memory          | Integer    | Memory of the GPU in Mb                      | 2048           |
 | nvidia.com/gpu.product         | String     | Model of the GPU                             | GeForce-GT-710 |
+| nvidia.com/gpu.mode            | String     | Display or Compute Mode of the GPU           | compute        |


Could we provide more information on what this value means?

Agreed. Working on it

We still need to add a reference as to what this means. From the documentation it is also only applicable to specific device types. Do we want to link to the relevant documentation here?

jojimt · 2024-06-14T19:55:54Z

> Then, should MIG devices have unknown reported? Should these not always be compute? What is the intended use of this label?

MIG devices should report compute. The intended use of the label is to let an application select a worker node based on the mode available on the node.

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

visheshtanksale · 2024-06-17T10:38:20Z

> As a general comment, could we update the commit message to indicate that we are reporting the mode and not specifying it.

Fixed this.

> Then, should MIG devices have unknown reported? Should these not always be compute? What is the intended use of this label?

Based on the current list of GPUs that support mode switching, it will always be compute. But I fixed this to pull it from the actual device.

internal/lm/nvml.go

internal/resource/cuda-device.go

internal/resource/sysfs-device.go

internal/resource/types.go

internal/resource/nvml-device.go

internal/lm/nvml.go

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

elezar

Thanks @visheshtanksale.

There are some typos in the function name and I'm also not clear on what the behaviour should be for a MIG device.

internal/lm/nvml.go

internal/resource/cuda-device.go

internal/resource/nvml-device.go

elezar · 2024-06-24T16:13:11Z

internal/resource/nvml-mig-device.go

@@ -132,3 +133,23 @@ func totalMemory(attr map[string]interface{}) (uint64, error) {
 		return 0, fmt.Errorf("unsupported attribute type %v", t)
 	}
 }
+
+func (d nvmlMigDevice) GetPIEClass() (uint32, error) {
+	info, retVal := d.MigDevice.GetPciInfo()


Question: Is PCI info valid for a MIG device? How does the busid differ from that of the parent?

It is the busID of the parent.

elezar · 2024-06-24T16:13:58Z

internal/resource/nvml-mig-device.go

+	if retVal != nvml.SUCCESS {
+		return 0, retVal
+	}
+	var bytes []byte
+	for _, char := range info.BusId {
+		if char == 0 {
+			break
+		}
+		bytes = append(bytes, byte(char))
+	}


Instead of implementing this here, this should be pulled into go-nvlib if it is valid.

Happy to do this in a follow-up though.

We can decide where to put this behavior once we decide what behavior is expected for MIG devices.

Because the PCI class is now hardcoded for MIG devices, this code has been removed.
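
For reference, the loop in the snippet above converts a NUL-terminated, fixed-size character array into a Go string. A sketch of what a shared helper might look like, assuming the bus ID arrives as a `[32]int8`. Both the array type and the name `busIDToString` are assumptions, not part of the go-nvlib API.

```go
package main

import "fmt"

// busIDToString converts a NUL-terminated, fixed-size character array into a
// Go string, stopping at the first zero byte. The [32]int8 type is assumed.
func busIDToString(busID [32]int8) string {
	buf := make([]byte, 0, len(busID))
	for _, c := range busID {
		if c == 0 {
			break
		}
		buf = append(buf, byte(c))
	}
	return string(buf)
}

func main() {
	// Build a sample array the way a C API would fill it.
	var raw [32]int8
	for i, c := range []byte("0000:01:00.0") {
		raw[i] = int8(c)
	}
	fmt.Println(busIDToString(raw))
}
```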

internal/lm/nvml.go

elezar · 2024-06-24T16:20:04Z

docs/gpu-feature-discovery/README.md

@@ -209,6 +209,7 @@ their meaning:
 | nvidia.com/gpu.machine         | String     | Machine type                                 | DGX-1          |
 | nvidia.com/gpu.memory          | Integer    | Memory of the GPU in Mb                      | 2048           |
 | nvidia.com/gpu.product         | String     | Model of the GPU                             | GeForce-GT-710 |
+| nvidia.com/gpu.mode            | String     | Display or Compute Mode of the GPU           | compute        |


We still need to add a reference as to what this means. From the documentation it is also only applicable to specific device types. Do we want to link to the relevant documentation here?

elezar · 2024-06-24T16:22:16Z

internal/lm/nvml.go

+	}
+	gpuMode := getModeForClasses(classes)
+	labels := Labels{
+		"nvidia.com/gpu.mode": gpuMode,


Question: Since we also extract this label for a MIG device do we expect different labels per MIG profile?

Devices that support MIG do not support changing the GPU mode. They will always return the compute PCI class, so we can hardcode the class of MIG devices to compute.

Yes, let's rather do that. It would simplify the implementation quite a bit.

Updated the implementation
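
A minimal sketch of the hardcoding agreed on above. The type `migDevice`, the interface `pciClassGetter`, and the constant name are hypothetical. 0x030200 is the standard PCI class code for a 3D controller, which this sketch assumes is the class a GPU reports in compute mode.

```go
package main

import "fmt"

// pci3DControllerClass is the PCI class code for a 3D controller
// (base class 0x03, subclass 0x02). Treating it as "compute" is an assumption.
const pci3DControllerClass uint32 = 0x030200

// pciClassGetter is a hypothetical interface for anything that reports a PCI class.
type pciClassGetter interface {
	GetPCIClass() (uint32, error)
}

// migDevice is a hypothetical stand-in for a MIG device wrapper.
type migDevice struct{}

// GetPCIClass skips any PCI lookup: MIG-capable GPUs cannot switch modes,
// so the compute class is returned unconditionally.
func (migDevice) GetPCIClass() (uint32, error) {
	return pci3DControllerClass, nil
}

func main() {
	var d pciClassGetter = migDevice{}
	class, err := d.GetPCIClass()
	fmt.Printf("class=%#x err=%v\n", class, err)
}
```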

elezar · 2024-06-24T16:23:48Z

internal/lm/nvml.go

+	for _, d := range devices {
+		class, err := d.GetPIEClass()
+		if err != nil {
+			return nil, err


Question: Do we want to treat errors in getting the class as fatal? This will crash GFD and cause NO labels to be generated. Should we rather return unknown as the label in this case?

I wanted to keep this behavior consistent with the other labelers; that is why I am returning the error instead of labeling it unknown.

internal/resource/sysfs-device.go

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

elezar

Thanks @visheshtanksale.

I have some minor comments, but these can be addressed in a follow-up.

elezar · 2024-06-26T09:09:37Z

internal/lm/nvml.go

+	}
+	for _, class := range classes {
+		if class != classes[0] {
+			return "unknown"


Not a blocker for this PR, but we may want to log the content of classes here as a warning.
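
A sketch of the aggregation with the suggested warning added. The constant names and the mapping (VGA controller class 0x030000 to graphics, 3D controller class 0x030200 to compute) are assumptions made for this example, and the standard `log` package stands in for whatever logger the project uses.

```go
package main

import (
	"fmt"
	"log"
)

// Standard PCI class codes; mapping them to GPU modes is an assumption here.
const (
	pciVGAControllerClass uint32 = 0x030000
	pci3DControllerClass  uint32 = 0x030200
)

// getModeForClasses returns one mode for the node, or "unknown" when the
// list is empty, the devices disagree, or the class is not recognized.
func getModeForClasses(classes []uint32) string {
	if len(classes) == 0 {
		return "unknown"
	}
	for _, class := range classes {
		if class != classes[0] {
			// Log the full list so a mixed-mode node can be diagnosed.
			log.Printf("warning: GPUs report differing PCI classes: %#x", classes)
			return "unknown"
		}
	}
	switch classes[0] {
	case pciVGAControllerClass:
		return "graphics"
	case pci3DControllerClass:
		return "compute"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(getModeForClasses([]uint32{0x030200, 0x030200}))
	fmt.Println(getModeForClasses([]uint32{0x030200, 0x030000}))
}
```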

elezar · 2024-06-26T09:10:29Z

internal/lm/nvml_test.go

@@ -204,3 +204,89 @@ func TestSharingLabeler(t *testing.T) {
 		})
 	}
 }
+
+func TestGPUModeLabeler(t *testing.T) {


Thanks for adding the tests!

elezar · 2024-06-26T09:12:02Z

internal/resource/testing/resource-testing.go

@@ -51,6 +51,14 @@ func NewDeviceMock(migEnabled bool) *DeviceMock {
 		IsMigEnabledFunc:     func() (bool, error) { return migEnabled, nil },
 		IsMigCapableFunc:     func() (bool, error) { return migEnabled, nil },
 		GetMigDevicesFunc:    func() ([]resource.Device, error) { return nil, nil },
+		GetPCIClassFunc:      func() (uint32, error) { return 0x030000, nil },


nit: Should this return 0 by default since it's "unknown" / "undefined"?

elezar · 2024-06-26T09:12:39Z

tests/expected-output-mig-single.txt

@@ -30,4 +30,5 @@ nvidia\.com\/gpu\.engines\.jpeg=[0-9]+
 nvidia\.com\/gpu\.engines\.ofa=[0-9]+
 nvidia\.com\/gpu\.slices\.gi=[0-9]+
 nvidia\.com\/gpu\.slices\.ci=[0-9]+
+nvidia\.com\/gpu\.mode=[unknown|compute|graphics]


nit: this can only be compute now, correct? Not a blocker.
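
A side note on the pattern itself: in most regular expression dialects, square brackets form a character class that matches a single character, not one of three words. How the expected-output file is matched is not shown in this thread, so the sketch below assumes Go `regexp` semantics and full-line anchoring (the `^` and `$` are added for this example) to show the difference between the bracketed form and a parenthesized alternation.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Bracketed form as written in the diff, anchored for this example:
	// matches exactly one character from the set.
	charClassPattern = regexp.MustCompile(`^nvidia\.com\/gpu\.mode=[unknown|compute|graphics]$`)
	// Alternation: matches one of the three whole words.
	groupPattern = regexp.MustCompile(`^nvidia\.com\/gpu\.mode=(unknown|compute|graphics)$`)
)

func main() {
	lines := []string{"nvidia.com/gpu.mode=compute", "nvidia.com/gpu.mode=c"}
	for _, line := range lines {
		fmt.Printf("%-30s charClass=%v group=%v\n",
			line, charClassPattern.MatchString(line), groupPattern.MatchString(line))
	}
}
```

If the matcher is unanchored, the bracketed form still accepts the intended values, but only because their first letter happens to be in the set.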

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

visheshtanksale requested review from elezar and jojimt June 14, 2024 06:59

visheshtanksale force-pushed the main branch from e2c0186 to d92b786 Compare June 14, 2024 07:01

Adding GFD node labels to specify GPU mode

b37dd9c

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

visheshtanksale force-pushed the main branch from 08524f3 to b37dd9c Compare June 14, 2024 07:36

elezar reviewed Jun 14, 2024

elezar reviewed Jun 14, 2024

visheshtanksale changed the title ~~Add support for node labels to specify GPU mode~~ Add support for node labels to report GPU mode Jun 14, 2024

Using class of the device to add the GPU mode label

03f72f9

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

elezar requested changes Jun 17, 2024

visheshtanksale added 5 commits June 18, 2024 06:58

Updating GPU mode labeler

ec8a1ba

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

Updating go-nvml version

3991e16

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

Merge branch 'main' into main

e868871

Renaming GetClass method to GetPIEClass

2f2541e

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

Changing signature of GetPIEClass method

4e2818b

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

elezar requested changes Jun 24, 2024

visheshtanksale added 4 commits June 24, 2024 22:38

Adding Unit test and fixing typos

ac2cb43

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

Updating README with details about GPU modes

a2e6a7c

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

Updating the method to get class for MIG devices

82c22b6

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

Merge branch 'main' into main

8f10948

elezar approved these changes Jun 26, 2024

Adding log messages and updating unit test cases

7d0e252

Signed-off-by: Vishesh Tanksale <vtanksale@nvidia.com>

visheshtanksale merged commit 35ad180 into NVIDIA:main Jun 27, 2024
6 checks passed
