[nvmdev] fix bug in construction of parent PCI device #43

Merged Jul 16, 2024 (2 commits)

cdesiniotis
Contributor

When constructing NvidiaPCIDevice objects for each 'parent' device in the '/sys/class/mdev_bus' directory, use the default PCI devices root '/sys/bus/pci/devices'. All devices in '/sys/class/mdev_bus' will have a corresponding directory at '/sys/bus/pci/devices'.

Starting with bf3f431, construction of the NvidiaPCIDevice object fails when attempting to detect the physfn. When SR-IOV is used, all the VFs show up under '/sys/class/mdev_bus', but the physfn only shows up under '/sys/bus/pci/devices'.

Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
@cdesiniotis cdesiniotis marked this pull request as draft July 15, 2024 19:26
@klueska
Contributor

klueska commented Jul 15, 2024

The commit you linked is a broken link. What was the change that broke things?

@cdesiniotis
Contributor Author

bf3f431

@klueska
Contributor

klueska commented Jul 15, 2024

@PiotrProkop can you take a look at this?

@cdesiniotis
Contributor Author

To provide more context, a call to nvmdev.GetAllParentDevices() results in the following error:

error getting all parent devices: error constructing NVIDIA parent device: failed to construct NVIDIA PCI device: unable to detect physfn for 0000:3b:00.4: unable to read PCI device vendor id for 0000:3b:00.0: open /sys/class/mdev_bus/0000:3b:00.0/vendor: no such file or directory

0000:3b:00.0 is the PF for the VF 0000:3b:00.4. 0000:3b:00.4 and all the other VFs have entries under /sys/class/mdev_bus, but the PF does not, so the vendor lookup for the PF fails when it is rooted there.

Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
@cdesiniotis cdesiniotis marked this pull request as ready for review July 15, 2024 21:36
@PiotrProkop
Contributor

Good catch!
/lgtm

@cdesiniotis cdesiniotis merged commit d3091e7 into NVIDIA:main Jul 16, 2024
4 checks passed