Wrap all opaque types as interfaces #112
f85ef81 to 7e28d72
elezar left a comment
The initial changes look good. The main question I have is whether top-level library functions that operate on other types make sense, or whether we should always use the methods hanging off the types.
pkg/nvml/api.go
Outdated
	GetVgpuDriverCapabilities(Capability VgpuDriverCapability) (bool, Return)
}

// Define package level methods as aliases to Interface methods of libnvml
Do we have tooling to generate this automatically?
Not at the moment. It was fairly easy to do, though. I just grepped for func (l *library), copied that in, and did some search and replace on it.
The tool has been added in the latest revision.
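For readers following along, here is a minimal sketch of the aliasing pattern under discussion. It is not the generated code from this PR: the `library` fields, the single-method `Interface`, and the device count are simplified assumptions. It only shows how a package-level name can be bound to a method of a default `libnvml` instance.

```go
package main

import "fmt"

// Return is a stand-in for the NVML return code type.
type Return int

// SUCCESS is a stand-in for the success return code.
const SUCCESS Return = 0

// Interface lists the library methods (trimmed to one for this sketch).
type Interface interface {
	DeviceGetCount() (int, Return)
}

// library is a simplified, hypothetical implementation of Interface.
type library struct {
	deviceCount int
}

// DeviceGetCount hangs off the library type.
func (l *library) DeviceGetCount() (int, Return) {
	return l.deviceCount, SUCCESS
}

// libnvml is the default instance that package-level aliases bind to.
var libnvml = &library{deviceCount: 8}

// Compile-time check that library satisfies Interface.
var _ Interface = libnvml

// DeviceGetCount is a package-level alias: a method value bound to libnvml.
var DeviceGetCount = libnvml.DeviceGetCount

func main() {
	n, ret := DeviceGetCount()
	fmt.Println(n, ret == SUCCESS)
}
```

Because each alias is one line derived mechanically from a method signature, generating the block with a tool is straightforward.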
var DeviceGetComputeRunningProcesses = deviceGetComputeRunningProcesses_v1
var DeviceGetGraphicsRunningProcesses = deviceGetGraphicsRunningProcesses_v1
var DeviceGetMPSComputeRunningProcesses = deviceGetMPSComputeRunningProcesses_v1
var deviceGetComputeRunningProcesses = deviceGetComputeRunningProcesses_v1
Maybe a silly question: Can you remind me why we don't have an nvml prefix here?
The nvml "functions" in this block come from #defines in nvml.h. The underlying versioned functions that get assigned to these variables are generated as part of the bindings with the _v1, _v2, _v3, etc. suffixes and then assigned in updateVersionedSymbols below.
The difference with these functions is that we need to wrap the underlying nvml*_v1, nvml*_v2, etc. functions with some extra, version-specific logic. Because of this, we create a level of indirection before assigning them back out to the "unversioned" variable.
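A rough sketch of that indirection, with made-up names: `nvmlGetProcesses_v1`/`_v2`, `ProcessInfo`, and the `hasSymbol` callback are assumptions for illustration only. Only the shape (versioned wrappers assigned to an unversioned variable inside `updateVersionedSymbols`) follows the description above.

```go
package main

import "fmt"

// ProcessInfo is a simplified stand-in for the unversioned result type.
type ProcessInfo struct {
	Pid           uint32
	GpuInstanceId uint32
}

// unknownInstance marks a field that the older version cannot report.
const unknownInstance = ^uint32(0)

// Stand-ins for the generated, versioned bindings (hypothetical names).
func nvmlGetProcesses_v1() []uint32 { return []uint32{100, 200} }
func nvmlGetProcesses_v2() []ProcessInfo {
	return []ProcessInfo{{Pid: 100, GpuInstanceId: 0}, {Pid: 200, GpuInstanceId: 1}}
}

// getProcesses_v1 wraps the v1 binding and fills in what v1 cannot provide.
func getProcesses_v1() []ProcessInfo {
	var out []ProcessInfo
	for _, pid := range nvmlGetProcesses_v1() {
		out = append(out, ProcessInfo{Pid: pid, GpuInstanceId: unknownInstance})
	}
	return out
}

// getProcesses_v2 needs no extra logic in this sketch.
func getProcesses_v2() []ProcessInfo { return nvmlGetProcesses_v2() }

// getProcesses is the "unversioned" variable; it starts at the oldest version.
var getProcesses = getProcesses_v1

// updateVersionedSymbols repoints the unversioned variable when a newer
// symbol is available. hasSymbol stands in for a real symbol lookup.
func updateVersionedSymbols(hasSymbol func(string) bool) {
	if hasSymbol("nvmlGetProcesses_v2") {
		getProcesses = getProcesses_v2
	}
}

func main() {
	fmt.Println(getProcesses())
	updateVersionedSymbols(func(string) bool { return true })
	fmt.Println(getProcesses())
}
```

The lowercase variable keeps the wrappers internal, so callers only ever see the unversioned entry point.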
pkg/nvml/api.go
Outdated
	DeviceGetVgpuSchedulerCapabilities(Device Device) (VgpuSchedulerCapabilities, Return)
	GpuInstanceGetComputeInstancePossiblePlacements(GpuInstance GpuInstance, Info *ComputeInstanceProfileInfo) ([]ComputeInstancePlacement, Return)
	GpuInstanceCreateComputeInstanceWithPlacement(GpuInstance GpuInstance, Info *ComputeInstanceProfileInfo, Placement *ComputeInstancePlacement, ComputeInstance *ComputeInstance) Return
	GpuInstanceCreateComputeInstanceWithPlacement(GpuInstance GpuInstance, Info *ComputeInstanceProfileInfo, Placement *ComputeInstancePlacement) (ComputeInstance, Return)
Is this a bug? Does it make sense to do this in a separate PR so as to not have it go "missing" in the changelog?
Yes, it is a bug. However, if I do it as a separate PR, then I can't make the updates in the commit that follows to turn ComputeInstance into an Interface. I'd rather have all the interface conversion in a single PR. I could look at maybe updating it in a "pre" PR, but I don't know how hard it will be to rebase if I do that.
I'm happy to leave this in this PR as long as we remember to call it out in the release notes once we tag a new version.
I will make sure to call it out in the release notes.
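To make the signature change concrete, here is a small sketch contrasting the two shapes. The function names and the `computeInstance` type are hypothetical and the arguments are dropped for brevity; only the out-parameter versus return-value distinction mirrors the diff above.

```go
package main

import "fmt"

// Return is a stand-in for the NVML return code type.
type Return int

// SUCCESS is a stand-in for the success return code.
const SUCCESS Return = 0

// ComputeInstance is modelled as an interface, as planned in this PR.
type ComputeInstance interface {
	ID() int
}

// computeInstance is a hypothetical concrete implementation.
type computeInstance struct{ id int }

func (c computeInstance) ID() int { return c.id }

// createWithOutParam shows the old shape: the result is written through a
// pointer supplied by the caller.
func createWithOutParam(out *ComputeInstance) Return {
	*out = computeInstance{id: 1}
	return SUCCESS
}

// createAndReturn shows the new shape: the new instance is returned.
func createAndReturn() (ComputeInstance, Return) {
	return computeInstance{id: 1}, SUCCESS
}

func main() {
	var old ComputeInstance
	if ret := createWithOutParam(&old); ret == SUCCESS {
		fmt.Println("out-param:", old.ID())
	}
	if ci, ret := createAndReturn(); ret == SUCCESS {
		fmt.Println("returned:", ci.ID())
	}
}
```

Since the change alters a public signature, callers have to update their call sites, which is why it belongs in the release notes.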
7e28d72 to 81e362c
I'd be fine returning concrete types for everything (including the top level library) and moving the API definitions to the |
No, that's not what I meant -- although it is a valid question. I meant that we have, as an example, three functions that do the same thing: My question was whether we would consider moving towards requiring the use of |
I would be inclined to keep them since that is what matches the actual NVML API. That said, we should add tooling to autogenerate the top level Interface as well as the package level aliases from the functions hanging off the |
8cfa643 to c4bbc89
elezar left a comment
Some minor comments on the generation tooling to start with.
gen/nvml/generateapi.go
Outdated
func getWriter(outputFile string) (io.Writer, func() error, error) {
	if outputFile == "-" {
		return os.Stdout, nil, nil
Instead of nil as the "Closer", can we return:
func() error { return nil }
Then we don't need to check for nil at the close site.
Alternatively, we can return an io.WriteCloser and implement a noopCloser that we combine with os.Stdout:
type noopCloser struct {
	io.Writer
}
func (n *noopCloser) Close() error {
	return nil
}
And then our return statement becomes:
return &noopCloser{Writer: os.Stdout}, nil
(and we can just return os.Create(outputFile) directly in the file case).
I went with the first suggestion
As you know, I am not a fan of "noop code" and prefer to be explicit in the caller (which is why I hesitated to change this at all), but this is minor, so it's fine.
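For reference, a sketch of what the first suggestion could look like once applied. This is an illustration rather than the code that landed: the error message and the `main` function are assumptions.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// getWriter returns a writer for outputFile together with a function that
// closes it. "-" selects stdout, whose close function does nothing.
func getWriter(outputFile string) (io.Writer, func() error, error) {
	if outputFile == "-" {
		// A no-op closer means callers never have to check for nil.
		return os.Stdout, func() error { return nil }, nil
	}
	f, err := os.Create(outputFile)
	if err != nil {
		return nil, nil, fmt.Errorf("creating %q: %w", outputFile, err)
	}
	return f, f.Close, nil
}

func main() {
	w, closeFn, err := getWriter("-")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Safe to call unconditionally, for stdout and for files alike.
	defer closeFn()
	fmt.Fprintln(w, "generated output")
}
```

The trade-off raised above still applies: the no-op closer hides the stdout special case from the caller, where an explicit nil check would keep it visible.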
gen/nvml/generateapi.go
Outdated
		return nil
	})
	if err != nil {
		return nil, fmt.Errorf("walking the embed.FS path: %w\n", err)
Leftover from an old iteration, removing
217cd99 to b02ebca
elezar left a comment
Minor comment about where we put our mock system implementations.
Just a thought: does it make sense to have the "implementations" under something like mock/system or mock/platform?
What value does the extra level of nesting bring? To me it seems better to say mock.DGXA100Server or mock.A100Device over system.A100Device or platform.*.
I'd be happy with a dgxa100 folder though. Then we could change the names to dgxa100.Server and dgxa100.Device, dgxa100.GpuInstance, etc.
I made this change -- let me know what you think
"What value does the extra level of nesting bring?"
It's more about organisation than anything else. The mocks themselves are all generated, whereas the systems are implemented by hand.
b02ebca to 0d26ac8
Signed-off-by: Kevin Klues <kklues@nvidia.com>
As part of this, redefine all package level methods as methods hanging off of the 'library' type and create aliases for all package level methods to those methods hanging off of the default 'libnvml' instance of the 'library' type. Signed-off-by: Kevin Klues <kklues@nvidia.com>
0d26ac8 to 8d282c0
8d282c0 to f343a06
f343a06 to 3937a82
7b57959 to abf8e8e
}

// Update the errorStringFunc to point to nvml.ErrorString
errorStringFunc = nvmlErrorString
Do we need to set this to the default once we close the library?
elezar left a comment
Thanks. Looks good.
Minor comment regarding the error string.
}
return nvmlErrorString(r)
func (l *library) ErrorString(r Return) string {
return r.Error()
We either need to keep the lookup logic as before or reset the default if the library is unloaded.
I can't leave it as before because in the r.Error() and r.String() methods I don't have access to l. I will update it to reset the function on unload.
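A sketch of the reset-on-unload approach, under stated assumptions: `defaultErrorStringFunc`, `load`, and `close` are hypothetical names, and `nvmlErrorString` is a stand-in that returns a fixed string instead of calling into the library. Only `errorStringFunc`, `nvmlErrorString`, `ErrorString`, and `r.Error()` appear in the diff above.

```go
package main

import "fmt"

// Return is a stand-in for the NVML return code type.
type Return int

// defaultErrorStringFunc is used whenever the library is not loaded.
// It converts r to int first so that formatting does not call r.Error().
func defaultErrorStringFunc(r Return) string {
	return fmt.Sprintf("unknown return value: %d", int(r))
}

// nvmlErrorString stands in for the call into the loaded library.
func nvmlErrorString(r Return) string {
	return fmt.Sprintf("nvml error %d", int(r))
}

// errorStringFunc is package-level state because Error() has no access
// to the library instance.
var errorStringFunc = defaultErrorStringFunc

// Error implements the error interface via the current lookup function.
func (r Return) Error() string { return errorStringFunc(r) }

// library is a simplified, hypothetical stand-in for the real type.
type library struct{}

// load points the lookup at the library once it is available.
func (l *library) load() { errorStringFunc = nvmlErrorString }

// close resets the lookup so it never calls into an unloaded library.
func (l *library) close() { errorStringFunc = defaultErrorStringFunc }

// ErrorString delegates to Return.Error, as in the diff above.
func (l *library) ErrorString(r Return) string { return r.Error() }

func main() {
	l := &library{}
	l.load()
	fmt.Println(l.ErrorString(3))
	l.close()
	fmt.Println(l.ErrorString(3))
}
```

Resetting in the close path keeps `Error()` safe to call after the library has been unloaded.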
pkg/nvml/lib.go
Outdated
	l.dl = dl.New(o.path, o.flags)
}

func (l *library) GetExtendedInterface() ExtendedInterface {
I'm not sold on GetExtendedInterface and would prefer GetExtensions(), but this is not a blocker. I'm happy to keep it as is and revisit this later once we have some usage examples.
Updated as suggested.
Actually, I went for a hybrid -- the interface is called ExtendedInterface to be symmetrical with Interface, but the method call is Extensions() (which is the only actual symbol anyone will see in practice), e.g.:
nvml.Extensions().LookupSymbol("symbol")
or
envml := nvml.Extensions()
envml.LookupSymbol("symbol")
It's possible that the type itself might be embedded in a struct, but I only see that happening in testing (if at all), and I like the symmetry of that with Interface for the core API.
That sounds fine. I'm not too concerned about the type name, since as you mention that isn't visible at the call site.
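A minimal sketch of the hybrid naming, for illustration only. The `error` return type of `LookupSymbol`, the `symbols` map, and the choice to have `library` implement `ExtendedInterface` directly are assumptions; the thread only shows the call `nvml.Extensions().LookupSymbol("symbol")`.

```go
package main

import "fmt"

// ExtendedInterface holds extensions that are not part of the core API.
// The name mirrors Interface; the return type here is an assumption.
type ExtendedInterface interface {
	LookupSymbol(name string) error
}

// library is a simplified, hypothetical stand-in for the real type.
type library struct {
	symbols map[string]bool
}

// Extensions is the accessor users call; the type name stays out of sight.
func (l *library) Extensions() ExtendedInterface { return l }

// LookupSymbol reports whether a symbol is known to this sketch.
func (l *library) LookupSymbol(name string) error {
	if !l.symbols[name] {
		return fmt.Errorf("symbol %q not found", name)
	}
	return nil
}

// libnvml is the default instance behind the package-level accessor.
var libnvml = &library{symbols: map[string]bool{"nvmlInit_v2": true}}

// Extensions is the package-level alias for the default instance.
var Extensions = libnvml.Extensions

func main() {
	fmt.Println(Extensions().LookupSymbol("nvmlInit_v2"))
	envml := Extensions()
	fmt.Println(envml.LookupSymbol("symbol"))
}
```

At the call site only `Extensions()` and the method names are visible, which matches the point that the type name matters little in practice.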
A new ComputeInstance should be returned, not passed in by reference Signed-off-by: Kevin Klues <kklues@nvidia.com>
A new GpmSample should be returned, not passed in by reference. As part of this, add methods to hang off of the GpmSample type (which were missing previously). Signed-off-by: Kevin Klues <kklues@nvidia.com>
We should eventually expand this package to include a unified mock server like we have in the go-nvlib testing and mig-parted testing. Signed-off-by: Kevin Klues <kklues@nvidia.com>
This code was pulled over (mostly) directly from 'mig-parted' Signed-off-by: Kevin Klues <kklues@nvidia.com>
The methods in this interface represent extensions to the core NVML API that are only accessible through calling GetExtensions() against the Interface in use (or at the package level for the default interface). Signed-off-by: Kevin Klues <kklues@nvidia.com>
abf8e8e to 1fa43fd
No description provided.