Add imex mode to CDI spec generation #807

Open
wants to merge 3 commits into
base: main
from


elezar
Member

@elezar elezar commented Nov 21, 2024

This change adds an imex mode to CDI spec generation. This mode generates CDI specifications for existing IMEX channels. By default, these devices have fully qualified CDI device names of the form:

nvidia.com/imex-channel={{ .ID }}
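For illustration, a minimal sketch of how such a name is composed from the vendor, class, and channel ID. The vendor and class values come from the description above; the helper name `qualifiedName` and the sample IDs are assumptions made for this example and are not part of the change.

```go
package main

import "fmt"

// Values taken from the PR description.
const (
	vendor = "nvidia.com"
	class  = "imex-channel"
)

// qualifiedName is a hypothetical helper that builds a fully qualified
// CDI device name of the form "<vendor>/<class>=<id>".
func qualifiedName(id string) string {
	return fmt.Sprintf("%s/%s=%s", vendor, class, id)
}

func main() {
	// Sample channel IDs, chosen only for illustration.
	for _, id := range []string{"0", "1"} {
		fmt.Println(qualifiedName(id))
	}
}
```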

@@ -183,17 +183,6 @@ func (m command) validateFlags(c *cli.Context, opts *options) error {
return fmt.Errorf("invalid output format: %v", opts.format)
}

opts.mode = strings.ToLower(opts.mode)
Contributor

Question -- why was this removed? Also, is there a reason for not documenting all of these valid modes in the --discovery-mode usage string?

Member Author

This was removed because I didn't want to maintain the list in two places. Let me pull in some changes that I have in another commit to improve things.
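One way to keep the list in a single place, sketched under assumptions: a single slice feeds both the flag usage string and the validation. The names `validModes`, `modeUsage`, and `validateMode` are hypothetical, and the slice only contains the modes mentioned in this review, so it should not be read as the full set supported by the tool.

```go
package main

import (
	"fmt"
	"strings"
)

// validModes is a hypothetical single source of truth for discovery modes.
// Only the modes named in this review are listed here.
var validModes = []string{"nvml", "wsl", "imex"}

// modeUsage derives the flag help text from validModes.
func modeUsage() string {
	return "one of: " + strings.Join(validModes, ", ")
}

// validateMode lowercases the mode and checks it against validModes.
func validateMode(mode string) (string, error) {
	mode = strings.ToLower(mode)
	for _, m := range validModes {
		if m == mode {
			return mode, nil
		}
	}
	return "", fmt.Errorf("invalid discovery mode: %q (%s)", mode, modeUsage())
}

func main() {
	fmt.Println(modeUsage())
	for _, candidate := range []string{"IMEX", "bogus"} {
		mode, err := validateMode(candidate)
		fmt.Println(mode, err)
	}
}
```

With this shape, adding a mode means touching only the slice; the usage string and the error message follow automatically.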


// GetSpec returns a CDI spec for all available IMEX channels.
func (l *imexlib) GetSpec() (spec.Interface, error) {
return nil, nil
Contributor

Why is this not implemented?

Member Author

We actually never call GetSpec at this level and always call GetSpec of the wrapper. Definitely needs a cleanup, but that refactor is out of scope for this PR.
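To make the call path concrete, here is a simplified sketch of the pattern, not the actual code: a wrapper embeds the interface and defines its own GetSpec, so callers holding the wrapper never reach the inner GetSpec. All types here (`specGenerator`, `inner`, `wrapper`) and the string-based return values are stand-ins invented for this example.

```go
package main

import "fmt"

// specGenerator is a hypothetical stand-in for the library interface.
type specGenerator interface {
	GetSpec() (string, error)
	GetCommonEdits() ([]string, error)
}

// inner mimics a mode-specific implementation whose GetSpec is a stub.
type inner struct{}

func (inner) GetSpec() (string, error) { return "", nil }

func (inner) GetCommonEdits() ([]string, error) {
	return []string{"EXAMPLE=1"}, nil
}

// wrapper embeds the interface; its own GetSpec shadows the embedded one.
type wrapper struct {
	specGenerator
}

// GetSpec assembles the spec from the other interface methods, so the
// embedded GetSpec is never invoked through the wrapper.
func (w wrapper) GetSpec() (string, error) {
	edits, err := w.GetCommonEdits() // promoted from the embedded value
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("spec with %d common edit(s)", len(edits)), nil
}

func main() {
	spec, err := wrapper{inner{}}.GetSpec()
	fmt.Println(spec, err)
}
```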

cdiVersion: 0.5.0
containerEdits:
  env:
  - NVIDIA_VISIBLE_DEVICES=void
Contributor

Question -- where in the code is this set? Are we always adding this envvar to our specs regardless of the mode (i.e. nvml, wsl, imex)?

Member Author

@elezar elezar Nov 22, 2024

This is set in the wrapper GetSpec in lib.go. I think these changes call out that this may not be the best place to do it since it seems a little out of place. The intent is to ENSURE that we don't inadvertently trigger runtime or hook injection when we're already applying changes using CDI.

update: It's in GetCommonEdits:

func (m *wrapper) GetCommonEdits() (*cdi.ContainerEdits, error) {
	edits, err := m.Interface.GetCommonEdits()
	if err != nil {
		return nil, err
	}
	edits.Env = append(edits.Env, image.EnvVarNvidiaVisibleDevices+"=void")
	return edits, nil
}

@elezar elezar force-pushed the add-cdi-imex-channels branch from e827f76 to 3384ce2 Compare November 25, 2024 12:41
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change adds an imex mode to CDI spec generation. This mode generates
CDI specifications for existing IMEX channels. By default, these devices
have fully qualified CDI device names of the form:

nvidia.com/imex-channel=<ID>

Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar elezar force-pushed the add-cdi-imex-channels branch from 3384ce2 to 8603d60 Compare November 25, 2024 12:46
2 participants