FPGA: support CDI #1745

Merged
merged 7 commits into from
May 28, 2024

Conversation

bart0sh
Member

@bart0sh bart0sh commented May 22, 2024

This PR uses CDI support for Device Plugins to implement FPGA programming hooks.

CRI-O hooks are no longer needed and have been removed.

As CDI is currently supported by both CRI-O and Containerd, this PR enables the FPGA orchestration programmed operation mode for the two most widely used CRI runtimes. Previously, it was supported only by CRI-O.

Ref: #1457

@bart0sh bart0sh force-pushed the PR155-fpga-support-CDI branch from 03c7322 to e81ad2a Compare May 22, 2024 10:16
@bart0sh bart0sh force-pushed the PR155-fpga-support-CDI branch 2 times, most recently from 37bfcf3 to 20e8676 Compare May 22, 2024 10:42
@bart0sh bart0sh force-pushed the PR155-fpga-support-CDI branch from 20e8676 to d1e5597 Compare May 22, 2024 11:57
@bart0sh bart0sh marked this pull request as ready for review May 22, 2024 12:14
@bart0sh bart0sh changed the title FPGA: support cdi FPGA: support CDI May 22, 2024
@bart0sh bart0sh force-pushed the PR155-fpga-support-CDI branch from d1e5597 to 1fa557e Compare May 22, 2024 12:59
@mythi
Contributor

mythi commented May 22, 2024

@bart0sh thanks! quick question: I believe the word prestart is deprecated in the OCI runtime spec. would it make sense to get the new functionality to follow the official name createRuntime?

@bart0sh
Member Author

bart0sh commented May 22, 2024

> @bart0sh thanks! quick question: I believe the word prestart is deprecated in the OCI runtime spec. would it make sense to get the new functionality to follow the official name createRuntime?

Thanks for pointing that out! I didn't know that. Will try to replace it.

@bart0sh
Member Author

bart0sh commented May 22, 2024

@mythi renamed in the code. Will update docs if/when all tests pass.

The `prestart` hook is marked as deprecated in the OCI runtime spec:
https://github.com/opencontainers/runtime-spec/blob/main/config.md#posix-platform-hooks

Renamed `prestart` to `createRuntime`, as suggested in the spec.

Replaced `CDI hook` with `OCI hook` for clarity. CDI is just a
way to update the OCI config, and strictly speaking there is no such
thing as a CDI hook.
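
To illustrate the point that CDI only edits the OCI config, here is a minimal sketch of a CDI spec whose container edits add a device node and an OCI `createRuntime` hook. It is not the code from this PR: the struct types are local stand-ins that follow the CDI JSON field names, and the kind `example.com/fpga`, the device name, the device node path, and the hook path and arguments are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hook follows the CDI "hooks" entry; HookName is the OCI hook stage.
type Hook struct {
	HookName string   `json:"hookName"`
	Path     string   `json:"path"`
	Args     []string `json:"args,omitempty"`
}

// DeviceNode follows the CDI "deviceNodes" entry.
type DeviceNode struct {
	Path string `json:"path"`
}

// ContainerEdits lists the changes a runtime applies to the OCI config.
type ContainerEdits struct {
	DeviceNodes []DeviceNode `json:"deviceNodes,omitempty"`
	Hooks       []Hook       `json:"hooks,omitempty"`
}

// Device is one named CDI device.
type Device struct {
	Name           string         `json:"name"`
	ContainerEdits ContainerEdits `json:"containerEdits"`
}

// Spec is the top-level CDI document.
type Spec struct {
	Version string   `json:"cdiVersion"`
	Kind    string   `json:"kind"`
	Devices []Device `json:"devices"`
}

// buildSpec returns a spec with one device that registers a createRuntime hook.
// All names and paths below are hypothetical placeholders.
func buildSpec(hookPath string) Spec {
	return Spec{
		Version: "0.5.0",
		Kind:    "example.com/fpga",
		Devices: []Device{{
			Name: "region0",
			ContainerEdits: ContainerEdits{
				DeviceNodes: []DeviceNode{{Path: "/dev/example-fpga0"}},
				Hooks: []Hook{{
					// createRuntime replaces the deprecated prestart stage.
					HookName: "createRuntime",
					Path:     hookPath,
					Args:     []string{"example-fpga-hook"},
				}},
			},
		}},
	}
}

func main() {
	out, err := json.MarshalIndent(buildSpec("/opt/example/fpga-hook"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A CDI-aware runtime merges these edits into the container's OCI config, so from the runtime's point of view the hook is an ordinary OCI hook.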
@bart0sh bart0sh force-pushed the PR155-fpga-support-CDI branch from cea73ac to e58369e Compare May 22, 2024 16:57
@bart0sh
Member Author

bart0sh commented May 22, 2024

@mythi done: e58369e

Contributor

@mythi mythi left a comment

@bart0sh thanks so much! There is only one thing I'm not fully sure about: the API change for NewDeviceInfo(). IIRC, with topology info we added NewDeviceInfoWithTopologyHints() when we needed to allow user-provided hints. Would NewDeviceInfoWithCDI() make any sense?
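
The suggestion above can be sketched as follows: keep the existing constructor's signature unchanged and add a second constructor for callers that need CDI data. This is only an illustration of the pattern. The `DeviceInfo` fields and both signatures are hypothetical and do not reflect the plugin's real API.

```go
package main

import "fmt"

// DeviceInfo is a hypothetical stand-in for the plugin's device description.
type DeviceInfo struct {
	State    string
	Nodes    []string
	CDINames []string // hypothetical: fully qualified CDI device names
}

// NewDeviceInfo keeps its original (here: hypothetical) signature,
// so existing callers do not need to change.
func NewDeviceInfo(state string, nodes []string) DeviceInfo {
	return DeviceInfo{State: state, Nodes: nodes}
}

// NewDeviceInfoWithCDI wraps NewDeviceInfo and attaches CDI device names.
func NewDeviceInfoWithCDI(state string, nodes, cdiNames []string) DeviceInfo {
	info := NewDeviceInfo(state, nodes)
	info.CDINames = cdiNames
	return info
}

func main() {
	plain := NewDeviceInfo("Healthy", []string{"/dev/example-fpga0"})
	withCDI := NewDeviceInfoWithCDI("Healthy",
		[]string{"/dev/example-fpga0"},
		[]string{"example.com/fpga=region0"})
	fmt.Println(len(plain.CDINames), len(withCDI.CDINames))
}
```

The trade-off is one more exported function in exchange for not touching every existing call site.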

@mythi
Contributor

mythi commented May 23, 2024

> @mythi done: e58369e

looks good!

@bart0sh
Member Author

bart0sh commented May 23, 2024

@tkatila Can you review this PR please?

Contributor

@tkatila tkatila left a comment

Looks good.

As this change removes the old prestart hook approach, should the README note that if one still wants to use that way, he or she must use <=0.30.0 version of the FPGA components?

Contributor

@mythi mythi left a comment

LGTM

@bart0sh
Member Author

bart0sh commented May 24, 2024

@tkatila

> As this change removes the old prestart hook approach, should the README note that if one still wants to use that way, he or she must use <=0.30.0 version of the FPGA components?

The new approach is designed to be a drop-in replacement for the old one. It shouldn't require any configuration or other changes compared to the previous one. I'd consider it an internal change and wouldn't add anything to the README, as it might confuse users. However, if you think it's needed, I'd be happy to do that. Just let me know.

@tkatila
Contributor

tkatila commented May 24, 2024

> The new approach is designed to be a drop-in replacement for the old one. It shouldn't require any configuration or other changes compared to the previous one. I'd consider it an internal change and wouldn't add anything to the README, as it might confuse users. However, if you think it's needed, I'd be happy to do that. Just let me know.

In that case, I don't think it's required.

@tkatila
Contributor

tkatila commented May 24, 2024

@uniemimu and/or @hj-johannes-lee can you review this? Seems to require one more approval.

@mythi
Contributor

mythi commented May 24, 2024

> It shouldn't require any configuration or other changes compared to the previous one.

@bart0sh has it also been unconditionally enabled in kubelet since it was added?

@bart0sh
Member Author

bart0sh commented May 27, 2024

@mythi yes, it graduated to Beta in Kubernetes 1.29 and has been enabled by default since then.

@mythi
Contributor

mythi commented May 28, 2024

> @mythi yes, it graduated to Beta in Kubernetes 1.29 and has been enabled by default since then.

In the section "Configuring CRI runtimes" we add some comments about how the feature is made available. We don't say much about kubelet itself. In theory there's a gap, but I don't think it's important at all, since by the time we release this, the oldest k8s we "support" is 1.29, which has it enabled.

@bart0sh
Member Author

bart0sh commented May 28, 2024

@mythi If we release this earlier, I'll add a notice about the Kubelet version to the documentation in a separate PR.

@mythi @tkatila Is anything else still needed to merge this PR? I'm asking because this is a show-stopper for graduating the feature to GA in Kubernetes.

@tkatila
Contributor

tkatila commented May 28, 2024

Nothing from my side. Good to go.

Member

@kad kad left a comment

Generally looks OK. One small improvement that could be added: in the documentation, explicitly write down the requirements for CDI mode: k8s, containerd/cri-o.

@tkatila tkatila merged commit 11c9753 into intel:main May 28, 2024
73 checks passed
@bart0sh
Member Author

bart0sh commented May 28, 2024

@kad

> in documentation part, write down explicitly requirements for CDI mode: k8s,containerd/cri-o.

CRI-O and Containerd are mentioned here. Would it make sense to add version info for CRI-O, Containerd and Kubernetes there? They only make sense for the hook, so it looks like a good place to me. WDYT?

@kad
Member

kad commented May 28, 2024

> @kad
>
> > in documentation part, write down explicitly requirements for CDI mode: k8s,containerd/cri-o.
>
> CRI-O and Containerd are mentioned here. Would it make sense to add version info for CRI-O, Containerd and Kubernetes there? They only make sense for the hook, so it looks like a good place to me. WDYT?

Either there or in the overall FPGA plugin README, stating that the programming mode is available for systems with CDI enabled, which means k8s+containerd/cri-o of not less than....

5 participants