[RFC] Support multiple isolation mechanisms using Kata #1082
Comments
/cc @tallclair |
In the case of containerd-shim-v2 today, the options carry a binary name but no configuration. It may be worthwhile to explore supporting the passing of additional arguments for each plugin.
|
@mcastelino The Kata runtime can define their own options, e.g. If everyone thinks that an opaque string list works best for Kubernetes other than a well-defined struct, we can always add a general |
I don't see the point in adding As mentioned by @Random-Liu, there is no need for containerd to know about the options format, but instead we can simply assume the shim implementation will know what to do with it. That being said, we need to define a common behavior about what to do from CRI-containerd and CRI-O with those extra parameters, and how to handle them. |
@sboeuf as indicated in original issue the preferred choice is option 4. Not option 3. So I think you are agreeing? And yes, we need to unify the behavior across CRI-containerd and CRI-O for option 4. |
@Random-Liu that makes sense. The only small issue I see here is that this helper needs to be modified each time a new runtime comes along that may want its own custom options. |
I would argue that for flexibility some k8s users will want to run pods with different isolation mechanisms (qemu, firecracker, etc). On the other hand, I think just having the config option either in containerd or CRIO is simpler. What's the argument behind not letting this be handled in a common |
@raravena80 not sure I understand what you mean by "Support a single isolation mechanism at a time". Are you suggesting a given node should only use one type of isolation mechanism? That is possible even with this proposal, but then you need to start tagging nodes at the kubernetes layer. All this proposal is trying to do is allow kata the flexibility to expose multiple isolation mechanisms on the same node without duplicating binaries. Paired with an admission control policy, this would allow us to choose the most restrictive isolation mechanism that kata can provide while satisfying the needs of a given POD. Addressing your other concern: the reason for multiple configuration files, one per isolation mechanism, is that each isolation mechanism has different limitations and profiles, and the toml file today contains tuneable sections for other kata components besides the hypervisor. For example you would typically run kata with macvtap networking and firecracker with tcfilter for optimal performance. |
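To make the admission-control idea above concrete, here is a minimal sketch in Go, not an existing Kubernetes or Kata API. The `mechanism` type, the `pickMostRestrictive` helper, the handler names and the feature names are all hypothetical and invented for illustration; the caller is assumed to order candidates from most to least restrictive.

```go
package main

import "fmt"

// mechanism is a hypothetical description of one isolation mechanism
// and the features it can offer to a pod.
type mechanism struct {
	Name     string
	Features map[string]bool
}

// pickMostRestrictive walks the candidates in the order given (assumed to be
// most restrictive first) and returns the first one that offers every
// feature the pod needs. The boolean is false when nothing fits.
func pickMostRestrictive(candidates []mechanism, needs []string) (string, bool) {
	for _, m := range candidates {
		satisfied := true
		for _, n := range needs {
			if !m.Features[n] {
				satisfied = false
				break
			}
		}
		if satisfied {
			return m.Name, true
		}
	}
	return "", false
}

func main() {
	// Illustrative capability table only; it does not describe real limits.
	candidates := []mechanism{
		{Name: "kata-fc", Features: map[string]bool{"network": true}},
		{Name: "kata-qemu", Features: map[string]bool{"network": true, "device-passthrough": true}},
	}
	name, ok := pickMostRestrictive(candidates, []string{"device-passthrough"})
	fmt.Println(name, ok)
}
```

A pod with no special needs would land on the first (most restrictive) entry, and only pods that need more would fall through to a less restrictive one.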
Tracking the current CRI-O proposal. We need to unify what we do across both. |
Chatted with @mcastelino. The answer here is that given the example of firecracker and qemu, we can allow flexibility for the user or cluster operator to run on nodes with firecracker and qemu support, and also run on nodes with only firecracker or only qemu support. Possibly, having |
@Random-Liu It seems |
Yes sorry I missed the fact that 3 and 4 were two different approaches... And yes I agree 4 is the best approach here. |
@kata-containers/runtime |
@lifupan maybe I'm misunderstanding here, but I thought we would introduce to CRI-containerd and CRI-O the proper implementation to know what to do with a new field |
@mcastelino Option 4 is the cleanest one imho. One thing I'd like us to improve is the ability to have per hypervisor sections in our |
+1 for 4 and +1 for finding a way to reduce duplication: now might be a good time to start re-assessing our current configuration handling. We could explore the possibility of supporting re-assembling config fragments into a complete file. This isn't fully fleshed out, but we could do something like:
The runtime would read
Alternatively, we could look at creating a more declarative configuration language where users would not explicitly specify the hypervisor by name, they'd somehow specify the behaviours / constraints they want and Kata would DTRT (tm) and determine an appropriate configuration. We'd clearly need to handle the scenario where
... users could say something like:
The runtime would then resolve these values (via a new |
@sboeuf Yes, the Options is specific to kata, but it's opaque to containerd/cri: containerd will pass it to kata shimv2 blindly and we can parse it on the kata shimv2 side. This way we can pass not only the confFile but also other options if needed, such as the "hypervisor type" if we want to support more hypervisors in a single config file, just as @sameo said above. The containerd's config file will be configured as below: [plugins.cri.containerd.runtimes.kata] |
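As a rough illustration of "opaque to containerd, parsed on the shim side", the sketch below decodes an options blob inside the shim and falls back to a default config path. It is an assumption-laden sketch: the `shimOptions` type, its field names and the use of JSON are hypothetical stand-ins, and the real wire encoding between containerd and a shim is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// shimOptions is a hypothetical options payload. containerd/cri would treat
// it as opaque bytes; only the shim knows the fields.
type shimOptions struct {
	ConfigPath string `json:"config_path"`
	Hypervisor string `json:"hypervisor,omitempty"`
}

// parseShimOptions decodes the opaque blob. An empty blob, or a blob without
// a config path, falls back to fallbackConfig.
func parseShimOptions(raw []byte, fallbackConfig string) (shimOptions, error) {
	opts := shimOptions{ConfigPath: fallbackConfig}
	if len(raw) == 0 {
		return opts, nil
	}
	if err := json.Unmarshal(raw, &opts); err != nil {
		return shimOptions{}, fmt.Errorf("decoding shim options: %w", err)
	}
	if opts.ConfigPath == "" {
		opts.ConfigPath = fallbackConfig
	}
	return opts, nil
}

func main() {
	// Paths are examples only.
	raw := []byte(`{"config_path":"/etc/kata/configuration-firecracker.toml"}`)
	opts, err := parseShimOptions(raw, "/etc/kata/configuration.toml")
	if err != nil {
		panic(err)
	}
	fmt.Println(opts.ConfigPath)
}
```

Because the caller never inspects the payload, new fields can be added on the shim side without touching containerd/cri.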
@sameo @jodh-intel the issue with this approach would be that all hypervisors would be forced to use the same setup for all other components. This could end up being a limiting factor as there may be hypervisors which would need mutually incompatible options. Say tcmirror vs macvtap or something else. Hence having a fully defined environment per hypervisor would be better to make this future proof. |
@mcastelino - I'm suggesting that if the runtime is invoked with, say,
Where,
The common file could set Isn't that what you want? |
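One way to picture the "common file plus hypervisor-specific fragment" idea is a recursive overlay of decoded config sections, where the fragment wins on conflicts. This is only a sketch: the `overlay` helper is hypothetical, the configs are shown as already-decoded nested maps rather than TOML files, and the section and key names are illustrative.

```go
package main

import "fmt"

// overlay returns a new map holding base with override applied on top.
// When both sides hold a nested section under the same key, the sections are
// merged recursively; otherwise the override value replaces the base value.
// Nested sections that are not overridden are shared with base, not copied.
func overlay(base, override map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		baseSection, baseIsSection := out[k].(map[string]interface{})
		overSection, overIsSection := v.(map[string]interface{})
		if baseIsSection && overIsSection {
			out[k] = overlay(baseSection, overSection)
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	// Common settings shared by every hypervisor (illustrative keys).
	common := map[string]interface{}{
		"runtime": map[string]interface{}{
			"internetworking_model": "macvtap",
			"enable_debug":          false,
		},
	}
	// Fragment that only states what differs for one hypervisor.
	firecracker := map[string]interface{}{
		"runtime": map[string]interface{}{
			"internetworking_model": "tcfilter",
		},
	}
	merged := overlay(common, firecracker)
	fmt.Println(merged["runtime"])
}
```

With this shape a hypervisor can still override any component's settings, which speaks to the concern about mutually incompatible options, while unchanged settings live in one place.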
containerd/cri's different runtime handlers can pass different config files to shimv2 by a generic runtime options, by this kata can launch the pods using different VMM for different runtime handlers. Fixes:kata-containers#1082 Signed-off-by: Fupan Li <lifupan@gmail.com>
Description of problem
Kata recently added support for the Firecracker VMM. What that means is that we can now support different hypervisors in Kata.
However, it is more than just that. Firecracker is a VMM, but in reality it is different enough that it can be considered a different isolation mechanism, even though the underlying hardware framework is VT-x.
For example the resource profile/model of a Firecracker POD will be significantly different. This is in addition to any security differences (i.e. sandbox capabilities) and limitations.
We have multiple ways of exposing these isolation mechanisms to the end user of kubernetes:
1. Single runtime with different annotations within the POD (which becomes more kata specific).
This may not be ideal, just like the non-standard annotations that came before.
2. Expose it directly as different runtime classes.
This means that there may have to be two different kata binaries with different configurations, or the same kata binary that chooses a different configuration based on argv[0] (like busybox).
So kata-runtime-firecracker would choose configuration-firecracker.toml
as kata already has support for
Note: Both 1 and 2 are very kata specific.
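The argv[0]-based selection described above can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not how the runtime is implemented: the `configForInvocation` helper is hypothetical, the default directory is assumed, and only the kata-runtime-firecracker to configuration-firecracker.toml naming comes from the proposal.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// defaultConfig is an assumed fallback path, used when the binary is invoked
// under its plain name.
const defaultConfig = "/usr/share/defaults/kata-containers/configuration.toml"

// configForInvocation maps the name the binary was invoked as (argv[0]) to a
// configuration file, busybox-style: "kata-runtime-<suffix>" selects
// "configuration-<suffix>.toml" in the same directory as the default file.
func configForInvocation(argv0 string) string {
	const prefix = "kata-runtime-"
	name := filepath.Base(argv0)
	if !strings.HasPrefix(name, prefix) || len(name) == len(prefix) {
		return defaultConfig
	}
	suffix := strings.TrimPrefix(name, prefix)
	return filepath.Join(filepath.Dir(defaultConfig), "configuration-"+suffix+".toml")
}

func main() {
	// A symlink named kata-runtime-firecracker pointing at the same binary
	// would select the firecracker configuration.
	fmt.Println(configForInvocation(os.Args[0]))
}
```

The same binary can then be exposed under several names via symlinks, one per runtime class, without duplicating it on disk.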
Note: 3 requires changes to both kubernetes and crio/containerd.
Hence the crio.conf would look like
And the RuntimeHandler would need to be enhanced along the lines of
https://github.com/kubernetes-sigs/cri-o/blob/master/oci/oci.go#L125
Note: 4 would not need any changes to kubernetes. It is also generic and not kata specific.