-
Notifications
You must be signed in to change notification settings - Fork 129
Conversation
@bergwolf The config field in the sandbox is needed for the new clear containers agent, that is based on libcontainer. It needs to get an OCI config file to set the container inside the VM. |
@sameo The StartSandbox() operation is supposed to set pod level configurations (hostname and dns) inside the guest, while AddContainer() is the one adding new containers to the sandbox. If libcontainer needs the OCI spec details, how about passing it through AddContainer() instead? The idea is to separate container creation from sandbox creation. It seems libcontainer also has its own config and converts OCI spec to it in specconv.CreateLibcontainerConfig(). Is it the one requiring OCI spec in the new agent? |
4a089f2
to
3033895
Compare
Yes, that makes sense. I updated the PR.
Yes. |
3033895
to
2d94857
Compare
hyperstart/api/grpc/hyperstart.proto
Outdated
rpc WaitProcess(WaitProcessRequest) returns (WaitProcessResponse); // wait & reap like waitpid(2) | ||
rpc CreateContainer(CreateContainerRequest) returns (google.protobuf.Empty); | ||
rpc StartContainer(StartContainerRequest) returns (google.protobuf.Empty); | ||
rpc Exec(ExecRequest) returns (google.protobuf.Empty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to combine StartContainer and Exec just like above AddProcess does. Is there any special requirement to separate starting container and exec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we can merge both StartContainer()
and Exec()
into StartProcess()
since that's what this is. We have CreateContainer()
which really means that we are creating a container but without starting any process in it. And we could have StartProcess()
having a flag to make the difference between the container process or any other process that we want to start.
But at the same time, I think it might be okay to have separate functions for each case since there are 2 distincts use cases. The question is, do we mind having some kind of overlapping behaviors ? I think it's okay, the agent implementation will rely on the same code for the main part of those 2 functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bergwolf From a code perspective, they will look very similar, but from an API perspective they have different semantics (Their arguments are even different, you don't need to specify a Process when starting an already created container). So I think the API is clearer and more explicit with those 2 separated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hyperstart/api/grpc/hyperstart.proto
Outdated
// config is a marshalled OCI configuration JSON file (a.k.a. config.json). | ||
// This is optional and can be used by agents implementing the hyperstart protocol to apply | ||
// any resource or security constraints defined by this configuration. | ||
string config = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer defining the desired fields in the message type Container&Process.
I don't want an embedded string with another language(json).
We might not use the OCI configuration JSON in the hyperstart since we will use libcontainer directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about simply putting the OCI specs.Spec and specs.Process in place (with named fields instead of dummy string)? libcontainer in the new agent can use them directly and we do not have to modify hyperstart protocol every time when we want to add new supported fields. Of course it would add a dependency on OCI spec versions but I think it is OK since we want to be OCI compliant anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oci.proto sounds good. But it should be along with a verification code(or tool) to prove that the oci.proto and the imported OCI specs are fully matched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laijs @bergwolf I'm implementing oci.proto
and because of gRPC Go conversion assumptions, we will have to handle 2 different structures:
The gRPC generated Go structure:
type Spec struct {
Version string `protobuf:"bytes,1,opt,name=Version" json:"Version,omitempty"`
Process *OCIProcess `protobuf:"bytes,2,opt,name=Process" json:"Process,omitempty"`
Root *Root `protobuf:"bytes,3,opt,name=Root" json:"Root,omitempty"`
Hostname string `protobuf:"bytes,4,opt,name=Hostname" json:"Hostname,omitempty"`
Mounts []*Mount `protobuf:"bytes,5,rep,name=Mounts" json:"Mounts,omitempty"`
Hooks *Hooks `protobuf:"bytes,6,opt,name=Hooks" json:"Hooks,omitempty"`
Annotations map[string]string `protobuf:"bytes,7,rep,name=Annotations" json:"Annotations,omitempty" protobuf_key:"bytes,1,opt,name=key" protobuf_val:"bytes,2,opt,name=value"`
Linux *Linux `protobuf:"bytes,8,opt,name=Linux" json:"Linux,omitempty"`
}
The OCI Spec Go structure:
type Spec struct {
Version string `json:"ociVersion"`
Process *Process `json:"process,omitempty"`
Root Root `json:"root"`
Hostname string `json:"hostname,omitempty"`
Mounts []Mount `json:"mounts,omitempty"`
Hooks *Hooks `json:"hooks,omitempty" platform:"linux,solaris"`
Annotations map[string]string `json:"annotations,omitempty"`
Linux *Linux `json:"linux,omitempty" platform:"linux"`
Solaris *Solaris `json:"solaris,omitempty" platform:"solaris"`
Windows *Windows `json:"windows,omitempty" platform:"windows"`
}
As you can see those are different structures and we will have to go through the following steps:
runtime (host) -> OCI spec structure ---- conversion ----> gRPC spec structure -> gRPC binary -> gRPC structure --- conversion ----> OCI spec structure -> agent (guest)
So by not tunnelling a marshalled OCI spec through gRPC, we're adding 2 additional conversion layers: from OCI to gRPC (on the host) and then from gRPC to OCI (on the guest).
Are we all ok with that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @jodh-intel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bergwolf Here is mine: https://gist.github.com/sameo/d8dc8c8720a67ae92100fc6d75f78987
For all non scalar fields, gRPC
will generate a pointer to the struct. So if you create the Mount
message in your test, you will see that Spec
becomes:
type Spec struct {
Version string `protobuf:"bytes,1,opt,name=Version" json:"Version,omitempty"`
Process *OCIProcess `protobuf:"bytes,2,opt,name=Process" json:"Process,omitempty"`
Root *Root `protobuf:"bytes,3,opt,name=Root" json:"Root,omitempty"`
Hostname string `protobuf:"bytes,4,opt,name=Hostname" json:"Hostname,omitempty"`
Mounts []*Mount `protobuf:"bytes,5,rep,name=Mounts" json:"Mounts,omitempty"`
Hooks *Hooks `protobuf:"bytes,6,opt,name=Hooks" json:"Hooks,omitempty"`
Annotations map[string]string `protobuf:"bytes,7,rep,name=Annotations" json:"Annotations,omitempty" protobuf_key:"bytes,1,opt,name=key" protobuf_val:"bytes,2,opt,name=value"`
Linux *Linux `protobuf:"bytes,8,opt,name=Linux" json:"Linux,omitempty"`
}
and for example you get Mounts []*Mount
as opposed to Mounts []Mount
for the OCI spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laijs @bergwolf What about something more structured than my initial proposal:
diff --git a/hyperstart/api/grpc/hyperstart.proto b/hyperstart/api/grpc/hyperstart.proto
index 7a062ad..2041cec 100644
--- a/hyperstart/api/grpc/hyperstart.proto
+++ b/hyperstart/api/grpc/hyperstart.proto
@@ -35,10 +35,25 @@ service HyperstartService {
message CreateContainerRequest {
Container container = 1;
Process init = 2;
- // config is a marshalled OCI configuration JSON file (a.k.a. config.json).
- // This is optional and can be used by agents implementing the hyperstart protocol to apply
- // any resource or security constraints defined by this configuration.
- string config = 3;
+
+ // spec is the container specification.
+ // With OCI its payload will be a marshalled config.json.
+ Spec spec = 3;
+}
+
+message Spec {
+ enum SpecType {
+ OCI = 0;
+ }
+
+ // type is the specification type. For now we only support OCI.
+ SpecType type = 1;
+
+ // version is the specification version
+ string version = 2;
+
+ // payload is the marshalled specification payload, e.g. config.json.
+ string payload = 3;
}
message StartContainerRequest {
string container = 1; | ||
message StartContainerRequest { | ||
string container_id = 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on separating AddContainer() into CreateContainer and StartContainer().
hyperstart/api/grpc/hyperstart.proto
Outdated
rpc StartContainer(StartContainerRequest) returns (google.protobuf.Empty); | ||
rpc Exec(ExecRequest) returns (google.protobuf.Empty); | ||
rpc Signal(SignalRequest) returns (google.protobuf.Empty); | ||
rpc Wait(WaitRequest) returns (WaitResponse); // wait & reap like waitpid(2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The names with "Process" would be clearer.
How about ExecProces()(renamed from AddProcess()) SignalProcess() WaitProcess()(unchanged)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laijs Fine with me. Done.
747f410
to
fea976b
Compare
fea976b
to
5c9137b
Compare
hyperstart/api/grpc/hyperstart.proto
Outdated
message AddProcessRequest { | ||
string container = 1; | ||
message ExecProcessRequest { | ||
string container_id = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought about the syntax. Why don't we use containerID
here instead of container_id
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because protoc
translate foo_id
into fooID
when being told to compile a .proto file to Go.
See hyperstart.pb.go in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks
Container container = 1; | ||
Process init = 2; | ||
Spec OCI = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we need this in order to give all the information related to the container.
However, I am not very familiar with the gRPC code/protocol and I was wondering why we need to specify this Spec in details ? I mean we could use an array of bytes (basically a string) as the third parameter here. From the runtime, we marshal whatever Spec structure we want and from the agent, we unmarshal based on the same structure. I am mentioning this because having the entire spec described here means that we would have to change the protocol if we wanted a new field in the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth pointing out here that this representation implies that the agent process must be updated "in lock step" with (at the same time as) the other system gRPC elements. For example, if oci.proto
is updated to support a new version of the OCI spec (and Spec.Version
is increased) that will clearly alter the format of the messages meaning the agent won't be able to unpack them unless it too has been updated to handle OCI.Version
. It won't know that though as, ironically, it won't be able to unpack the OCI
message to query Version
and check if it can handle that.
It also potentially implies that the agent might need to support older versions of the spec which don't actually change the representation of Spec
(but might for example have certain fields that are mandatory whereas a newer version of the spec makes them optional).
The above has an impact on system upgrade: it implies that if the gRPC format changes, all containers need to be stopped (or restarted) to pick up the new gRPC message formats (and associated assets including the new agent).
5c9137b
to
20c4326
Compare
cc @markdryan |
20c4326
to
55b5aa4
Compare
hyperstart/api/grpc/utils.go
Outdated
} | ||
|
||
for i := 0; i < to.NumField(); i++ { | ||
if err := copyValue(to.Field(i), from.Field(i)); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be copied based on the filed name instead of the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laijs I thought about that option but I'd rather avoid it due to this documented bug:
https://golang.org/pkg/reflect/#pkg-note-BUG
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since they can be decoded from json string, which conversion is based on the filed names/tags. I think the problem can be resolved here. (https://github.com/golang/go/blob/a3e013b0824ba53168b5d91abdb6ce191510a89d/src/encoding/json/encode.go#L1231)
I'm worried when the oci spec is changed in future, how would we maintain the compatibilities. (vs. older gRPC, vs. older OCI, and vs. newer OCI)? I seams that copying manually field by field is good. We can use this code, because we can change it to "manually copying" any time when any problem happen, or to a new method any time when we found one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laijs I'm not sure if you're asking for a change here or not. Implementing something similar to the JSON implementation means we would have to maintain additional metadata for all our fields, sort them through a similar function as well, etc...I believe this is an overkill.
I can add an additional check to verify that we're copying fields with the same name and error out when/if we're not. Any OCI config incompatibility would then be caught at unit testing time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laijs I changed the logic in copyStructValue
to verify that:
- We copy from fields with the same name
- If fields don't have the same name, then we go through a loop searching for the right field in the from structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laijs I think having this full OCI specification structure will be a pain to maintain. Don't you think that having only a string
field would be easier ? I mean we could pass the entire config.json file serialized as a JSON and we would unmarshal from the agent. That way, we just need to make sure that both runtime and agent rely on the same OCI spec version. I don't see any benefits in having this described into the gRPC protocol description here.
"reflect" | ||
|
||
"github.com/opencontainers/runtime-spec/specs-go" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is very nice to provide this utils. It is very useful.
hyperstart/api/grpc/utils.go
Outdated
toVal = toVal.Elem() | ||
fromVal = fromVal.Elem() | ||
|
||
if toVal.NumField() != fromVal.NumField() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check isn't really needed as it is performed again in copyStructValue.
hyperstart/api/grpc/utils.go
Outdated
if from.Type() != to.Type() && from.Type().ConvertibleTo(to.Type()) { | ||
to.Set(from.Convert(to.Type())) | ||
return nil | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case where the types are different and not convertible is not really handled. We'll fall into this if statement and presumably panic. Is this an error case we want to handle explicitly or is the intention to trap the panic at a higher level?
hyperstart/api/grpc/utils.go
Outdated
|
||
for j := 0; j < mapLen; j++ { | ||
if err := copyValue(to.Index(j), from.Index(j)); err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this code will panic. Value.Index(int) cannot be used on maps. There are MapIndex and SetMapIndex functions that can be used instead.
switch toKind { | ||
case reflect.Struct: | ||
return copyStructValue(to, from) | ||
case reflect.Slice: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to handle arrays as well? They have a different reflection type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markdryan We do not need to support arrays for now.
hyperstart/api/grpc/utils.go
Outdated
} | ||
|
||
func copyStruct(to interface{}, from interface{}) error { | ||
toVal := reflect.ValueOf(to) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to consider trapping panics and returning an error here just in case there's an error case that isn't explicitly handled by the code.
55b5aa4
to
3ff767b
Compare
Unit tests added and I also addressed @markdryan comments. |
81813bb
to
c5197c1
Compare
RestoreContainer(c *hyperstartapi.Container) error | ||
AddProcess(container string, p *hyperstartapi.Process) error | ||
ExecProcess(container string, p *hyperstartapi.Process) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the hyperstartapi here is actually hyperstart json + serial API.
We are going to covert this interface to use hyperstart grpc API directly.
But it needs more changes and needs to be suspended until the grpc API become a little more mature.
This is just for your information.
int64 Major = 1; | ||
|
||
// Minor is the device's minor number. | ||
int64 Minor = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test doesn't test the copying about it, which is required and important since LinuxWeightDevice in oci.Specs embeds linuxBlockIODevice as an anonymous field here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I fixed it now and the code supports copying embedded structures.
f8bdb53
to
359f5d3
Compare
hyperstart/api/grpc/oci.proto
Outdated
string Mems = 7; | ||
} | ||
|
||
message linuxBlockIODevice { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be dropped I think as the members are now part of LinuxWeightDevice
and LinuxThrottleDevice
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, good call. I will update oci.proto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jodh-intel Done
OCIProcess Process = 2; | ||
|
||
// Root configures the container's root filesystem. | ||
Root Root = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this (and all the other optional entries like ConsoleSize
, Mounts
, ...) be annotated with optional
to comply as closely as possible with the OCI spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jodh-intel The latest protobuf version (v3) no longer supports optional
:
https://developers.google.com/protocol-buffers/docs/reference/go-generated#fields
That's unfortunate, but seems it's been a long debated design decision among the protobuf community members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - thanks. I wasn't aware of that.
Container container = 1; | ||
Process init = 2; | ||
Spec OCI = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth pointing out here that this representation implies that the agent process must be updated "in lock step" with (at the same time as) the other system gRPC elements. For example, if oci.proto
is updated to support a new version of the OCI spec (and Spec.Version
is increased) that will clearly alter the format of the messages meaning the agent won't be able to unpack them unless it too has been updated to handle OCI.Version
. It won't know that though as, ironically, it won't be able to unpack the OCI
message to query Version
and check if it can handle that.
It also potentially implies that the agent might need to support older versions of the spec which don't actually change the representation of Spec
(but might for example have certain fields that are mandatory whereas a newer version of the spec makes them optional).
The above has an impact on system upgrade: it implies that if the gRPC format changes, all containers need to be stopped (or restarted) to pick up the new gRPC message formats (and associated assets including the new agent).
Yes, that's correct. A few comments:
|
- Rename part the execution API - Add an OCI specification field to ContainerCreateRequest Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We use the reflection package to avoid implementing error prone manually converting routines. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We try to check that all supported types are properly converted. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The gRPC generated Go structures will never have embedded fields, while the OCI ones have some. This forces us to support copying between structs with different number of fields when one field is embedded on one side while it's not on the other. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
359f5d3
to
e33dd53
Compare
I've just found the |
@laijs Do you have any other concerns/comments on this PR? |
I think it in a good shape now, we could push it forward. |
Why is this still not merged ? |
could you remove the last grpc stream commit, and I merge it? @sameo |
@sboeuf I felt the API not completed. Example, it seems the storage APIs still missing. |
5f41cf0
to
e33dd53
Compare
@laijs Streams commit removed. |
@laijs @sameo @bergwolf @sboeuf @jodh-intel Based on our discussion on skype, I will send pr of the protocol to kata-containers/runtimes repo, which will help us accelerating the following discussion and development. I will take a snapshot of the protocol in this PR, and we could move our discussion to the correspond new PR. |
@gnawux thx |
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 Signed-off-by: Wang Xu <gnawux@gmail.com>
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 Signed-off-by: Wang Xu <gnawux@gmail.com>
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and kata-containers/runtime#2
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and kata-containers/runtime#2 Signed-off-by: Wang Xu <gnawux@gmail.com>
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and kata-containers/runtime#2 Signed-off-by: Wang Xu <gnawux@gmail.com>
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and kata-containers/runtime#2 Signed-off-by: Wang Xu <gnawux@gmail.com>
TODO: