Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

[RFC ]Initial gRPC protocol for agent communication #2

Closed
wants to merge 1 commit into from

Conversation

gnawux
Copy link
Member

@gnawux gnawux commented Nov 21, 2017

based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 .

and there are a few guys think it is better to put the protocol in an independent repo, what's your opinion?


message CreateContainerRequest {
Container container = 1;
Process init = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls replace the 2 fields above which container_id.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need the configuration of the process, Or has it already defined by the OCI spec?

string type = 1;
uint64 hard = 2;
uint64 soft = 3;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the above Container/Mount/Process/Rlimit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename User to StringUser.
StringUser is needed since the oci spec don't allow string user ids.

string Version = 1;

// Process configures the container process.
OCIProcess Process = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the OCI prefixes for OCIProcess, OCIMount, OCIUser...

based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628

Signed-off-by: Wang Xu <gnawux@gmail.com>
@gnawux
Copy link
Member Author

gnawux commented Nov 21, 2017

updated based on the above comments

@sameo
Copy link

sameo commented Nov 21, 2017

@gnawux You need to regenerate the *.pb.go files or else go test -v fails.

}

message ExecProcessRequest {
string container_id = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add StringUser here

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laijs This would be redundant with the Process.User field, wouldn't it ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quote from comments by @laijs above

StringUser is needed since the oci spec don't allow string user ids

correct this if anything wrong

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, to make container boot quickly, avoiding mounting the root block device on the host is very important.

Copy link

@sameo sameo Nov 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnawux @laijs the OCI spec passes a uint32 for both UID and GID, but that should be enough. Why would a uint32 not be good enough for a UID/GID and why would we absolutely need a string?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sboeuf That usename is used for windows only: opencontainers/runtime-spec@f9e48e0

It is a little weird if we reuse this variable. @gnawux do you accept it ?

Copy link
Contributor

@laijs laijs Nov 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always mount the block device inside the container to access to the /etc/passwd.

But we avoid do mount the block device in the host (for accessing to the /etc/passwd and filling the numerical user id in the spec).

  1. the host might not have the ability to mount it. for example, if the block device is from ceph, and the host doesn't have krbd.ko. And even the host has it, userspace ceph lib + qume are alway the best choice.

  2. Security, if the block device was once mounted inside the vm, the host should not mount it. the code in the vm might hack into the guest kernel, and modify the metadata of the filesystem of the block device. The host kernel might be also broken into when mounting the block device.

  3. performance: to speed up the starting of the container, hyper avoid mounting the block device on the host.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laijs not sure I follow everything here because I thought that /etc/passwd was passed through 9pfs through the shared directory. And in case /etc/passwd comes from the block device passed through the VM, it is mounted somewhere, meaning that you can specify libcontainer that you expect /etc/passwd to be mounted, right ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laijs just to be clear, I am not saying that I want to mount the block device on the host, I agree it would take some time. We don't do that either.
But the block device needs to be mounted inside the guest by the agent and then we can provide libcontainer with the right mounts so that it will mount /etc/passwd in the right place. This way, it would be able to know what 1000:1000 means for the user/group.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right.
Runtime --> agent --> libcontainer.
Runtime has to pass the string user name to agent if string name is specified by upper layer tools. Agent parses the string name from /etc/passwd and pass the digit user id to the libcontainer.
"Runtime --> agent" is the API, so we need the string user name in the API.

docker-runc does't allow the string user name, so the docker/moby does the parsing of the username. kata runtime cli will not allow the string user name either. But kata runtime can also be a lib for hyperd/frakti, the string user name will be passed to the agent in this case if requested.

rpc AddRoute(AddRouteRequest) returns (google.protobuf.Empty);
rpc OnlineCPUMem(OnlineCPUMemRequest) returns (google.protobuf.Empty);
}

Copy link
Contributor

@laijs laijs Nov 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

message Storage {
    string driver = 1;   // empty in most cases. it will support "drbd", "bcache" ...
    string source = 2; // "/dev/sdb", "/dev/disk/by-scsi/xxxx", "none",...
    string fstype = 3; //"xfs", "ext4" etc. for block dev, or "9p" for shared fs, or  "tmpfs" for shared /dev/shm for all containers ...
    repeated string options = 4;
    string mount_point = 5; // mount_point is only visible by VM, not by containers. This mount point can be used on oci.Mount.Source as "/Stroage/mount/point/{rootfs|_data}".
}

I think this Storage + oci.Mount can reassemble current hyperstart's storage model.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laijs Would you mind describing use cases for this Storage payload?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sameo In short, for the backward compatibility to current hyperd/hyperstart.

The Storage derives from the volumes of hyperd and runv, in which we could insert a block device (the source, loop device or ceph rbd is ok as well) as a volume of container (as filesystem fstype, to the mount_point).

According to assumptions of the WIP CSI, the block devices would be inserted as block devices instead of mounted filesystems, i.e., for most current container images, there need a k8s init container to mount the block devices before the devices could be accessed.

The Storage struct here could be think as a shortcut for the volume init container behavior because the block devices are much more common in a virtualized world. Then we don't need run a volume init container before launch a container, especially for the case not working with k8s.

Any additional explanation, @laijs ?

Copy link
Contributor

@laijs laijs Nov 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnawux explained it in high level view. Here I focus on low level view:

We need the initial storage config for mounting the 9p which can be configured by this Storage in StartSandboxRequest.

oci.Spec can't ask you "mount /dev/sda1 mount to /kata/storage/xxxx and use /kata/storage/xxxx/rootfs as container rootfs".

oci.Spec can't ask you "mount /dev/sdd mount to /kata/storage/yyy and use /kata/storage/xxxx/_data as one of the container volume".

oci.Mount can't ask you mount a tmpfs on /kata/storage/shm, and bind it to all container's /dev/shm.

and more ....

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laijs I was about to ask for a shareDir parameter for StartSandbox since we need a way to pass some file and rootfs through 9pfs. Your Storage structure looks good because it is a generic way to pass some things to the VM through any type of filesystem that need to be shared accross all containers, and that are not hotpluggable actually.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, to give some more details here, we cannot completely rely on block devices to pass rootfs to the VM. In case of overlay, we don't have a block device and I don't think we want to spend a large amount of time preparing a block device based on this, it would consume too much time. Also I am not sure how it would work to reflect the changes on the block device on the overlay layers.
That's why having a mount point that can be shared when starting the VM would be the best option to support multiple cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laijs we have renamed the driver field to format in runV, and the content should be stuff like raw, qcow2, dir... we don't mind it is bcache device or a loop file if they could be treat as a block device.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use block devices for passing rootfs to the VM in several cases. Host storage drivers such as devmapper, (cow-)rawblock, ceph(hyper.sh) fit well into runv/hyperstart. In these cases, block devices containing container rootfs or volume are hotplug into the VM. In near future, DAX is going to be supported which also relies on block devices.

Only when the host storage drivers are overlayfs/aufs/btrfs, 9pfs is used for passing rootfs.
So both ways(block&sharedfs) are supported. @sboeuf

}

message CreateContainerRequest {
string container_id = 1;
Copy link
Contributor

@laijs laijs Nov 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+ repeated Storage
+ StringUser

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with repeated Storage.


message StartSandboxRequest {
string hostname = 1;
repeated string dns = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+ repeated Storage

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laijs Yes, we need that one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we need it in both Create and Start API?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need it for the StartSandbox for storage endpoints that are shared across all containers and in CreateContainer for container specific ones.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm...

should we change StartSandbox to CreateSandbox? It looks a bit strange to start without a create. @sameo @laijs

Copy link

@sameo sameo Nov 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here StartSandbox is actually a create and start operation. But I'm fine if we rename it to CreateSandbox(). It's also consistent with DestroySandbox().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "Sandbox" has some runtime semantics and it is not persisted, which could be created as running, paused, or destroyed, and there should not be a created and stopped sandbox in our context.

Copy link

@sameo sameo Nov 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnawux Right. So what I'm saying is that I'm fine if we only have:

	rpc CreateSandbox(CreateSandboxRequest) returns (google.protobuf.Empty);
	rpc DestroySandbox(DestroySandboxRequest) returns (google.protobuf.Empty);

because in our case we're only going to create and destroy sandboxes.

Are you suggesting we should have:

	rpc CreateSandbox(CreateSandboxRequest) returns (google.protobuf.Empty);
	rpc StartSandbox(StartSandboxRequest) returns (google.protobuf.Empty);
	rpc StopSandbox(StopSandboxRequest) returns (google.protobuf.Empty);
	rpc DestroySandbox(DestroySandboxRequest) returns (google.protobuf.Empty);

?

@gnawux
Copy link
Member Author

gnawux commented Nov 22, 2017

@sameo I think I did, but I found someting different from the code you generated, which version of protoc are you using? And did you generated the .pb.go with the command line given in the hack/ dir?

#
# SPDX-License-Identifier: Apache-2.0
#
protoc --proto_path=pkg/agent/api/grpc --go_out=plugins=grpc:pkg/agent/api/grpc pkg/agent/api/grpc/hyperstart.proto pkg/agent/api/grpc/oci.proto
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you also want to add a -I=$GOPATH/src/github.com/google/protobuf/src/ to this command. Without it, protoc seems to be unable to find the Empty message:

$ hack/update-generated-agent-proto.sh 
google/protobuf/empty.proto: File not found.
hyperstart.proto: Import "google/protobuf/empty.proto" was not found or had errors.
hyperstart.proto:15:62: "google.protobuf.Empty" is not defined.
hyperstart.proto:16:60: "google.protobuf.Empty" is not defined.
hyperstart.proto:17:54: "google.protobuf.Empty" is not defined.
hyperstart.proto:18:58: "google.protobuf.Empty" is not defined.
hyperstart.proto:25:52: "google.protobuf.Empty" is not defined.
hyperstart.proto:26:56: "google.protobuf.Empty" is not defined.
hyperstart.proto:29:56: "google.protobuf.Empty" is not defined.
hyperstart.proto:30:60: "google.protobuf.Empty" is not defined.
hyperstart.proto:31:62: "google.protobuf.Empty" is not defined.
hyperstart.proto:32:48: "google.protobuf.Empty" is not defined.
hyperstart.proto:33:56: "google.protobuf.Empty" is not defined.

Am I missing something ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➜  runtimes git:(grpc_proto) protoc --version
libprotoc 3.5.0
➜  runtimes git:(grpc_proto) hack/update-generated-agent-proto.sh
➜  runtimes git:(grpc_proto)

Works on my Mac.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sameo Ah I know why you got the Empty errors.

My protoc is built from source and installed together with the include files.

If you download the prebuilt protoc, and only put the bin/protoc binary in your $PATH without install the include/ dir, you will get the import error.

However, I have not generated the code with the getters in the diff yet.

Copy link

@sameo sameo Nov 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnawux I installed protoc from my distro protobuf packages (FC27, protobuf and protobuf-compiler)

$ rpm -q protobuf-compiler
protobuf-compiler-3.3.1-2.fc27.x86_644

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sameo Then does it have an -devel or -dev like package?

And we need figure out why they generate different code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally I solved the getter issue. I found an older protoc-gen-go in a higher priority position in my $PATH

@sameo
Copy link

sameo commented Nov 22, 2017

@gnawux I'm using:

$ protoc --version
libprotoc 3.3.1

And when generating the pb.go files with the hack/ script, I get the following diff https://gist.github.com/sameo/acb737c5370957afb96e5e7d905d1b24

@gnawux
Copy link
Member Author

gnawux commented Nov 22, 2017

@sameo I found the diff before I post the PR as well, and I checked many things but didn't get what's missing...

@laijs
Copy link
Contributor

laijs commented Nov 23, 2017

I think it seems better to create a new repo for the api, such as agent-api.

Or move this api into the agent repo.
There are 3 repos using the api: runtime, shim, agent.
runtime --api--> agent. shim --api--> agent. (runtime--spawn-->shim)
It seems agent is more appropriated to host the api than the runtime.

@sameo
Copy link

sameo commented Nov 23, 2017

@laijs @gnawux

It seems agent is more appropriated to host the api than the runtime.

I agree with that. @gnawux Would you mind sending this PR to kata-containers/agent instead?

@gnawux
Copy link
Member Author

gnawux commented Nov 23, 2017

@sameo no, @laijs and I discussed the location issue before, we notice the problems when I create the PR. or should we have an independent repo for protocols?

@sameo
Copy link

sameo commented Nov 23, 2017

@gnawux Let's use the agent repo as it's the one that makes the most sense for now.

@gnawux
Copy link
Member Author

gnawux commented Nov 23, 2017

Then will close this PR after update it to the agent repo

gnawux added a commit to gnawux/kata-agent that referenced this pull request Nov 23, 2017
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and
kata-containers/runtime#2
gnawux added a commit to gnawux/kata-agent that referenced this pull request Nov 23, 2017
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and
kata-containers/runtime#2

Signed-off-by: Wang Xu <gnawux@gmail.com>
gnawux added a commit to gnawux/kata-agent that referenced this pull request Nov 23, 2017
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and
kata-containers/runtime#2

Signed-off-by: Wang Xu <gnawux@gmail.com>
gnawux added a commit to gnawux/kata-agent that referenced this pull request Nov 23, 2017
based on hyperstart gRPC protocol and the discussion in hyperhq/runv#628 and
kata-containers/runtime#2

Signed-off-by: Wang Xu <gnawux@gmail.com>
@gnawux
Copy link
Member Author

gnawux commented Nov 23, 2017

close this PR and moved the protocol code to kata-containers/agent repo

@gnawux gnawux closed this Nov 23, 2017
jodh-intel pushed a commit to jodh-intel/runtimes that referenced this pull request Mar 14, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants