Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

agent: add GetGuestDetails gRPC function #358

Merged
merged 1 commit into from
Sep 11, 2018

Conversation

linzichang
Copy link
Contributor

Fixes #354

@codecov
Copy link

codecov bot commented Sep 5, 2018

Codecov Report

Merging #358 into master will increase coverage by 0.28%.
The diff coverage is 60%.

@@            Coverage Diff             @@
##           master     #358      +/-   ##
==========================================
+ Coverage   46.92%   47.21%   +0.28%     
==========================================
  Files          15       15              
  Lines        2391     2442      +51     
==========================================
+ Hits         1122     1153      +31     
- Misses       1129     1140      +11     
- Partials      140      149       +9

@linzichang linzichang force-pushed the master branch 2 times, most recently from 91504c0 to 15cf4fb Compare September 5, 2018 08:07
@jodh-intel
Copy link
Contributor

Let's get input from others on whether this approach is a good one first.

/cc @sboeuf, @bergwolf, @jon.

@sboeuf
Copy link

sboeuf commented Sep 5, 2018

@linzichang Thanks for the PR, but it looks a bit hacky to me to use Check() to retrieve the amount of memory from the guest. I think introducing a new function dedicated to this would be more appropriate.

I like the approach since we would rely on actual data from the guest OS, instead of having some hardcoded value from each hypervisor implementation, depending on the architecture.

That being said, we need a fallback, by proposing some default values, in case the grpc function is not supported by the agent, or return a non valid value for some reason.

@linzichang
Copy link
Contributor Author

@sboeuf So you think we should:

  1. Add a new grpc function like getMemBlockSize.
  2. In kata-runtime side, give a default values if agent don't support or fail for getMemBlockSize.

@jodh-intel
Copy link
Contributor

@linzichang - I think that's what @sboeuf is saying, yes. I have to admit, I'm still unclear about the gRPC best practice we're encountering here - should we be adding as many discrete functions as we need? Or should we be trying to provide more general functions that could return the values we need.

In this case, the latter approach might be to create a new call like:

message GuestDetailsResponse {
    uint64 block_size_bytes = 1;

    // XXX: add more fields in future when we discover we actually need more guest information.
}

rpc GetGuestDetails() returns (GuestDetailsResponse);

Of course if each gRCP call took a significant amount of time, we'd be more motivated to go for the general approach. Although we now have tracing in the runtime and although that includes gRCP client flows, it's not that easy to determine how much time is attributable directly to the protocol handling itself. As noted in kata-containers/kata-containers#27 (comment), the first gRPC call (Check health check request) is extremely slow as I think it's waiting for the gRPC server in the agent to be available (aka waiting for the VM to boot)).

However, I added a second call to that essentially "empty" / NOP call in virtcontainers/kata_agent.go and that takes between 0.7ms and 1ms on an i5 NUC which gives a very crude estimation of the end-to-end gRPC call time.

/cc @grahamwhaley.

@linzichang
Copy link
Contributor Author

@jodh-intel Like we talk in #357 , it's very wonderful for some experts to give a gRPC design guide for our project. In this single case, we can just add a getGuestInfo to get this done.

@jodh-intel
Copy link
Contributor

Hi @linzichang - I agree, but I also think we're a little uncertain about the implications of gRPC changes. I've seen comments suggesting we should effectively "do all the gRPC changes 'now'" as the project is still changing rapidly". But at the same time, we're also being cautious to avoid breaking gRPC compatibility.

We now have new stable branches and things like kata-containers/documentation#196 to think about now along with upgrade / downgrade strategies.

I don't want us to stall on these sorts of issues but I do think we need clarification on such changes asap which suggests another discussion at the Architecture Committee meeting to me.

/cc @egernst, @bergwolf, @sboeuf, @grahamwhaley.

@bergwolf
Copy link
Member

bergwolf commented Sep 7, 2018

@linzichang Could you do this in a separate gRPC call? The check gRPC is too frequent for such a one time attribute query. What about something like,

diff --git a/protocols/grpc/agent.proto b/protocols/grpc/agent.proto
index 4c09d9c..40e10e4 100644
--- a/protocols/grpc/agent.proto
+++ b/protocols/grpc/agent.proto
@@ -54,6 +54,7 @@ service AgentService {
        rpc DestroySandbox(DestroySandboxRequest) returns (google.protobuf.Empty);
        rpc OnlineCPUMem(OnlineCPUMemRequest) returns (google.protobuf.Empty);
        rpc ReseedRandomDev(ReseedRandomDevRequest) returns (google.protobuf.Empty);
+       rpc QueryResource(QueryResourceRequest) returns (QueryResourceResponse);
 }

 message CreateContainerRequest {
@@ -350,6 +351,18 @@ message ReseedRandomDevRequest {
        bytes data = 2;
 }

+message QueryResourceRequest {
+       // MemBlockSize asks server to return the system memory block size that can be used
+       // for memory hotplug alignment. Typically the server returns what's in
+       // /sys/devices/system/memory/block_size_bytes.
+       bool mem_block_size = 1;
+}
+
+message QueryResourceResponse {
+       // MemBlockSizeBytes returns the system memory block size in bytes.
+       uint64 mem_block_size_bytes = 1;
+}
+
 // Storage represents both the rootfs of the container, and any volume that
 // could have been defined through the Mount list of the OCI specification.
 message Storage {

Then kata-runtime only needs to query when necessary. And the API is extensible for future needs if we want to query more resource attributes.

@linzichang
Copy link
Contributor Author

@bergwolf Yes, after discuss in this conversation, we all decide to make new gRPC call. I will rework on this PR.

@jodh-intel
Copy link
Contributor

Thanks @linzichang - sorry for the extra work as it was my suggestion to re-use (aka abuse) Check() :)

@bergwolf's - I like your suggested solution. btw, if you use the magic ```diff, you get pretty coloured diff output 😄

@linzichang linzichang changed the title agent: make Check grpc response return memory's block_size_bytes agent: add QueryResource gRPC function Sep 7, 2018
@bergwolf
Copy link
Member

bergwolf commented Sep 7, 2018

@linzichang Please fix CI as the mock server also needs to implement the new grpc API. Other than that it looks good to me. Thanks!

@linzichang linzichang force-pushed the master branch 2 times, most recently from 485c6f4 to c215570 Compare September 10, 2018 02:17
@linzichang
Copy link
Contributor Author

Fixed CI and added a go test.

grpc_test.go Outdated
MemBlockSize: true,
}

_, err := a.QueryResource(context.TODO(), req)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about also comparing the return value with /sys/devices/system/memory/block_size_bytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think about this too but it's dispensable. I will update it.

@jodh-intel
Copy link
Contributor

jodh-intel commented Sep 10, 2018

Just one test suggestion.

lgtm

Approved with PullApprove Approved with PullApprove

@linzichang linzichang force-pushed the master branch 2 times, most recently from 455e03e to 1d8606d Compare September 10, 2018 09:45
@linzichang
Copy link
Contributor Author

PTAL

@devimc
Copy link

devimc commented Sep 10, 2018

LGTM

Approved with PullApprove Approved with PullApprove

@jodh-intel
Copy link
Contributor

Based on the discussion we just had in the Architecture Meeting, we probably want to change this to a more generic name as outlined in #358 (comment).

/cc @gnawux, @bergwolf, @egernst, @sboeuf.

@sboeuf
Copy link

sboeuf commented Sep 10, 2018

Yes I agree. @linzichang the PR looks good and is almost ready to be merged, but please change QueryResource() to GuestDetails() as it is more generic naming.

@bergwolf
Copy link
Member

@sboeuf @jodh-intel I'd like to keep the leading Query action word in the name. It makes it clear that the RPC is a query action that does not change anything in the guest, and all the other gRPCs are named as actions.

@linzichang
Copy link
Contributor Author

Yes, I think QueryGuestDetails is better.

@linzichang
Copy link
Contributor Author

The latest kata-runtime build failed.

@bergwolf
Copy link
Member

@linzichang yeah, see kata-containers/runtime#710

@jodh-intel
Copy link
Contributor

a Query prefix is fine with me, but Get is shorter ;)

@linzichang linzichang force-pushed the master branch 2 times, most recently from c2eb99c to e492e4b Compare September 11, 2018 07:34
To get memory hotplug align boundary, need read
/sys/devices/system/memory/block_size_bytes in guest.
Add a function to do this and we can also extend to
get other details in the future.

Fixes kata-containers#354

Signed-off-by: Zichang Lin <linzichang@huawei.com>
@linzichang linzichang changed the title agent: add QueryResource gRPC function agent: add GetGuestDetails gRPC function Sep 11, 2018
@linzichang
Copy link
Contributor Author

CI is OK.

Copy link
Member

@caoruidong caoruidong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants