This repository has been archived by the owner on May 12, 2021. It is now read-only.

clh: use http client #2275

Merged: 9 commits, Dec 7, 2019

Conversation

Member

@jcvenegas jcvenegas commented Nov 26, 2019

  • Generate the API client using OpenAPI tools (similar to what the fc implementation does)
  • Instead of building a command line, use the Cloud Hypervisor HTTP API

Despite all the noise in the PR from generated code, the main changes are in

virtcontainers/clh.go

Fixes: #2165

Depends-on: github.com/kata-containers/tests#2100

Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>

@jcvenegas jcvenegas requested review from sameo and sboeuf November 26, 2019 16:37
@jcvenegas jcvenegas requested a review from a team as a code owner November 26, 2019 16:37
@jcvenegas
Member Author

cc @ericooper could you review this PR? Please note that the CLI builder will be gone, so we may want to hold off on adding more unit tests to the current code base.

@egernst egernst self-assigned this Nov 26, 2019
@jcvenegas jcvenegas requested a review from egernst November 26, 2019 17:27
Member

@egernst egernst left a comment


A couple of questions/comments on the openAPI bits. Review of the rest is ongoing.

Thanks guys.

@egernst
Member

egernst commented Nov 26, 2019

For at least the functions which implement the Hypervisor interface, it'd be helpful to describe exactly what is being done with respect to the configuration, the VMM and the VM itself within each function. See comments in fc.go as a reference for what I'm hoping to read.

@egernst egernst assigned egernst and unassigned egernst Nov 26, 2019
clh.vmconfig.Vsock = []chclient.VsockConfig{{Cid: defaultGuestVSockCID, Sock: v.UdsPath}}
case types.Volume:

if clh.config.SharedFS != config.VirtioFS {
Member

Does it support block-based volumes?

Member Author

I have not tried it, but it is not possible to hotAdd volumes, and for K8s hot-adding is the only way to go.

@jcvenegas jcvenegas force-pushed the ch-api-support branch 3 times, most recently from 706f2a3 to ef1352c Compare November 27, 2019 21:15
@jcvenegas
Member Author

/test-ch

@jcvenegas jcvenegas force-pushed the ch-api-support branch 3 times, most recently from 2680524 to 2fa74a4 Compare November 28, 2019 18:29
@jcvenegas
Member Author

/test-ch

@jcvenegas jcvenegas force-pushed the ch-api-support branch 2 times, most recently from 8378145 to c9e9d80 Compare December 4, 2019 17:57
@jcvenegas jcvenegas changed the title WIP: clh: use http client clh: use http client Dec 4, 2019
@egernst
Member

egernst commented Dec 4, 2019

nit on 1dd2135

Can we be consistent on naming (s/CH/CLH/)

@codecov

codecov bot commented Dec 5, 2019

Codecov Report

Merging #2275 into master will increase coverage by 3.66%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2275      +/-   ##
==========================================
+ Coverage   50.38%   54.04%   +3.66%     
==========================================
  Files         111       45      -66     
  Lines       16067     4613   -11454     
==========================================
- Hits         8095     2493    -5602     
+ Misses       6984     1917    -5067     
+ Partials      988      203     -785


@sboeuf sboeuf left a comment


@jcvenegas @likebreath
The PR looks good so far! I just have a few nits ;)

clh.vmconfig.Memory.File = "/dev/shm"
// Set initial amount of cpu's for the virtual machine
clh.vmconfig.Cpus = chclient.CpuConfig{
// cast to int32, as openAPI has a limitation that it does not support unsigned values

Yes unfortunately :(

// Set initial amount of cpu's for the virtual machine
clh.vmconfig.Cpus = chclient.CpuConfig{
// cast to int32, as openAPI has a limitation that it does not support unsigned values
CpuCount: int32(clh.config.NumVCPUs),

FYI, this changed to boot_vcpus in the latest CH codebase.

Member Author

Yes, I saw that; I needed to lock the hypervisor to a commit prior to that change. Do you want us to update now or after merging this PR? The functionality of the version locked in this PR should be enough for what this PR covers.


It's really up to you, but I don't think you should worry about it right now. A follow up PR will be fine. I simply wanted to let you know that this part had changed very recently :)

if err := clh.waitVMM(clhTimeout); err != nil {
clh.Logger().WithField("error", err).WithField("output", clh.cmdOutput.String()).Warn("cloud-hypervisor init failed")
clh.shutdownVirtiofsd()
return err
}

clh.state.PID = pid
if err := clh.bootVM(ctx); err != nil {

One comment here. You chose to create the VMM and boot the VM in the same startSandbox() function, but you could have done it slightly differently too. The VMM can be started with only the API socket as argument (from createSandbox()), and later on, you can issue a VmCreate(vmConfig) to start the VM from startSandbox().
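
A minimal sketch of the split described above, using a fake in-memory VMM rather than the real generated client. Every type and method name here (`fakeVMM`, `launch`, `vmCreate`, `vmBoot`) is hypothetical, and the separate boot step after create is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// vmConfig is a hypothetical stand-in for the generated VM configuration type.
type vmConfig struct {
	BootVcpus int32
	MemoryMiB int64
}

// fakeVMM models the lifecycle as a small state machine so that the call
// ordering can be checked without a real hypervisor.
type fakeVMM struct {
	running bool      // VMM process is up and serving its API socket
	config  *vmConfig // set once the VM has been created
	booted  bool      // set once the VM has been booted
}

// launch models starting the VMM with only the API socket as argument.
func (v *fakeVMM) launch(apiSocket string) error {
	if apiSocket == "" {
		return errors.New("API socket path required")
	}
	v.running = true
	return nil
}

// vmCreate models sending the VM configuration over the API.
func (v *fakeVMM) vmCreate(cfg vmConfig) error {
	if !v.running {
		return errors.New("VMM is not running")
	}
	v.config = &cfg
	return nil
}

// vmBoot models booting a VM that has already been created.
func (v *fakeVMM) vmBoot() error {
	if v.config == nil {
		return errors.New("VM has not been created")
	}
	v.booted = true
	return nil
}

// createSandbox only brings up the VMM; no VM exists yet.
func createSandbox(v *fakeVMM, apiSocket string) error {
	return v.launch(apiSocket)
}

// startSandbox creates and boots the VM through the API.
func startSandbox(v *fakeVMM, cfg vmConfig) error {
	if err := v.vmCreate(cfg); err != nil {
		return fmt.Errorf("create VM: %w", err)
	}
	if err := v.vmBoot(); err != nil {
		return fmt.Errorf("boot VM: %w", err)
	}
	return nil
}

func main() {
	v := &fakeVMM{}
	fmt.Println(createSandbox(v, "/tmp/example-api.sock"))
	fmt.Println(startSandbox(v, vmConfig{BootVcpus: 1, MemoryMiB: 512}))
	fmt.Println(v.booted)
}
```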

Member

I think matching FC.go is okay for now. We can look to see if there are any measurable benefits in a follow up.

@jcvenegas
Member Author

/test-ch

@jcvenegas
Member Author

/test

NumQueues: clhFsQueues,
QueueSize: clhFsQueueSize,
},
}

This part is fine and will work as expected 👍

NumQueues: clhFsQueues,
QueueSize: clhFsQueueSize,
},
}

I'm not sure about the behavior from this part. By not providing the CacheSize, and because the OpenAPI says that it expects an int64, I think the default value is going to be 0. But that will be translated into Some(0) for the Rust code, meaning it will still try to use DAX with the shared memory region when really what you want is no cache at all.
Please give it a try and let me know if that's the case, but TBH I think that's something we need to fix in CH.
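
The concern can be illustrated on the Go side with plain `encoding/json`. The two structs below are hypothetical and are not the generated chclient types; the sketch only shows how an unset plain integer is sent as 0, while an unset pointer field with `omitempty` is left out of the request, which lets the receiver tell "absent" from "zero".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fsConfigValue is a hypothetical config where the cache size is a plain
// integer: leaving it unset still serializes an explicit 0.
type fsConfigValue struct {
	Tag       string `json:"tag"`
	CacheSize int64  `json:"cache_size"`
}

// fsConfigPointer is a hypothetical config where the cache size is optional:
// a nil pointer is omitted from the JSON body entirely.
type fsConfigPointer struct {
	Tag       string `json:"tag"`
	CacheSize *int64 `json:"cache_size,omitempty"`
}

// encode marshals v to JSON and returns it as a string.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(fsConfigValue{Tag: "shared"}))
	// prints {"tag":"shared","cache_size":0}
	fmt.Println(encode(fsConfigPointer{Tag: "shared"}))
	// prints {"tag":"shared"}
}
```

Whether the generated client emits the zero value depends on the struct tags the generator produces, so this is worth checking against the generated code.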


@jcvenegas I've just opened cloud-hypervisor/cloud-hypervisor#515 to fix this. It'd be nice if you could give it a shot as I think this is the right way to configure FsConfig.
Note that because I added dax default to true and cache_size default to 8G, you shouldn't need to specify dax and cache_size when the only information you get from your user is virtioFsCacheAlways = "always".

// remove after PR is merged:
// https://github.com/cloud-hypervisor/cloud-hypervisor/pull/480
ip := "0.0.0.0"
mask := "0.0.0.0"

@jcvenegas @likebreath I've just submitted cloud-hypervisor/cloud-hypervisor#516 which should solve this issue. Let me know if that works for you.

Contributor

#516 has been merged.
Can the above be removed?

Member

@egernst egernst left a comment


Looking good. Doing one more pass for sanity :)

@egernst egernst self-requested a review December 6, 2019 16:41
@likebreath
Contributor

/test-ch

@egernst
Member

egernst commented Dec 6, 2019

/test

Member

@egernst egernst left a comment


Thanks @ericooper for the base, and @jcvenegas and @likebreath for this API. Neither was light lifting!

Looks good, let's put this in, and add improvements in follow on PRs.

@jcvenegas
Member Author

/test

Instead of building a command line, use the Cloud Hypervisor HTTP API.

Fixes: kata-containers#2165

Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>

Remove the CLI builder code now that we use the HTTP client.

Signed-off-by: Bo Chen <chen.bo@intel.com>
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>

Test with recent API changes of CH.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>

Remove the directory created for the VM. This should be refactored in all
hypervisors.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>

Add initial unit test around the HTTP client.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>

@jcvenegas
Member Author

/test

@jcvenegas
Member Author

/test-ch

@egernst
Member

egernst commented Dec 7, 2019

@jcvenegas @likebreath can you PTAL at the CI failure?

@jcvenegas
Member Author

For ARM, I just checked and the failure is not related to this PR; I restarted the job.

The CLH CI is failing in a different docker run every time. This is due to kata-containers/tests#2141, so it is expected to fail until we find the root cause. It seems to be a race condition with the agent, but I could not reproduce it recently. Let's ignore the CLH CI until it is more stable.

@egernst
Member

egernst commented Dec 7, 2019

Since this already occurs with CLH on master today (not a regression), let’s get this change in and focus on resolving the test failures. Thanks folks.

@egernst egernst merged commit a660d80 into kata-containers:master Dec 7, 2019
timeStart := time.Now()
cl := clh.client()
for {
ctx, cancel := context.WithTimeout(context.Background(), clhAPITimeout*time.Second)
Contributor

If, at the end of iteration N, the remaining time (w.r.t. timeout) is shorter than clhAPITimeout, we would still wait for clhAPITimeout in iteration N+1. This would make the total wait longer than timeout.

Is this okay?

@jcvenegas jcvenegas deleted the ch-api-support branch January 23, 2020 19:44
Development

Successfully merging this pull request may close these issues.

Cloud Hypervisor: Use HTTP API
6 participants