-
Notifications
You must be signed in to change notification settings - Fork 373
Conversation
\cc @ericooper could you review this PR - please notice that cli builder will be gone, so we may want to wait adding more unit test on current code base. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of questions/comments on the openAPI bits. Review of the rest is ongoing.
Thanks guys.
For at least the functions which implement the Hypervisor interface, it'd be helpful to describe exactly what is being done with respect to the configuration, the VMM and the VM itself within each function. See comments in fc.go as a reference for what I'm hoping to read. |
virtcontainers/clh.go
Outdated
clh.vmconfig.Vsock = []chclient.VsockConfig{{Cid: defaultGuestVSockCID, Sock: v.UdsPath}} | ||
case types.Volume: | ||
|
||
if clh.config.SharedFS != config.VirtioFS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it support block based volumes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not tried but is not possible to hotAdd volumes, for K8s this the only way to go.
706f2a3
to
ef1352c
Compare
/test-ch |
2680524
to
2fa74a4
Compare
/test-ch |
2 similar comments
/test-ch |
/test-ch |
8378145
to
c9e9d80
Compare
nit on 1dd2135 Can we be consistent on naming (s/CH/CLH/) |
c9e9d80
to
08efa97
Compare
Codecov Report
@@ Coverage Diff @@
## master #2275 +/- ##
==========================================
+ Coverage 50.38% 54.04% +3.66%
==========================================
Files 111 45 -66
Lines 16067 4613 -11454
==========================================
- Hits 8095 2493 -5602
+ Misses 6984 1917 -5067
+ Partials 988 203 -785 |
621757a
to
795bcc9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jcvenegas @likebreath
The PR looks good so far! I just have a few nits ;)
clh.vmconfig.Memory.File = "/dev/shm" | ||
// Set initial amount of cpu's for the virtual machine | ||
clh.vmconfig.Cpus = chclient.CpuConfig{ | ||
// cast to int32, as openAPI has a limitation that it does not support unsigned values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes unfortunately :(
// Set initial amount of cpu's for the virtual machine | ||
clh.vmconfig.Cpus = chclient.CpuConfig{ | ||
// cast to int32, as openAPI has a limitation that it does not support unsigned values | ||
CpuCount: int32(clh.config.NumVCPUs), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, this changed into boot_vcpus
on latest CH codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I say I needed to lock the hypervisor to a commit previous to that, you want we update now or after merge this PR, the functionality for the version that is lock this PR should be enough for what it cover this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really up to you, but I don't think you should worry about it right now. A follow up PR will be fine. I simply wanted to let you know that this part had changed very recently :)
if err := clh.waitVMM(clhTimeout); err != nil { | ||
clh.Logger().WithField("error", err).WithField("output", clh.cmdOutput.String()).Warn("cloud-hypervisor init failed") | ||
clh.shutdownVirtiofsd() | ||
return err | ||
} | ||
|
||
clh.state.PID = pid | ||
if err := clh.bootVM(ctx); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One comment here. You chose to create the VMM and boot the VM in the same startSandbox()
function, but you could have done it slightly differently too. The VMM can be started with only the API socket as argument (from createSandbox()
), and later on, you can issue a VmCreate(vmConfig)
to start the VM from startSandbox()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think matching FC.go is okay for now. We can look to see if there are any measurable benefits in a follow up.
/test-ch |
/test |
795bcc9
to
10b25f6
Compare
NumQueues: clhFsQueues, | ||
QueueSize: clhFsQueueSize, | ||
}, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is fine and will work as expected 👍
NumQueues: clhFsQueues, | ||
QueueSize: clhFsQueueSize, | ||
}, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the behavior from this part. By not providing the CacheSize
, and because the OpenAPI says that it expects an int64
, I think the default value is going to be 0. But that will be translated into Some(0)
for the Rust code, meaning it will still try to use DAX with the shared memory region when really what you want is no cache at all.
Please give it a try and let me know if that's the case, but TBH I think that's something we need to fix in CH.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jcvenegas I've just opened cloud-hypervisor/cloud-hypervisor#515 to fix this. It'd be nice if you could give it a shot as I think this is the right way to configure FsConfig
.
Note that because I added dax
default to true
and cache_size
default to 8G
, you shouldn't need to specify dax
and cache_size
when the only information you get from your user is virtioFsCacheAlways = "always"
.
10b25f6
to
99f3519
Compare
// remove after PR is merged: | ||
// https://github.com/cloud-hypervisor/cloud-hypervisor/pull/480 | ||
ip := "0.0.0.0" | ||
mask := "0.0.0.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jcvenegas @likebreath I've just submitted cloud-hypervisor/cloud-hypervisor#516 which should solve this issue. Let me know if that works for you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#516 has been merged.
Can the above be removed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Doing one more pass for sanity :)
/test-ch |
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ericooper for the base, and @jcvenegas and @likebreath for this API. Neither was light lifting!
Looks good, let's put this in, and add improvements in follow on PRs.
/test |
Instead of build a command, use Cloud Hypervisor http API. Fixes: kata-containers#2165 Signed-off-by: Bo Chen <chen.bo@intel.com> Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Remove cli builder code as now that we use http client Signed-off-by: Bo Chen <chen.bo@intel.com> Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
test with recent API changes of CH. Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
remove dirtory created for VM. This should be refactored in all hypervisors Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add initial unit test around http client Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
99f3519
to
0afeb52
Compare
/test |
/test-ch |
1 similar comment
/test-ch |
@jcvenegas @likebreath can you PTAL at the CI failure? |
For ARM I just check and is not related with it, I restarted the job. The CLH CI fail is failing in different docker run every time, this is due to kata-containers/tests#2141 , so this is expected to fail until we find the root cause. I am trying seems that is a race condition with the agent, but I could not reproduce it recently, lets ignore clh CI until is more stable. |
Since this occurs w CLH on master already today (not a regression), let’s get this change in and focus on resolving the test failures. Thanks folks. |
timeStart := time.Now() | ||
cl := clh.client() | ||
for { | ||
ctx, cancel := context.WithTimeout(context.Background(), clhAPITimeout*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If, at the end of iteration N, the remaining time (w.r.t. timeout) is shorter than clhAPITimeout, we would still wait for clhAPITimeout in iteration N+1. This would make the total wait longer than timeout.
Is this Okay ?
Despite of all the noise in the PR for generated code, the main changes are in
virtcontainers/clh.go
Fixes: #2165
Depends-on: github.com/kata-containers/tests#2100
Signed-off-by: Bo Chen chen.bo@intel.com
Signed-off-by: Jose Carlos Venegas Munoz jose.carlos.venegas.munoz@intel.com