This repository has been archived by the owner on May 12, 2021. It is now read-only.

hypervisor/qemu: add memory hotplug support #470

Merged
merged 3 commits into kata-containers:master on Jul 9, 2018

Conversation

@bergwolf (Member) commented Jul 5, 2018

As discussed in #303, I'm splitting the memory hotplug part out of the vm factory PR to get more eyes on it. The govmm part of the change is already merged, so we just need to change qemu to call the APIs properly.

However, because this is just part of the vm factory PR, there is no actual caller for the internal memory device hotplug API yet. That said, I did test it roughly with the following hack and saw the QMP hotplug commands succeed:

diff --git a/virtcontainers/sandbox.go b/virtcontainers/sandbox.go
index 87f8eda..6bba236 100644
--- a/virtcontainers/sandbox.go
+++ b/virtcontainers/sandbox.go
@@ -947,6 +947,14 @@ func (s *Sandbox) startVM() error {

        s.Logger().Info("VM started")

+       // HACK: test memory hotplug
+       defer func() {
+               _, err := s.hypervisor.hotplugAddDevice(&memoryDevice{0, 128}, memoryDev)
+               if err != nil {
+                       s.Logger().WithError(err).Warn("bergwolf: memory hotplug failed")
+               }
+
+       }()
        // Once startVM is done, we want to guarantee
        // that the sandbox is manageable. For that we need
        // to start the sandbox inside the VM.

bergwolf added 2 commits July 5, 2018 15:25
To include vm factory related commits. Full list:
54caf78 (mine/templating, templating) qmp: add hotplug memory
e66a9b4 qemu: add appendMemoryKnobs helper
8aeca15 qmp: add migrate set arguments
a03d496 qmp: add set migration capabilities
0ace417 qemu: allow to set migration incoming
723bc5f qemu: allow to create a stopped guest
283d7df qemu: add file backed memory device support

Signed-off-by: Peng Tao <bergwolf@gmail.com>
I got the following warning after upgrading the dep tool:

Warning: the following project(s) have [[constraint]] stanzas in Gopkg.toml:

  ✗  github.com/hashicorp/yamux

However, these projects are not direct dependencies of the current project:
they are not imported in any .go files, nor are they in the 'required' list in
Gopkg.toml. Dep only applies [[constraint]] rules to direct dependencies, so
these rules will have no effect.

Either import/require packages from these projects so that they become direct
dependencies, or convert each [[constraint]] to an [[override]] to enforce rules
on these projects, if they happen to be transitive dependencies,

So let's convert the constraint to an override for yamux. In the meantime,
update the yamux vendor. Full commit list:

4c2fe0d (origin/b-consul-3040) Dont output keepalive error when the session is closed
f21aae5 Make sure to drain the timer channel on defer, and a clarifying comment
601ccd8 Make receive window update logic a bit cleaner
02d320c Uses timer pool in sendNoWait, like in waitForSendErr
cf433c5 window update unit test for partial read; benchmark large buffer
ca8dfd0 improve memory utilization in receive buffer, fix flow control
683f491 Fix race around read and write deadlines in Stream (#52)
40b86b2 Add public session CloseChan method (#44)

Note that commit 4c2fe0d might also help kata-containers/agent/issues/231.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
@katacontainersbot (Contributor)

PSS Measurement:
Qemu: 160289 KB
Proxy: 4516 KB
Shim: 8826 KB

Memory inside container:
Total Memory: 2045968 KB
Free Memory: 2007220 KB

@bergwolf requested review from sboeuf and amshinde and removed the request for sboeuf on July 5, 2018 09:40
@katacontainersbot (Contributor)

PSS Measurement:
Qemu: 160700 KB
Proxy: 4477 KB
Shim: 8898 KB

Memory inside container:
Total Memory: 2045968 KB
Free Memory: 2006228 KB

@codecov bot commented Jul 5, 2018

Codecov Report

Merging #470 into master will decrease coverage by 0.05%.
The diff coverage is 27.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #470      +/-   ##
==========================================
- Coverage    64.1%   64.05%   -0.06%     
==========================================
  Files          87       87              
  Lines        8885     8925      +40     
==========================================
+ Hits         5696     5717      +21     
- Misses       2577     2594      +17     
- Partials      612      614       +2
Impacted Files Coverage Δ
virtcontainers/hypervisor.go 72.78% <ø> (ø) ⬆️
virtcontainers/qemu.go 22.84% <27.65%> (+2.4%) ⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

}

hostMemMb := uint64(float64(hostMemKb / 1024))
return uint64(float64(hostMemKb / 1024)), nil
Contributor

This doesn't look right to me - should this be float64(hostMemKb)? Or can we just drop the float64()?

Member Author

Yes, indeed. I was following the old code but you are right that float64 is unnecessary here. I'll fix it. Thanks!
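
For reference, a minimal sketch of the fix being agreed on here (the helper name hostMemoryMB is illustrative; only the dropped float64 conversion matters):

// hostMemKb is the host memory in KiB, e.g. parsed from /proc/meminfo.
// Integer division is all that is needed to convert it to MiB; the
// float64() round-trip in the original line had no effect because the
// division had already happened on integers.
func hostMemoryMB(hostMemKb uint64) (uint64, error) {
	return hostMemKb / 1024, nil
}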

@@ -837,6 +852,61 @@ func (q *qemu) hotplugRemoveCPUs(amount uint32) (uint32, error) {
return amount, q.sandbox.storage.storeHypervisorState(q.sandbox.id, q.state)
}

func (q *qemu) hotplugMemory(slot uint32, memMB int, op operation) error {
if memMB <= 0 {
return errors.New("cannot hotplug zero or negative MB memory")
Contributor

Could you make the memMB a uint64 to avoid having to add this check? Same question for the function below.

Member Author

Sure. But I'm keeping the check in q.hotplugAddMemory, since govmm requires an int and a sanity check is better done early rather than late.
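
Putting the quoted hunks together, the shape being discussed looks roughly like this (a sketch assembled from the diffs in this review; the operation constant addDevice and the exact error messages are assumptions):

// hotplugMemory now takes an unsigned size, so the negative check goes
// away; the only remaining sanity check is that the value fits in the
// int that govmm expects.
func (q *qemu) hotplugMemory(slot uint32, memMB uint64, op operation) error {
	if memMB > math.MaxInt32 {
		return fmt.Errorf("cannot hotplug %d MiB memory: size too large", memMB)
	}
	if op != addDevice {
		return errors.New("memory hot unplug is not supported")
	}
	return q.hotplugAddMemory(slot, int(memMB))
}

(This also uses MiB in the error message, per the review nit further down.)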

@bergwolf force-pushed the memory-hotplug branch 3 times, most recently from 220f231 to 2e2a89b on July 5, 2018 15:35
@bergwolf (Member Author) commented Jul 5, 2018

@jodh-intel PR updated. PTAL.

@sboeuf requested a review from jcvenegas on July 5, 2018 15:43
@sboeuf commented Jul 5, 2018

@jcvenegas PTAL!

@katacontainersbot (Contributor)

PSS Measurement:
Qemu: 164755 KB
Proxy: 4742 KB
Shim: 8790 KB

Memory inside container:
Total Memory: 2045968 KB
Free Memory: 2007268 KB

@gnawux (Member) commented Jul 5, 2018

lgtm

Approved with PullApprove

return q.hotplugAddMemory(slot, int(memMB))
}

func (q *qemu) hotplugAddMemory(slot uint32, amount int) error {
Member

The QMP function takes an ID. I wonder if, for clarity, it would be better to name this parameter id, or, if you want to keep slot, to make it clear that we are indeed adding a slot.

Member Author

The idea is to convert the slot number to an ID; either we do it in this function, or we need to do it somewhere else.

Member

I am fine either way. I ask because a reader might think this is the slot number where the memory will be inserted in the VM. That may not always be true; qemu inserts it into the first empty slot.
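
For readers not familiar with the QMP side, the conversion under discussion is essentially turning the slot number into a unique object ID for the hotplug command. A rough sketch, assuming a govmm call named ExecHotplugMemory and the q.qmpMonitorCh/q.state fields used elsewhere in qemu.go (the exact govmm signature and argument values may differ):

func (q *qemu) hotplugAddMemory(slot uint32, amount int) error {
	// The slot number is only used to build a unique device ID such as
	// "mem0", "mem1", ...; qemu itself places the memory in the first
	// empty slot, as noted above.
	memDevID := fmt.Sprintf("mem%d", slot)

	// Hypothetical govmm call; argument order/names are illustrative.
	if err := q.qmpMonitorCh.qmp.ExecHotplugMemory(q.qmpMonitorCh.ctx,
		"memory-backend-ram", memDevID, "", amount); err != nil {
		return err
	}

	q.state.HotpluggedMemory += amount
	return q.sandbox.storage.storeHypervisorState(q.sandbox.id, q.state)
}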

}

// calculate current memory
currentMemory := int(q.config.DefaultMemSz)
Member

This configuration is the one that comes from kata-containers.toml. If that is the case, it may change when the user changes the value, and then the current memory won't be the same memory that we used to boot.

If that is the case, we have a few options here:

  • When we start the VM, save the memory that was used to start it.
  • Query QMP to get the current memory in the guest.


I think saving the amount of memory used when the VM was started is the best option here. Or we could simply save the current amount of memory plugged into the VM (I am only thinking about this because we cannot unplug).

Member Author

@jcvenegas

This configuration is the one that comes from kata-containers.toml. If that is the case, it may change when the user changes the value, and then the current memory won't be the same memory that we used to boot.

Sorry, I don't quite follow. Do you mean the sandbox config we used to boot can change later when someone connects to the sandbox again? That doesn't sound like a solid approach. But I think we only use kata-containers.toml when creating the sandbox, and later accesses to the sandbox all fetch the sandbox config from persistent metadata via the sandbox ID. Am I missing something?

Member

@bergwolf Yes, I am considering a future case where someone wants to update the memory of a container and we want to increase it. In that case it is possible that the config file has been modified.

Another case is when we create a container inside a pod. The pod may be created with one config file, but the default memory may then be changed before a container is created inside the pod.

But I agree that the current use case does not have any issue with this.
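
For the current use case, the "current memory" can be derived from the boot-time size plus what has already been hotplugged, e.g. (a sketch using the DefaultMemSz and HotpluggedMemory fields quoted in this review; the helper name is illustrative):

// currentMemoryMB returns the boot-time memory plus everything hotplugged
// so far. This is only correct as long as the stored hypervisor config is
// the one the VM actually booted with, which is exactly the concern above.
func (q *qemu) currentMemoryMB() int {
	return int(q.config.DefaultMemSz) + q.state.HotpluggedMemory
}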

@jcvenegas (Member)

@bergwolf I think this is the initial implementation to make VM templating work, right?

Do we expect to add more changes to this PR, or do you only want to add this for now?


// rule out extremely large parameter
if memMB > math.MaxInt32 {
return fmt.Errorf("cannot hotplug %d MB memory: size too large", memMB)
@jcvenegas (Member) commented Jul 5, 2018

Nit: the kata config file asks for memory in MiB; we may want to use the same unit here.

Member Author

OK, I'll fix the error message. The actual code does take memory in MiB.

"fmt"
"time"
)


I am curious how re-vendoring govmm ended up adding these two files: github.com/docker/go-units/duration.go and github.com/docker/go-units/ulimit.go.

Member Author

Indirect dependency? These were added by dep anyway.

HotpluggedVCPUs []CPUDevice
UUID string
HotpluggedVCPUs []CPUDevice
HotpluggedMemory int

Why aren't you using a memoryDevice type instead here? Also, this would force this structure to be public.

Member Author

Why would you want memoryDevice to be public? Are you suggesting adding a sandbox.HotplugMemory API?

Member Author

I did not change this because memoryDevice.slot is internal knowledge. Even if we have sandbox.HotplugMemory, we should not expose it. @sboeuf let me know if you think otherwise.


Sorry, what I meant is that we could have something like HotpluggedMemory []MemoryDevice the same way we have HotpluggedVCPUs []CPUDevice.

Member Author

No, I don't think we need that. We only need to track how much memory has been hotplugged. Unlike CPU devices, we do not support memory hot-unplug, and even if we do support it in the future, we would not need to unplug memory in the same chunks as it was plugged in.


Fair enough!
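
So the state side stays as simple as the following (an abridged sketch; the real QemuState has more fields than shown):

// QemuState (abridged): only the total amount of hotplugged memory is
// recorded, because memory hot-unplug is not supported and, unlike
// HotpluggedVCPUs, there is no need to remember individual devices.
type QemuState struct {
	UUID             string
	HotpluggedVCPUs  []CPUDevice
	HotpluggedMemory int // MiB
}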

@ghost commented Aug 22, 2018

@bergwolf Sorry, I don't quite understand why you put the "slot" field in struct MemoryDevice rather than struct QemuState, since the input from the user will not involve a "slot" field. When we hotplug a memory device, we use a QMP command like "object_add ... mem1 ...", in which "1" is the memory slot number. In my opinion, the "slot" field should be stored in QemuState (hypervisor.json) so that we can get the memory slot number when we need to hotplug memory.

@@ -837,6 +853,64 @@ func (q *qemu) hotplugRemoveCPUs(amount uint32) (uint32, error) {
return amount, q.sandbox.storage.storeHypervisorState(q.sandbox.id, q.state)
}

func (q *qemu) hotplugMemory(slot uint32, memMB uint64, op operation) error {

You could use a memoryDevice instead of the slot uint32, memMB uint64 parameters.

Member Author

Sure, I'll change it.
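
That is, the signature would become something like this (the memoryDevice field names slot and sizeMB are assumptions; the size/operation checks from the earlier sketch stay the same and are elided here):

type memoryDevice struct {
	slot   uint32
	sizeMB uint64
}

func (q *qemu) hotplugMemory(memDev *memoryDevice, op operation) error {
	// sanity checks elided; see the earlier sketch
	return q.hotplugAddMemory(memDev.slot, int(memDev.sizeMB))
}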

@@ -169,6 +174,7 @@ func (q *qemu) init(sandbox *Sandbox) error {
return err
}

q.vmConfig = sandbox.config.VMConfig

The memory coming through the Resources structure in sandbox.config.VMConfig is part of some legacy code that should be removed, IMO. We need a unified way to provide the resources, and that's the HypervisorConfig we already have through q.config.
It is a bit odd to have CPU and memory resources provided through two different structures.

Member Author

Yes, indeed it is weird, but I'd expect the cleanup to be handled in a separate PR.


Why don't you do that directly as part of this PR (separate commit, of course)? Do you think that's a lot of work?

Member Author

As discussed in the Arch meeting, it will break on-disk metadata compatibility, so I prefer to defer it to a standalone PR at least.


As discussed during the Arch Committee meeting, this rework will be done in a separate PR to prevent API breakage.


@bergwolf (Member Author) commented Jul 6, 2018

@jcvenegas

Do we expect to add more changes to this PR, or do you only want to add this for now?

This is all I need for the vm factory. Feel free to add more PRs based on it to support memory hotplug in other use cases. I assume you want to work on that, right?

So that we can add more memory to an existing guest.

Fixes: kata-containers#469

Signed-off-by: Peng Tao <bergwolf@gmail.com>
@bergwolf (Member Author) commented Jul 9, 2018

@jcvenegas @sboeuf PR updated. PTAL.

@katacontainersbot (Contributor)

PSS Measurement:
Qemu: 164587 KB
Proxy: 4566 KB
Shim: 9016 KB

Memory inside container:
Total Memory: 2045968 KB
Free Memory: 2006980 KB

@sboeuf merged commit a8952fb into kata-containers:master on Jul 9, 2018
@egernst mentioned this pull request Aug 22, 2018
@sboeuf added the feature (New functionality) label Sep 12, 2018
@bergwolf deleted the memory-hotplug branch September 13, 2018 03:27
zklei pushed a commit to zklei/runtime that referenced this pull request Jun 13, 2019