hypervisor/qemu: add memory hotplug support #470
Conversation
To include vm factory related commits. Full list:

54caf78 (mine/templating, templating) qmp: add hotplug memory
e66a9b4 qemu: add appendMemoryKnobs helper
8aeca15 qmp: add migrate set arguments
a03d496 qmp: add set migration capabilities
0ace417 qemu: allow to set migration incoming
723bc5f qemu: allow to create a stopped guest
283d7df qemu: add file backed memory device support

Signed-off-by: Peng Tao <bergwolf@gmail.com>
I got the following warning after upgrading the dep tool:

Warning: the following project(s) have [[constraint]] stanzas in Gopkg.toml:
✗ github.com/hashicorp/yamux
However, these projects are not direct dependencies of the current project: they are not imported in any .go files, nor are they in the 'required' list in Gopkg.toml. Dep only applies [[constraint]] rules to direct dependencies, so these rules will have no effect. Either import/require packages from these projects so that they become direct dependencies, or convert each [[constraint]] to an [[override]] to enforce rules on these projects, if they happen to be transitive dependencies.

So let's convert the yamux constraint to an override. In the meanwhile, update the yamux vendor. Full commit list:

4c2fe0d (origin/b-consul-3040) Dont output keepalive error when the session is closed
f21aae5 Make sure to drain the timer channel on defer, and a clarifying comment
601ccd8 Make receive window update logic a bit cleaner
02d320c Uses timer pool in sendNoWait, like in waitForSendErr
cf433c5 window update unit test for partial read; benchmark large buffer
ca8dfd0 improve memory utilization in receive buffer, fix flow control
683f491 Fix race around read and write deadlines in Stream (#52)
40b86b2 Add public session CloseChan method (#44)

Note that commit 4c2fe0d might also help kata-containers/agent/issues/231.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
Codecov Report

@@            Coverage Diff             @@
##           master     #470      +/-   ##
==========================================
- Coverage    64.1%   64.05%    -0.06%
==========================================
  Files          87       87
  Lines        8885     8925       +40
==========================================
+ Hits         5696     5717       +21
- Misses       2577     2594       +17
- Partials      612      614        +2

Continue to review full report at Codecov.
virtcontainers/qemu.go (outdated)

	}

	hostMemMb := uint64(float64(hostMemKb / 1024))
	return uint64(float64(hostMemKb / 1024)), nil
This doesn't look right to me - should this be float64(hostMemKb)? Or can we just drop the float64()?
Yes, indeed. I was following the old code but you are right that float64 is unnecessary here. I'll fix it. Thanks!
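For reference, a minimal sketch of the simplified conversion being discussed; the helper name hostMemoryMB and the standalone form are illustrative, not the actual qemu.go code:

```go
package main

import "fmt"

// hostMemoryMB converts a memory size reported in KiB (for example the
// value parsed from /proc/meminfo) into MiB. Plain integer division is
// enough, so the float64 round-trip can simply be dropped.
func hostMemoryMB(hostMemKb uint64) uint64 {
	return hostMemKb / 1024
}

func main() {
	fmt.Println(hostMemoryMB(16384256)) // prints 16000
}
```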
virtcontainers/qemu.go (outdated)

@@ -837,6 +852,61 @@ func (q *qemu) hotplugRemoveCPUs(amount uint32) (uint32, error) {
	return amount, q.sandbox.storage.storeHypervisorState(q.sandbox.id, q.state)
}

func (q *qemu) hotplugMemory(slot uint32, memMB int, op operation) error {
	if memMB <= 0 {
		return errors.New("cannot hotplug zero or negative MB memory")
Could you make the memMB a uint64 to avoid having to add this check? Same question for the function below.
Sure. But I'm keeping the one in q.hotplugAddMemory since govmm requires an int, and a sanity check is better done early than late.
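A minimal sketch of the agreed shape, assuming the downstream govmm call still takes an int; the plug callback and function name are stand-ins, not the real runtime API:

```go
package main

import "fmt"

// hotplugMemoryMB takes the amount as a uint64, so a negative value is
// impossible by construction and the old "memMB <= 0" check reduces to
// a zero check. The conversion to int happens only at the boundary
// where the QMP layer (govmm) needs it.
func hotplugMemoryMB(memMB uint64, plug func(amountMB int) error) error {
	if memMB == 0 {
		return fmt.Errorf("cannot hotplug 0 MiB of memory")
	}
	return plug(int(memMB))
}

func main() {
	err := hotplugMemoryMB(512, func(amountMB int) error {
		fmt.Printf("would hotplug %d MiB via QMP\n", amountMB)
		return nil
	})
	fmt.Println("err:", err)
}
```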
Branch updated from 220f231 to 2e2a89b.
@jodh-intel PR updated. PTAL.
@jcvenegas PTAL!
virtcontainers/qemu.go (outdated)

	return q.hotplugAddMemory(slot, int(memMB))
}

func (q *qemu) hotplugAddMemory(slot uint32, amount int) error {
The QMP function takes an ID, so I wonder if, for clarity, it would be better to name this id, or, if you want to keep slot, to make clear that we are indeed adding a slot.
The idea is to convert the slot number to an ID; either we do it in this function, or we need to do it somewhere else.
I am fine with either case. I ask because a reader might think it is the slot number where the memory will be inserted in the VM. That may not always be true: QEMU inserts it into the first empty slot.
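A small illustration of the slot-to-ID conversion being discussed; the "mem%d" format follows the object_add example mentioned later in the thread, but treat the exact naming as an assumption. As noted above, QEMU itself decides which physical slot the DIMM ends up in:

```go
package main

import "fmt"

// memoryDeviceID turns a runtime-side counter into the object ID passed
// to the QMP hotplug command ("mem0", "mem1", ...). The counter only
// guarantees a unique ID; QEMU places the DIMM in the first empty slot
// when the device is actually inserted.
func memoryDeviceID(slot uint32) string {
	return fmt.Sprintf("mem%d", slot)
}

func main() {
	fmt.Println(memoryDeviceID(0)) // mem0
	fmt.Println(memoryDeviceID(3)) // mem3
}
```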
	}

	// calculate current memory
	currentMemory := int(q.config.DefaultMemSz)
This configuration is the one that comes from kata-containers.toml. If that is the case, it may change whenever the user edits the value, and the "current" memory won't be the same memory that we used to boot.
If so, we have a few options here:
- When we start the VM, save the memory that was used to start it.
- Query QMP to get the current memory in the guest.
I think saving the amount of memory used when the VM was started is the best option here. Or we could simply save the current amount of memory plugged into the VM (I am only thinking of this because we cannot unplug).
This configuration is the one that comes from kata-containers.toml. If that is the case, it may change whenever the user edits the value, and the "current" memory won't be the same memory that we used to boot.

Sorry, I don't quite follow. Do you mean the sandbox config we used to boot can change later, when someone connects to the sandbox again? That doesn't sound like a solid solution. I think we only use kata-containers.toml when creating the sandbox, and all later accesses to the sandbox fetch the sandbox config from persistent metadata via the sandbox ID. Am I missing something?
@bergwolf yes, well, I am considering a future case where someone wants to update the memory in a container and we want to increase the memory; in that case the config file may have been modified.
Another case is creating a container inside a pod: the pod may be created with one config file, the file may then be updated with a different default memory, and a container is then created inside the pod.
But I agree the current use case does not have any issue with that.
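A sketch of the bookkeeping option discussed above (saving the boot-time memory and the total hotplugged so far, instead of re-reading the config); the type and field names are illustrative, not the actual qemu.go layout:

```go
package main

import "fmt"

// guestMemState records the memory the VM booted with plus everything
// hotplugged since, so the current total no longer depends on
// re-reading a configuration file that may have changed after boot.
type guestMemState struct {
	BootMemoryMB       int // memory the VM was started with
	HotpluggedMemoryMB int // total added via memory hotplug
}

func (s *guestMemState) currentMemoryMB() int {
	return s.BootMemoryMB + s.HotpluggedMemoryMB
}

func main() {
	s := &guestMemState{BootMemoryMB: 2048}
	s.HotpluggedMemoryMB += 1024
	fmt.Println(s.currentMemoryMB()) // prints 3072
}
```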
@bergwolf I think this is the initial implementation to make VM templating work, right? Do we expect to add more changes to this PR, or do you only want to add this for now?
virtcontainers/qemu.go (outdated)

	// rule out extremely large parameter
	if memMB > math.MaxInt32 {
		return fmt.Errorf("cannot hotplug %d MB memory: size too large", memMB)
Nit: the kata config file asks for memory in MiB; we may want to follow the same unit.
OK, I'll fix the error message. The actual code does take memory in MiB.
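For clarity, a sketch of the guard with the MiB wording; checkHotplugSize is an illustrative helper, not the function in the patch:

```go
package main

import (
	"fmt"
	"math"
)

// checkHotplugSize rejects requests that cannot be represented as an
// int on 32-bit platforms before they reach the QMP layer. The error
// message uses MiB to match the unit used by the configuration file.
func checkHotplugSize(memMB uint64) error {
	if memMB > math.MaxInt32 {
		return fmt.Errorf("cannot hotplug %d MiB memory: size too large", memMB)
	}
	return nil
}

func main() {
	fmt.Println(checkHotplugSize(1024))              // <nil>
	fmt.Println(checkHotplugSize(math.MaxInt32 + 1)) // error
}
```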
"fmt" | ||
"time" | ||
) | ||
|
I am curious how re-vendoring govmm ended up adding those two files, github.com/docker/go-units/duration.go and docker/go-units/ulimit.go.
Indirect dependency? These were added by dep anyway.
	HotpluggedVCPUs []CPUDevice
	UUID             string
	HotpluggedVCPUs  []CPUDevice
	HotpluggedMemory int
Why aren't you using a memoryDevice type instead here? Also, this would force this structure to be public.
Why would you want memoryDevice to be public? Are you suggesting adding a sandbox.HotplugMemory API?
I did not change this because memoryDevice.slot is internal knowledge. Even if we have sandbox.HotplugMemory, we should not expose it. @sboeuf let me know if you think otherwise.
Sorry, what I meant is that we could have something like HotpluggedMemory []MemoryDevice, the same way we have HotpluggedVCPUs []CPUDevice.
No, I don't think we need that. We only need to track how much memory has been hotplugged. Unlike CPU devices, we do not support memory hot unplug, nor would we need to unplug memory in the same chunks as it was plugged, even if we support memory hot unplug in the future.
Fair enough!
@bergwolf Sorry, I don't quite understand why you put the "slot" field in struct MemoryDevice rather than struct QemuState, since user input will not involve the "slot" field. When we hotplug a memory device, we use a QMP command like "object_add ... mem1 ...", in which "1" is the memory slot number. In my opinion, the "slot" field should be stored in QemuState (hypervisor.json) so that we can get the memory slot number when we need to hotplug memory.
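A self-contained sketch of the state layout the thread converges on: one integer in QemuState is enough because memory is never unplugged, and a slot/ID counter could also live there, as suggested in the last comment. Field names such as MemSlot and the memoryDevice definition are assumptions for illustration, not the exact upstream code:

```go
package main

import "fmt"

// CPUDevice mirrors the existing per-vCPU record kept in QemuState.
type CPUDevice struct {
	ID string
}

// QemuState is the persisted hypervisor state (hypervisor.json above).
// A single HotpluggedMemory counter suffices since memory is never
// unplugged; MemSlot tracks the next slot/ID number for an object_add.
type QemuState struct {
	UUID             string
	HotpluggedVCPUs  []CPUDevice
	HotpluggedMemory int    // MiB added since boot
	MemSlot          uint32 // next slot/ID number to use
}

// memoryDevice is the internal hotplug request built by the runtime.
type memoryDevice struct {
	slot   uint32
	sizeMB int
}

func main() {
	state := QemuState{HotpluggedMemory: 1024, MemSlot: 1}
	dev := memoryDevice{slot: state.MemSlot, sizeMB: 512}
	state.HotpluggedMemory += dev.sizeMB
	state.MemSlot++
	fmt.Printf("hotplugged total: %d MiB, next slot: %d\n",
		state.HotpluggedMemory, state.MemSlot)
}
```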
virtcontainers/qemu.go (outdated)

@@ -837,6 +853,64 @@ func (q *qemu) hotplugRemoveCPUs(amount uint32) (uint32, error) {
	return amount, q.sandbox.storage.storeHypervisorState(q.sandbox.id, q.state)
}

func (q *qemu) hotplugMemory(slot uint32, memMB uint64, op operation) error {
You could use a memoryDevice instead of the slot uint32, memMB uint64 parameters.
Sure, I'll change it.
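The suggested reshaping, sketched with stand-in types; operation, the qemu struct and hotplugAddMemory here are simplified stubs rather than the real virtcontainers code:

```go
package main

import (
	"errors"
	"fmt"
)

type operation int

const (
	addDevice operation = iota
	removeDevice
)

// memoryDevice bundles everything the hotplug path needs, so the entry
// point takes one parameter instead of (slot uint32, memMB uint64).
type memoryDevice struct {
	slot   uint32
	sizeMB int
}

// qemu is a minimal stand-in for the real hypervisor type.
type qemu struct{}

func (q *qemu) hotplugAddMemory(memDev *memoryDevice) error {
	// The real code would issue a QMP object-add here; this just logs.
	fmt.Printf("object-add pc-dimm mem%d, %d MiB\n", memDev.slot, memDev.sizeMB)
	return nil
}

func (q *qemu) hotplugMemory(memDev *memoryDevice, op operation) error {
	if op != addDevice {
		// Memory hot-unplug is not supported.
		return errors.New("memory hot unplug is not supported")
	}
	return q.hotplugAddMemory(memDev)
}

func main() {
	q := &qemu{}
	_ = q.hotplugMemory(&memoryDevice{slot: 0, sizeMB: 256}, addDevice)
}
```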
@@ -169,6 +174,7 @@ func (q *qemu) init(sandbox *Sandbox) error {
		return err
	}

	q.vmConfig = sandbox.config.VMConfig
The memory coming through the Resources structure in sandbox.config.VMConfig is part of some legacy code that should be removed IMO. We need a unified way to provide the resources, and that's using the HypervisorConfig that we already have through q.config. It is a bit odd to have CPU and memory resources provided through two different structures.
Yes, indeed it is weird but I'd expect the clean up to be handled in a separate PR.
Why don't you do that directly as part of this PR (separate commit of course)? Do you think that's a lot of work?
As discussed in the arch meeting, it will break on-disk metadata compatibility, so I prefer to defer it to a standalone PR at least.
As discussed during the Arch Committee meeting, this rework will be done in a separate PR to prevent API breakage.
	}

	// calculate current memory
	currentMemory := int(q.config.DefaultMemSz)
I think saving the amount of memory used when the VM was started is the best option here. Or we could simply save the current amount of memory plugged into the VM (I am only thinking of this because we cannot unplug).
This is all I need for the vm factory. Feel free to add more PRs based on it to support memory hotplug in other use cases. I assume that you want to work on that, right?
So that we can add more memory to an existing guest.

Fixes: kata-containers#469

Signed-off-by: Peng Tao <bergwolf@gmail.com>
@jcvenegas @sboeuf PR updated. PTAL.
As discussed in #303, I'm splitting the memory hotplug part out of the vm factory PR to get more eyes on it. The govmm part of the change is already merged, so we just need to change qemu to call the APIs properly.
However, because this is just part of the vm factory PR, there is no actual caller for the internal memory device hotplug API yet. That said, I did test it roughly with the following hack and saw the QMP hotplug commands succeed: