
Add support for assigning cpu group on creation #1006

Merged
merged 1 commit on Apr 28, 2021

Conversation

dcantah
Contributor

@dcantah dcantah commented Apr 21, 2021

In recent builds of Windows, support was added to the HCS to allow assigning
a cpugroup at creation time of the VM instead of afterwards. The current approach
in this repo of adding the VM to a cpugroup after start was only a workaround, as
creation-time assignment wasn't supported at the time. It also isn't ideal: on
machines with more than one NUMA node, adding a VM to a cpugroup after start can
incur performance penalties from remote memory access.

Signed-off-by: Daniel Canter dcanter@microsoft.com
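
For context, the gist of the change is roughly the following. This is a minimal,
self-contained sketch of the idea only; the type and field names (and the group ID)
are illustrative assumptions, not the exact hcsshim/HCS schema.

package main

import (
	"encoding/json"
	"fmt"
)

// cpuGroup carries the ID of the group the VM should be created in (assumed shape).
type cpuGroup struct {
	ID string `json:"Id,omitempty"`
}

// processor stands in for the processor section of the VM's create document (assumed shape).
type processor struct {
	Count    int32     `json:"Count,omitempty"`
	CPUGroup *cpuGroup `json:"CpuGroup,omitempty"`
}

func main() {
	// With creation-time support, the cpugroup ID travels inside the create
	// document, so the VM starts in the target group instead of being moved
	// into it after boot. Hypothetical group ID for illustration only.
	p := processor{
		Count:    2,
		CPUGroup: &cpuGroup{ID: "00000000-0000-0000-0000-000000000001"},
	}
	b, _ := json.MarshalIndent(p, "", "  ")
	fmt.Println(string(b))
}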

@dcantah dcantah requested a review from a team as a code owner April 21, 2021 21:53
@dcantah dcantah marked this pull request as draft April 21, 2021 23:05
@dcantah dcantah marked this pull request as ready for review April 26, 2021 17:25
}

// SetCPUGroup sets up the cpugroup for the VM with the requested id
func (uvm *UtilityVM) SetCPUGroup(ctx context.Context, id string) error {
Member

So we can't set it at runtime anymore?

Contributor Author

From what I can tell this was just a workaround as we didn't support assigning it at creation time. I'd asked @katiewasnothere if we even needed to support runtime add anymore and there were no concerns, but I can certainly leave this in.

Contributor

Yeah, this was just a workaround. The expectation is that the orchestrator or host owner will create and manage cpugroups, so if they want to move a VM to a different cpugroup during runtime, they would have to handle it themselves; we don't provide knobs for that.

I don't think we need to keep this workaround, but make sure we still have a build version check for assigning during create.

Member

I think we can expect (at least for the time being) cpugroup creation/config/deletion to be handled out-of-band, but it seems like assigning a VM to a cpugroup is something we should own. It seems reasonable to me to keep this exposed.

Contributor

By assigning do you mean just the initial assignment or additional reassignments during runtime?

Contributor

The point of us supporting the initial assignment is just so we can ensure that the VM starts immediately in the target cpugroup and that the VM's memory and CPU resources come from the same NUMA node(s) on start.

I don't think that exposing reassignment through us adds much, and since the orchestrator is already in charge of managing the rest of a cpugroup's aspects, I think it would be odd to have them call into us specifically for reassignment (this would need to be done through an update pod request).

Another thing to note: CO+ builds will support directly moving VMs between cpugroups (in other words, no need to move to the null group and then to the target group on newer builds).

Thoughts?

Contributor Author

Thanks for the background Kathryn :). It sounds like we don't really need these, and afaik no one's even using this at the moment. The runtime add/removes aren't actually hooked up to anything, right? The functionality is here, but there's nothing someone could call externally to invoke any of this besides the previous way the initial assignment worked (add to group in uvm.Start and teardown in uvm.Close). I'm not tied to leaving or removing them; I don't see any harm in leaving them, but if the plan is for reassignments to happen outside of the shim then I don't see a reason to keep them.

Member

@kevpar kevpar Apr 27, 2021

If we don't think there is any need to support changing what cpugroup a VM is assigned to, I'm fine with taking this out. But if that's a requirement, I think we will need this capability, as there is no other path for the orchestrator to take.

Contributor Author

I removed it last night, as from what Kathryn was saying it sounds like reassignments should be handled by whoever created the cpugroup as well. It would be trivial to re-add this since it wasn't actually hooked up to anything in the first place (but it would also be trivial to just leave it here and fix the linter warnings :P).

Contributor Author

aaaaand added it back per discussion :)
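
(For reference, the runtime path being kept here boils down to sending the HCS a
modify request that targets the VM's processor/cpugroup resource and carries the
new group ID. The sketch below only mirrors the shape of that request; the type
names, helper name, resource path string, and group ID are assumptions, not the
exact hcsshim API.)

package main

import (
	"encoding/json"
	"fmt"
)

// cpuGroupSettings is the payload such a modify request might carry (assumed shape).
type cpuGroupSettings struct {
	ID string `json:"Id,omitempty"`
}

// modifySettingRequest mirrors the general shape of an HCS modify request (assumed shape).
type modifySettingRequest struct {
	ResourcePath string           `json:"ResourcePath"`
	Settings     cpuGroupSettings `json:"Settings"`
}

// buildSetCPUGroupRequest shows what a helper like SetCPUGroup would hand to the HCS.
func buildSetCPUGroupRequest(id string) modifySettingRequest {
	return modifySettingRequest{
		// Assumed resource path, for illustration only.
		ResourcePath: "VirtualMachine/ComputeTopology/Processor/CpuGroup",
		Settings:     cpuGroupSettings{ID: id},
	}
}

func main() {
	req := buildSetCPUGroupRequest("00000000-0000-0000-0000-000000000001")
	b, _ := json.MarshalIndent(req, "", "  ")
	fmt.Println(string(b))
}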

@dcantah dcantah force-pushed the cpugroup-onstart branch 2 times, most recently from d8f7d72 to 2487edf on April 27, 2021 18:09
@dcantah dcantah force-pushed the cpugroup-onstart branch 2 times, most recently from 9a94df0 to d0f5644 on April 27, 2021 19:40
Member

@kevpar kevpar left a comment

LGTM

const cpuGroupCreateBuild = 20124

var errCPUGroupCreateNotSupported = fmt.Errorf("cpu group assignment on create requires a build of %d or higher", cpuGroupCreateBuild)

// ReleaseCPUGroup unsets the cpugroup from the VM
func (uvm *UtilityVM) ReleaseCPUGroup(ctx context.Context) error {
Contributor

Are we keeping this for the future modify path?

Contributor Author

Yep
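
(As an aside, the build gate requested earlier in the review can be applied with a
check along these lines. This is a sketch under a few assumptions: the constant and
error mirror the ones in the diff above, osversion.Build() is assumed to be the
helper used for the host OS build number, and the package and function names are
hypothetical.)

package uvm

import (
	"fmt"

	"github.com/Microsoft/hcsshim/osversion"
)

const cpuGroupCreateBuild = 20124

var errCPUGroupCreateNotSupported = fmt.Errorf("cpu group assignment on create requires a build of %d or higher", cpuGroupCreateBuild)

// verifyCPUGroupCreateSupported (hypothetical name) gates creation-time assignment
// on the host OS build, returning the error above on older builds.
func verifyCPUGroupCreateSupported() error {
	if osversion.Build() < cpuGroupCreateBuild {
		return errCPUGroupCreateNotSupported
	}
	return nil
}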

Contributor

@katiewasnothere katiewasnothere left a comment

LGTM
