Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set default scratch size for containers/UVMs and reuse LCOW scratch #886

Merged
merged 1 commit into from
Nov 19, 2020

Conversation

dcantah
Copy link
Contributor

@dcantah dcantah commented Oct 21, 2020

  • Add containerd.toml options to be able to set the default scratch space size for containers and the UVMs scratch size for WCOW.

  • Add containerd.toml option to be able to specify that we'd like to share the sandbox containers scratch space for workload containers for LCOW.

  • Evaluate symlinks for the sandbox.vhd in the scratch layer as that's what is expected to be set up by the LCOW snapshotter instead of an absolute path to a previous containers sandbox.vhd.

Signed-off-by: Daniel Canter dcanter@microsoft.com

@dcantah dcantah requested a review from a team as a code owner October 21, 2020 13:43
@katiewasnothere
Copy link
Contributor

Can you rebase this PR? It shows Kevin's recent changes for scaling CPU as part of this PR.

Comment on lines 75 to 85
int32 default_container_scratch_size_in_gb = 12;

// default_vm_scratch_size_in_gb is the default scratch size (sandbox.vhd)
// to be used for the UVM. This only applies to WCOW as LCOW doesn't mount a scratch
// specifically for the UVM.
int32 default_vm_scratch_size_in_gb = 13;

// share_scratch specifies if we'd like to reuse scratch space between multiple containers.
// This currently only affects LCOW. The sandbox containers scratch space is re-used for all
// subsequent containers launched in the pod.
bool share_scratch = 14;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we adding in these options here but then never using them in this PR? I'd like to see this moved to another PR if possible.

Copy link
Contributor Author

@dcantah dcantah Oct 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're config options for containerd that will be used for the snapshotter. Theres nothing we can do with the values once we receive them in the shim, they need to be applied/used at the cri layer, they're useless here. Theres no other PR these could go in really

@dcantah
Copy link
Contributor Author

dcantah commented Oct 23, 2020

Can you rebase this PR? It shows Kevin's recent changes for scaling CPU as part of this PR.

I don't see this?

// scale_cpu_limits_to_sandbox indicates that container CPU limits should
// be adjusted to account for the difference in number of cores between the
// host and UVM.
bool scale_cpu_limits_to_sandbox = 11;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcantah here is the dup code

Copy link
Contributor Author

@dcantah dcantah Oct 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's right above in the diff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah looks like it was just reformatted.

@ambarve
Copy link
Contributor

ambarve commented Oct 26, 2020

Have you already created a PR for corresponding changes in containerd? Can you include a link for that PR in here?

@dcantah
Copy link
Contributor Author

dcantah commented Oct 26, 2020

Have you already created a PR for corresponding changes in containerd? Can you include a link for that PR in here?

jterry75/cri#79, This is the LCOW work but you can see that they're utilized here. I'm just directly editing containerd in the vendor folder for ease of review so you'll have to click on load diff for all of the containerd changes even if they're minimal.

Might as well lump this in here as well: microsoft/opengcs#383

@@ -285,6 +297,12 @@ func UnmountContainerLayers(ctx context.Context, layerFolders []string, containe
// Unload the SCSI scratch path
if (op & UnmountOperationSCSI) == UnmountOperationSCSI {
hostScratchFile := filepath.Join(layerFolders[len(layerFolders)-1], "sandbox.vhdx")
// Need to check if the sandbox is a symlink to another sandbox to handle the case
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also need to update the DestroyLayer function call to not delete the VHD if it is a symlink? I am not sure what the vmcompute.DestroyLayer API will do if the vhd is a symlink.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DestroyLayer won't be called for LCOW layers which is the only time this would be a symlink so we're good here. We don't have any ideas for how we'd go about sharing scratch for WCOW yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we somehow arrived at a similar solution for WCOW, I don't believe we have to do anything special. DestroyLayer just calls DeleteFile on the vhd which seems like it handles symlinks just fine. https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-deletefile

Symbolic link behavior—

If the path points to a symbolic link, the symbolic link is deleted, not the target. To delete a target, you must call CreateFile and specify FILE_FLAG_DELETE_ON_CLOSE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I am an idiot. I didn't realize that this was only for LCOW. Anyway we now also know that DestroyLayer will still work if we do this for WCOW.

internal/layers/layers.go Outdated Show resolved Hide resolved
… LCOW container scratch

* Add containerd.toml options to be able to set the default scratch space size for containers and
the UVMs scratch size for WCOW.

* Add containerd.toml option to be able to specify that we'd like to share the sandbox containers
scratch space for workload containers for LCOW.

* Evaluate symlinks for the sandbox.vhd in the scratch layer as that's what is expected to be
set up by the LCOW snapshotter instead of an absolute path to a previous containers sandbox.vhd.

Signed-off-by: Daniel Canter <dcanter@microsoft.com>
@ambarve
Copy link
Contributor

ambarve commented Nov 5, 2020

LGTM!

Copy link
Contributor

@katiewasnothere katiewasnothere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants