Node Cache for Builds #216

Merged — 1 commit merged into openshift:master from adambkaplan:build-cache on Apr 1, 2020

Conversation

adambkaplan (Contributor):

Restoring (in part) the node-level cache used in OCP 3.x builds.
This proposal utilizes buildah's additional stores feature to mount the
node image store as read-only.

DEVEXP-334
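For context, buildah's additional stores are configured through the `additionalimagestores` option in `storage.conf`. A minimal sketch of what the build container's storage configuration might look like — the `/host/...` mount path is illustrative, not something this proposal specifies:

```toml
# storage.conf inside the build container (sketch)
[storage]
driver = "overlay"
runroot = "/var/run/containers/storage"
# buildah's own read/write store
graphroot = "/var/lib/containers/storage"

[storage.options]
# The node's image store, mounted read-only into the build container
# at an illustrative path; cached layers found here can satisfy pulls.
additionalimagestores = [
  "/host/var/lib/containers/storage",
]
```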

@openshift-ci-robot added the size/L label (Denotes a PR that changes 100-499 lines, ignoring generated files) on Feb 18, 2020
adambkaplan (Contributor, Author):

/assign @bparees

/cc @nalind @rhatdan @TomSweeneyRedHat

fyi @openshift/openshift-team-developer-experience

@openshift-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) on Feb 18, 2020
derekwaynecarr (Member) left a comment:

a few questions and comments.

gabemontero (Contributor) left a comment:

Some questions/comments/asks around facilitating tekton/buildv2, in addition to fully solving buildv1, @adambkaplan, as we discussed in planning last week.

@bparees FYI

bparees (Contributor) commented on Feb 25, 2020:

Needs containers team signoff, but looks good from my perspective.

adambkaplan (Contributor, Author):

/cc @mrunalp @nalind @runcom

Pinging folks from the containers-engine team on this thread. It seems that we may need to enhance the containers/storage library (or buildah) so that cache hits in the read-only additional store can be copied into buildah's read/writable store. Otherwise we risk builds failing due to layers being removed out from under the build container.

cgwalters (Member):

This will also involve potentially different versions of the containers/storage library concurrently accessing the same storage, right? It seems safer to have a high-level protocol where the build pod can basically:

  • list images on the host
  • ask a process running on the host to serialize an image to a pipe/socket, then deserialize it

That also avoids issues with locking.

If we wanted the build pod to directly use the merged (mounted) filesystem path, we could potentially extend the protocol to have the build pod request locks on images.

Basically, I'm arguing for having two processes communicate over e.g. a local Unix domain socket - if we wanted to pass merged/mounted filesystem trees, that could be done via file descriptor passing of a directory FD or so?
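As a rough sketch of the directory-FD-passing idea (the socket path, placeholder payload, and storage path below are hypothetical, not something this PR defines), a host-side helper in Go could hand a directory file descriptor to the build pod over a Unix domain socket via SCM_RIGHTS:

```go
// Sketch of the directory-FD-passing idea: a host-side helper opens a
// mounted image tree and hands the directory file descriptor to the
// build pod over a Unix domain socket using SCM_RIGHTS. The socket
// path, payload byte, and storage path are all hypothetical.
package main

import (
	"log"
	"net"
	"os"

	"golang.org/x/sys/unix"
)

// sendDirFD opens dir and passes its descriptor over conn as
// SCM_RIGHTS ancillary data; the 1-byte payload is just a placeholder.
func sendDirFD(conn *net.UnixConn, dir string) error {
	d, err := os.Open(dir)
	if err != nil {
		return err
	}
	defer d.Close()

	rights := unix.UnixRights(int(d.Fd()))
	_, _, err = conn.WriteMsgUnix([]byte{0}, rights, nil)
	return err
}

func main() {
	// Hypothetical socket the host-side image service listens on.
	l, err := net.ListenUnix("unix",
		&net.UnixAddr{Name: "/run/build-cache.sock", Net: "unix"})
	if err != nil {
		log.Fatal(err)
	}
	defer l.Close()

	conn, err := l.AcceptUnix()
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Hand over the merged filesystem tree of a cached image
	// (path illustrative).
	if err := sendDirFD(conn, "/var/lib/containers/storage/overlay/merged"); err != nil {
		log.Fatal(err)
	}
}
```

The pod side would recover the descriptor from the ancillary data via ReadMsgUnix plus unix.ParseSocketControlMessage/ParseUnixRights; image enumeration and lock requests could ride the same socket.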

bparees (Contributor) commented on Feb 27, 2020:

> This will also involve potentially different versions of the containers/storage library concurrently accessing the same storage

Only to read the layers. Is there a versioning issue there?

cgwalters (Member):

> only to read the layers. Is there a versioning issue there?

I'm not a containers/storage maintainer, but today, e.g. for OpenShift, we have a crio-wipe.service which blows away the container storage across major version upgrades, because they didn't want to have to think about upgrade compatibility. podman doesn't do that, and there have been bugs (though I'm not sure if those are in the podman layer or lower).

Here we're not just talking about handling the case of N+1 being able to read N - in the general case, the host container storage version may be either older or newer than the one in the build container.

adambkaplan (Contributor, Author):

@cgwalters can containers/storage bump its major version as part of an OpenShift z-stream update? Requiring buildah, openshift builds, and crio to be in sync with respect to the major version of containers/storage is a drawback, but not insurmountable IMO.

bparees (Contributor) commented on Mar 5, 2020 via email

adambkaplan (Contributor, Author):

Per discussion with @nalind, version skew with containers/storage is a low-probability issue - the actual storage format is considered stable and backwards compatible between major versions (breaking this is considered a bug).

We do have a separate blocker: today buildah/podman can't push layers that are only present in the additional store. This is problematic when pushing to the internal registry - there is no guarantee that layers cached on the node are present in the internal registry (or any external registry, for that matter).

adambkaplan (Contributor, Author):

@cgwalters is the version of containers/storage something that would have a major version bump delivered as a z-stream OpenShift upgrade? Ex: If we upgrade from 4.5.0 to 4.5.1, could the version of containers/storage change from v5 to v6?

Quoted from enhancements/builds/node-cache-for-builds.md:

> *Mitigation:* The image layer format used by `containers/storage` is designed to be backwards
> compatible between major versions. In the event that data in the additional store is not usable,
> buildah should fall back to its default behavior of pulling image content from the upstream
> registry.

adambkaplan (Contributor, Author):

@nalind does buildah behave like this today?

Quoted from enhancements/builds/node-cache-for-builds.md:

> *Mitigation:* Builds already run in a privileged container, and thus are able to obtain more
> privileges than the parent kubelet. If for any reason the node does not allow privileged containers
> to read contents of the node image cache, buildah should be able to fall back to its current
> behavior.

adambkaplan (Contributor, Author):

@nalind another item that we should verify: if the additional store can't be read by buildah, will it default to no additional store?

cgwalters (Member) commented on Mar 24, 2020:

> @cgwalters is the version of containers/storage something that would have a major version bump delivered as a z-stream OpenShift upgrade? Ex: If we upgrade from 4.5.0 to 4.5.1, could the version of containers/storage change from v5 to v6?

Remember, containers/storage doesn't exist on its own - it's a Go library that gets vendored into other binaries, most notably podman and crio (currently, independently).

The overall answer, then, is that RHEL can absolutely rebase mid-OpenShift-Z, and that has already happened for 8.0 ➡️ 8.1 in 4.2. Now, almost the entire platform is structured as pods scheduled by the kubelet - hence the version of c/storage used by crio is most important, and since that runs solely on the OpenShift lifecycle, there's no reason for it to be rebased mid-Z.

The main things that aren't run via crio are the bootstrap process and OS updates, both of which use podman. Of those two, we obviously only care about in-place changes for the latter. (And, to be honest, if we needed to we could make OS updates use a pod too.)

Now, I forget offhand whether crio and podman share an image store. That's an important thing to determine. If they do, we already have two c/storage versions in play, and having build pods access that same storage would bring in a potential third.

bparees (Contributor) commented on Mar 24, 2020:

@adambkaplan it lgtm, ping me tomorrow if you're ready for the label (allowing some time for others to make a final pass if they desire)

rhatdan commented on Mar 24, 2020:

podman and CRI-O share the image store.

Commit (squashed):

Restoring (in part) the node-level cache used in OCP 3.x builds.
This proposal utilizes buildah's additional stores feature to mount the
node image store as read-only. Builds can then utilize image layers in
the store to bypass image pulls from a registry.

DEVEXP-334
adambkaplan (Contributor, Author):

@bparees @nalind @cgwalters added a note on the existing containers/storage version skew we have with podman and cri-o. Squashed commits and marked implementable.

bparees (Contributor) commented on Apr 1, 2020:

/lgtm

@openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged) on Apr 1, 2020
openshift-ci-robot:
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adambkaplan, bparees

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:
  • OWNERS [adambkaplan,bparees]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot merged commit ecbcea0 into openshift:master on Apr 1, 2020
@adambkaplan deleted the build-cache branch on April 1, 2020 at 18:26