Node Cache for Builds #216
/assign @bparees
/cc @nalind @rhatdan @TomSweeneyRedHat
fyi @openshift/openshift-team-developer-experience
a few questions and comments.
Some questions/comments/asks around facilitating tekton/buildv2 in addition to fully solving buildv1 @adambkaplan as we discussed in planning last week.
@bparees FYI
Needs containers team signoff, but looks good from my perspective.
Pinging folks from the containers-engine team on this thread. It seems that we may need to enhance the
This will also involve potentially different versions of the containers/storage library concurrently accessing the same storage, right? Seems safer to have a high level protocol where the build pod can basically:
That also avoids issues with locking. If we wanted to have the build pod directly use the merged (mounted) filesystem path, we could potentially extend the protocol so that the build pod requests locks on images through it. Basically, I'm arguing for having two processes communicate over e.g. a local Unix domain socket. If we wanted to pass merged/mounted filesystem trees, that could be done by passing a directory file descriptor over the socket.
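To make the descriptor-passing idea concrete, here is a minimal sketch in Python (3.9+, Unix only). It is not the protocol proposed in this PR: the `layer-tree` message, the function names, and the file layout are all assumptions made up for illustration. It only shows that one process can open a directory and hand the open descriptor to a peer over a Unix domain socket, after which the peer reads files relative to that descriptor without needing the path.

```python
import os
import socket
import tempfile


def send_directory(sock: socket.socket, path: str) -> None:
    """Open `path` read-only and pass the directory descriptor to the peer."""
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # The descriptor travels as SCM_RIGHTS ancillary data.
        socket.send_fds(sock, [b"layer-tree"], [dir_fd])
    finally:
        # The peer receives its own duplicate, so the sender can close its copy.
        os.close(dir_fd)


def receive_directory(sock: socket.socket) -> int:
    """Receive one directory descriptor; the message tag is hypothetical."""
    msg, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    if msg != b"layer-tree" or len(fds) != 1:
        raise RuntimeError("unexpected message from storage owner")
    return fds[0]


def read_relative(dir_fd: int, name: str) -> bytes:
    """Read a file by name relative to the received directory descriptor."""
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    with os.fdopen(fd, "rb") as handle:
        return handle.read()


def demo() -> bytes:
    """Run both ends in one process over a socketpair."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "etc-release"), "wb") as handle:
            handle.write(b"example layer content\n")
        owner, builder = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        with owner, builder:
            send_directory(owner, root)
            received = receive_directory(builder)
            try:
                return read_relative(received, "etc-release")
            finally:
                os.close(received)
```

In a real deployment the two ends would be separate processes (for example a node-side storage owner and the build pod), and locking requests would be additional message types on the same socket.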
only to read the layers. Is there a versioning issue there?
I'm not a containers/storage maintainer; but today e.g. for OpenShift we have a Here we're not just talking about handling the case of N+1 being able to read N: the host container storage version may be either older or newer than the one in the build container in the general case.
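@cgwalters can containers/storage bump its major version as part of an OpenShift z-stream update? Requiring buildah, openshift builds, and crio to be in sync with respect to the major version of containers/storage is a drawback, but not insurmountable IMO.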
Not clear to me the implications for consuming existing storage (created before upgrade) after an upgrade.

Ben Parees | OpenShift

On Wed, Mar 4, 2020, 18:01 Adam Kaplan wrote:
> @cgwalters can containers/storage bump its major version as part of an OpenShift z-stream update? Requiring buildah, openshift builds, and crio to be in sync with respect to the major version of containers/storage is a drawback, but not insurmountable IMO.
Per discussion with @nalind version skew with We do have a separate blocker in that today buildah/podman can't push layers that are present in the additional storage layer. This is problematic when pushing to the internal registry: there is no guarantee that layers cached on the node are present in the internal registry (or any external registry for that matter).
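The push concern can be illustrated with a small Python sketch. It is not how buildah or podman decide what to push; `layers_needing_upload` and the `registry_has_blob` callback are hypothetical names. The point is only that a layer being present in the node cache says nothing about the target registry, so each layer digest has to be checked against the registry itself (in practice this would likely be a blob existence check against the registry API) and uploaded when absent.

```python
from typing import Callable, Iterable, List


def layers_needing_upload(
    image_layers: Iterable[str],
    registry_has_blob: Callable[[str], bool],
) -> List[str]:
    """Return the layer digests the target registry does not hold, in order.

    `registry_has_blob` is a hypothetical callback standing in for a real
    existence check against the registry. Where the layer came from locally
    (default store or additional read-only store) is deliberately ignored.
    """
    missing: List[str] = []
    seen = set()
    for digest in image_layers:
        if digest in seen:
            continue  # the same layer can appear more than once in an image
        seen.add(digest)
        if not registry_has_blob(digest):
            missing.append(digest)
    return missing
```

For example, if the registry only holds `sha256:base`, an image made of `sha256:base`, `sha256:cached` and `sha256:new` would need the last two uploaded, even if `sha256:cached` was served from the node cache during the build.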
@cgwalters is the version of
*Mitigation:* The image layer format in use by `containers/storage` is designed to be backwards
compatible between major versions. If in the event data in the additional store is not usable,
buildah should fall back to its default behavior of pulling image content from the upstream
registry.
@nalind does buildah behave like this today?
*Mitigation:* Builds already run as a privileged container, and thus are able to obtain more
privileges than the parent kubelet. If for any reason the node does not allow privileged containers
to read contents in the node image cache, buildah should be able to fall back to its current
behavior.
@nalind another item that we should verify - if the additional store can't be read by buildah, will it default to "no additional store?"
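The fallback behavior asked about in both mitigation threads could look roughly like the following Python sketch. It does not describe what buildah does today (that is the open question above). The flat one-file-per-digest layout, `read_from_additional_store`, `fetch_layer`, and the `pull_from_registry` callback are all assumptions for illustration; the real `containers/storage` layout is different.

```python
import os
from typing import Callable, Optional, Tuple


def read_from_additional_store(store_root: Optional[str], digest: str) -> Optional[bytes]:
    """Return layer bytes from the read-only store, or None if it is unusable.

    Assumes a made-up layout with one file per digest under `store_root`.
    """
    if not store_root:
        return None  # no additional store configured
    path = os.path.join(store_root, digest.replace(":", "_"))
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except OSError:
        # Missing store, missing layer, or permission denied: treat all of
        # these as "no additional store" rather than failing the build.
        return None


def fetch_layer(
    digest: str,
    store_root: Optional[str],
    pull_from_registry: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Prefer the node cache; otherwise pull from the registry as before."""
    cached = read_from_additional_store(store_root, digest)
    if cached is not None:
        return cached, "additional-store"
    return pull_from_registry(digest), "registry"
```

The design choice sketched here is that every failure to read the cache degrades to the existing pull path, so the cache is purely an optimization and never a new way for a build to fail.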
Remember, containers/storage doesn't exist on its own: it's a Go library that gets vendored into another binary, most notably The answer overall then is that RHEL can absolutely rebase mid-OpenShift Z, and that has already happened for 8.0 ➡️ 8.1 in 4.2. Now, almost the entire platform is structured as pods scheduled by kubelet; hence, the version of c/storage used by The main things that aren't run via crio are the bootstrap process and OS updates, both of which use Now I forget offhand whether crio and podman share an image store or not. That's an important thing to determine. If they do, we already have two c/storage versions in play, and having build pods access that same storage would bring in a potential third version.
@adambkaplan it lgtm, ping me tomorrow if you're ready for the label (allowing some time for others to make a final pass if they desire)
podman and CRI-O share the image store.
Restoring (in part) the node-level cache used in OCP 3.x builds. This proposal utilizes buildah's additional stores feature to mount the node image store as read-only. Builds can then utilize image layers in the store to bypass image pulls from a registry. DEVEXP-334
4a8bb8a to feebf16
@bparees @nalind @cgwalters added a note on the existing containers/storage version skew we have with podman and cri-o. Squashed commits and marked
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adambkaplan, bparees