Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a proposal for a shared layers directory #163

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

sambhav
Copy link
Member

@sambhav sambhav commented May 21, 2021

Signed-off-by: Sambhav Kothari skothari44@bloomberg.net

Readable

Related RFC #155


This RFC introduces the concept of shared build layers which multiple buildpacks can collaborate on/re-use during build time which are not meant to end up in the final application image and are cacheable. Typical use cases include shared caches for tools like CCache, pip cache etc.

> NOTE: This RFC should be implemented as an atomic change alongside RFC https://github.com/buildpacks/rfcs/pull/155
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
> NOTE: This RFC should be implemented as an atomic change alongside RFC https://github.com/buildpacks/rfcs/pull/155
> NOTE: This RFC should be implemented as an atomic change alongside [RFC#155](https://github.com/buildpacks/rfcs/pull/155)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hone is that notation confusing given the RFC number doesn't match the PR number. E.g. if the referenced PR were merged today it would be RFC 0089.

Copy link
Member

@hone hone Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it might be but we seem to use "RFC#XXX" for GitHub PR numbers and "RFC XXXX" for RFC numbers. Of course we don't define this anywhere. I'm also fine with just doing [RFC](https://github.com/buildpacks/rfcs/pull/155) and cutting out the number.


All layers created in the `shared-layers` directory are build layers by default. The `cache` layer flag decides if the layer will be cached for the next run or not. This flag will follow the same semantics as [specified here.](https://github.com/buildpacks/spec/blob/main/buildpack.md#cached-layers). The `<shared-layer>.toml` for each layer will be restored with the same semantics as [specified here.](https://github.com/buildpacks/spec/blob/main/buildpack.md#cached-layers) with the `types` table being omitted during subsequent restores to avoid stale layers.

A `shared-layer` is associated with a specific buildpack that created it and it is expected to be wiped if the associated buildpack is not involved during the build process. The `env` directory is co-located with the `shared-layer` so that the associated environment variables that may point to directories in the `shared-layer` follow the same lifecycle as the `shared-layer` itself.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there ever a case that if that specific buildpack is gone that subsequent buildpacks would rely/still expect it to be there?

# How it Works
[how-it-works]: #how-it-works

All layers created in the `shared-layers` directory are build layers by default. The `cache` layer flag decides if the layer will be cached for the next run or not. This flag will follow the same semantics as [specified here.](https://github.com/buildpacks/spec/blob/main/buildpack.md#cached-layers). The `<shared-layer>.toml` for each layer will be restored with the same semantics as [specified here.](https://github.com/buildpacks/spec/blob/main/buildpack.md#cached-layers) with the `types` table being omitted during subsequent restores to avoid stale layers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So a buildpack author would have to set:

types.cache = true

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we may need to explicitly require users to set build = true for each of these layers. Otherwise it will be impossible to distinguish between a stale layer and a new cache = false shared layer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, mostly just asking for clarification of the ask. This makes sense to me.

@hone
Copy link
Member

hone commented Jun 2, 2021

I like this version of the proposal a lot better with a dedicated shared layer section.

└── env
```

Each buildpack will be passed an environment variable `CNB_BP_SHARED_LAYERS_DIR` with the location to the shared layers directory for a buildpack.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we worried about collision of shared layered directories that we still want them to be "owned" by a buildpack?

# Definitions
[definitions]: #definitions

- shared layers: A build-time only layer which can be cached and re-used during builds. This layer is "created" by a specific buildpack but can be used by subsequent buildpacks during the build process. Unlike normal buildpack layers, these layers are not expected to follow the unix directory strcuture and the lifecycle SHOULD NOT modidy variables like `PATH`, `LD_LIBRARY_PATH` etc. with folders present in these layers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- shared layers: A build-time only layer which can be cached and re-used during builds. This layer is "created" by a specific buildpack but can be used by subsequent buildpacks during the build process. Unlike normal buildpack layers, these layers are not expected to follow the unix directory strcuture and the lifecycle SHOULD NOT modidy variables like `PATH`, `LD_LIBRARY_PATH` etc. with folders present in these layers.
- shared layers: A build-time only layer which can be cached and re-used during builds. This layer is "created" by a specific buildpack but can be used by subsequent buildpacks during the build process. Unlike normal buildpack layers, these layers are not expected to follow the unix directory structure and the lifecycle SHOULD NOT modify variables like `PATH`, `LD_LIBRARY_PATH` etc. with folders present in these layers.

- Added maintanence complexity.
- Additional complexity for buildpack authors to account for.

# Alternatives
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we list an alternative that is a simple build cache dumping ground (aka not owned by a specific buildpack)? Whatever is there is always restored, no automatic cleanup, etc.

# Unresolved Questions
[unresolved-questions]: #unresolved-questions

- How would we accomodate for older platform APIs that may not have a mounted volume for `shared-layers`?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it matter if it is a volume or not? As long as the shared-layers folder exists and is writeable, things should be fine I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be impossible for the exporter to cache shared layers if they are not on a volume that is passed between phases. I think we have a couple options here:

  1. Accept that shared layers won't be cached on older platforms and thus newer buildpacks that use this feature might be less performant on older platforms.
  2. Have the builder create a shared layers dir under layers. It'll be ugly but given the requirement that a buildpack ID is in the form <namespace>/<name>, shared or shared-layers is not a valid buildpack ID and therefore there is no danger of a conflict.
  3. Have the builder copy these layers into the regular buildpack layers dir after the build is complete. This will cause some performance degradation on older platforms, but it probably isn't as dramatic as option 1.
  4. Make this an optional feature of the buildpack API. e.g the platform MAY give you a shared layers dir, if not assumed layers is fine. This probably foists too much of the complexity onto buildpack authors.
  5. Assume that by the time this is implemented the build phase will have taken over responsibility for archiving layers and putting them somewhere the exporter can read them Make build layers read only for subsequent buildpacks #155. But that change is a platform API change so it doesn't really solve the problem? Unless we ignore some things that are technically specified in the inter-phase API - if competing implementations of individuals phases is a non-goal, we potentially specified more than we should have...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The volume is only required for untrusted builder paths right? creator flows wouldn't have to use a volume. Subsequent builds would restore into the shared-layers dir from the cache volume/image right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably answer this in this RFC before it goes through.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with 2. Although buildpacks published to the registry must be of the form <namespace>/<name> the spec doesn't prohibit buildpacks which don't have a namespace. It does however only allow alpha numeric + ./- characters. So went with @shared instead. https://github.com/buildpacks/spec/blob/main/buildpack.md#buildpacktoml-toml

Signed-off-by: Sambhav Kothari <skothari44@bloomberg.net>
Signed-off-by: Sambhav Kothari <skothari44@bloomberg.net>
@sambhav
Copy link
Member Author

sambhav commented Jul 8, 2021

Given the recent discussions around different ways of caching things using the buildpacks API and the complexity surrounding them, we are converting this RFC to a draft until further needs/use cases arise.

This might go hand in hand with the asset packages pause.

@sambhav sambhav marked this pull request as draft July 8, 2021 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants