Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-3857: Recursive Read-only (RRO) mounts #3858

Merged
merged 1 commit into from
Feb 8, 2024

Conversation

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/node Categorizes an issue or PR as relevant to SIG Node. labels Feb 8, 2023
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 8, 2023
keps/sig-node/3857-rro-hostpath/README.md Outdated Show resolved Hide resolved
keps/sig-node/3857-rro-hostpath/README.md Outdated Show resolved Hide resolved
keps/sig-node/3857-rro-hostpath/README.md Outdated Show resolved Hide resolved
@AkihiroSuda
Copy link
Member Author

@thockin
Could you take a look again?
The latest revision takes a ternary value: recursivelyReadOnly: (Disabled|IfPossible|Required).
Defaults to Disabled for compatibility.

@thockin
Copy link
Member

thockin commented Feb 28, 2023

Did you add something about status? #3858 (comment)

"""
We also probably need a way to know what happened if the "If possible" path was chose, so a status field per-volume?

We also probably need a way for nodes to indicate that they do support it, so a node status field? Not sure if there is precedent for this. @mrunalp @bobbypage might know.
"""

@AkihiroSuda
Copy link
Member Author

status

PTAL: #3858 (comment)

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 4, 2023
keps/sig-node/3857-rro-hostpath/README.md Outdated Show resolved Hide resolved
keps/sig-node/3857-rro-hostpath/README.md Outdated Show resolved Hide resolved
keps/sig-node/3857-rro-hostpath/README.md Outdated Show resolved Hide resolved
AkihiroSuda added a commit to AkihiroSuda/containerd that referenced this pull request Feb 7, 2024
Revert PR 9713, as it appeared to break the compatibility too much
kubernetes/enhancements#3858 (comment)

This reverts commit b2f254f.

> Conflicts:
>	internal/cri/opts/spec_linux_opts.go

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@thockin
Copy link
Member

thockin commented Feb 7, 2024

@thaJeztah

I know this topic came up when working on the changes in Moby / Docker Engine, where it was decided that for the "docker run" case, the intent always was for :ro to mean fully (recursive) read-only

I am curious how that decision was reached? I suspect you are right, in general (that almost always this is what people want) and yet to that tiny fraction of users who actually depend on the non-recursive behavior, you just broke their app. How do other projects think about that tradeoff? Kube is erring on the side of not breaking users (as did Linux by making RRO distinct from RO), but that has costs, too.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Copy link
Member Author

@thockin For Docker, we can consider reverting the behavior when a client speaks an old API:

AkihiroSuda added a commit to AkihiroSuda/containerd that referenced this pull request Feb 7, 2024
Revert PR 9713, as it appeared to break the compatibility too much
kubernetes/enhancements#3858 (comment)

This reverts commit b2f254f.

> Conflicts:
>	internal/cri/opts/spec_linux_opts.go

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@thockin
Copy link
Member

thockin commented Feb 7, 2024

@AkihiroSuda

For Docker, we can consider reverting the behavior when a client speaks an old API

I'm not in a position to say what Docker should do - I was asking a question, not offering an opinion :)

AkihiroSuda added a commit to AkihiroSuda/containerd that referenced this pull request Feb 8, 2024
Revert PR 9713, as it appeared to break the compatibility too much
kubernetes/enhancements#3858 (comment)

This reverts commit b2f254f.

> Conflicts:
>	internal/cri/opts/spec_linux_opts.go

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda
Copy link
Member Author

/assign @johnbelamaric

AkihiroSuda added a commit to AkihiroSuda/containerd that referenced this pull request Feb 8, 2024
Revert PR 9713, as it appeared to break the compatibility too much
kubernetes/enhancements#3858 (comment)

This reverts commit b2f254f.

> Conflicts:
>	internal/cri/opts/spec_linux_opts.go

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
that might indicate a serious problem?
-->

Look for an event saying indicating RRO is not supported by the runtime.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

events are not easy for a cluster admin to observe. metrics are much better. So, a few things I can imagine we might need to look for:

  • crashing kubelet (especially when the field is set)
  • pod failures / errors from the runtime
  • pod rejections

Imagine you enable this on a cluster with 100 different users. How do you know if it's causing a problem? If it is causing a problem for some users, but other users are relying on it, can you roll it back?

Now imagine you have 10,000 clusters. You enable this on 100 clusters. You want to know if you should enable it on more clusters. How can you tell if users are using it? How can you tell if it is working?

That's what we're trying to get to with some of these questions.

Note these are non-blocking for alpha.

Yes, the feature is used if the following `jq` command prints non-zero number:

```bash
kubectl get pods -A -o json | jq '[.items[].spec.containers[].volumeMounts[]? | select(.recursiveReadOnly)] | length'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above discussing metrics and the "fleet admin" point of view

@johnbelamaric
Copy link
Member

/approve

Added some comments on the PRR sections that are for beta, but they are non-blocking for alpha.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: AkihiroSuda, johnbelamaric, mrunalp, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 8, 2024
@mrunalp
Copy link
Contributor

mrunalp commented Feb 8, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 8, 2024
@k8s-ci-robot k8s-ci-robot merged commit 1175bef into kubernetes:master Feb 8, 2024
4 checks passed
azr pushed a commit to azr/containerd that referenced this pull request Aug 30, 2024
Revert PR 9713, as it appeared to break the compatibility too much
kubernetes/enhancements#3858 (comment)

This reverts commit b2f254f.

> Conflicts:
>	internal/cri/opts/spec_linux_opts.go

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lead-opted-in Denotes that an issue has been opted in to a release lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status
Projects
Development

Successfully merging this pull request may close these issues.