
Proposed mount propagation won't work with Docker #648

Closed
jsafrane opened this issue May 22, 2017 · 8 comments
Labels
sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage.

Comments

@jsafrane
Member

propagation.md says that we should "Make HostPath shared for privileged containers, slave for non-privileged."

I hacked this into kubelet (inspired by kubernetes/kubernetes#41683) and tried to build a container that mounts iSCSI, Ceph RBD and Gluster volumes without having iscsiadm, rbd or mount.glusterfs installed on the node.

The resulting pod looks like:

      spec:
        hostNetwork: true
        containers:
         - name: mounter
           image: jsafrane/mounter-daemonset:latest
           securityContext:
             privileged: true
           volumeMounts:
             - name: kubelet
               mountPath: /var/lib/kubelet
             - name: sys
               mountPath: /sys
             - name: dev
               mountPath: /dev
             - name: iscsi
               mountPath: /etc/iscsi
             - name: iscsilock
               mountPath: /run/lock/iscsi
             - name: modules
               mountPath: /lib/modules
        volumes:
          - name: kubelet
            hostPath:
              path: /var/lib/kubelet
          - name: sys
            hostPath:
              path: /sys
          - name: dev
            hostPath:
              path: /dev
          - name: iscsi
            hostPath:
              path: /etc/iscsi
          - name: iscsilock
            hostPath:
              path: /run/lock/iscsi
          - name: modules
            hostPath:
              path: /lib/modules

The only path the pod really needs with shared mount propagation is /var/lib/kubelet. All the rest should be either rslave or private, and things break when they are rshared:

  • Docker creates /dev/shm when running the container. With /dev shared between the host and the container, anything the host had in /dev/shm is lost. In addition, Docker for some reason does not unmount it when the container dies. Private or rslave mount propagation would be enough for /dev.
  • I need /sys in the container to be able to talk to the Ceph kernel module, however systemd in the container can't work with /sys/fs/cgroup mounted as shared; it simply refuses to start. I need systemd to reap zombies of FUSE daemons and to start NFS client daemons and iscsid during container startup, so for now I am stuck with a non-systemd init. Private or rslave mount propagation would be enough for /sys.
  • I have a PR to make /var/lib/kubelet shared during kubelet startup (https://github.com/kubernetes/kubernetes/pull/45724). It won't be enough, because now I need /etc, /dev, /sys and /run as shared too.
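The propagation states discussed above can be inspected from inside a container via the optional fields of /proc/self/mountinfo ("shared:N", "master:N", per proc(5)). A minimal sketch — the classify_propagation helper is my own illustration, not part of any existing tool:

```shell
#!/bin/sh
# Classify the propagation of a single /proc/self/mountinfo line:
#   "shared:N" in the optional fields -> shared
#   "master:N" (no shared:)          -> slave
#   neither                          -> private
# Note: a mount can be both slave and shared; this sketch reports it as shared.
classify_propagation() {
  # Optional fields sit between the mount options and the " - " separator.
  opts=$(printf '%s' "$1" | sed 's/ - .*//')
  case " $opts" in
    *" shared:"*) echo shared ;;
    *" master:"*) echo slave ;;
    *)            echo private ;;
  esac
}

# Example line for a shared /sys mount:
classify_propagation "36 25 0:30 / /sys rw,nosuid shared:7 - sysfs sysfs rw"

# On the host, a directory such as /var/lib/kubelet can be turned into a
# shared mount (requires root, so shown here only as a comment):
#   mount --bind /var/lib/kubelet /var/lib/kubelet
#   mount --make-rshared /var/lib/kubelet
```

Checking a few such lines from inside the container quickly shows which of the hostPath mounts above ended up rshared instead of rslave or private.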

IMO, exporting a HostPath as shared should be opt-in per VolumeHost (or VolumeMount); it should not be enabled by default for all HostPath volumes, as the merged proposal currently says.
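For reference, the per-VolumeMount opt-in is the shape this eventually took: Kubernetes grew a mountPropagation field on VolumeMount (None, HostToContainer, Bidirectional; Bidirectional requires a privileged container). A sketch of how the mounter pod above would use it — field values are real, the rest is illustrative:

```yaml
# Only the kubelet hostPath is made bidirectional (rshared);
# everything else keeps the default (None, i.e. private) or rslave semantics.
volumeMounts:
  - name: kubelet
    mountPath: /var/lib/kubelet
    mountPropagation: Bidirectional    # opt-in; requires privileged
  - name: sys
    mountPath: /sys
    mountPropagation: HostToContainer  # rslave: sees host mounts, leaks nothing back
```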

Adding random people who were active in mount propagation PRs:
@euank @lpabon @lucab @thockin @ivan4th @vishh @lvlv
@kubernetes/sig-node-proposals @kubernetes/sig-storage-proposals

@k8s-ci-robot
Contributor

@jsafrane: These labels do not exist in this repository: sig/node, sig/storage.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@kfox1111

I've wondered about the status of this for a while. It's affecting a lot of projects I work on. What is the path forward to shared mounts?

If it's stagnating because no one can really figure out the best path forward, I'd say doing it flagless is overly complicated; let's just add a flag so we can make progress.

@cmluciano

/sig node
/sig storage

@k8s-ci-robot
Contributor

@cmluciano: These labels do not exist in this repository: sig/node, sig/, sig/storage.


@cmluciano

@kubernetes/sig-node-proposals

@k8s-ci-robot
Contributor

@cmluciano: These labels do not exist in this repository: sig/node.


@k8s-github-robot

@jsafrane
There are no sig labels on this issue. Please add a sig label by:

  1. mentioning a sig: @kubernetes/sig-<group-name>-<group-suffix>
    e.g., @kubernetes/sig-contributor-experience-<group-suffix> to notify the contributor experience sig, OR

  2. specifying the label manually: /sig <label>
    e.g., /sig scalability to apply the sig/scalability label

Note: Method 1 will trigger an email to the group. You can find the group list here and label list here.
The <group-suffix> in method 1 has to be replaced with one of these: bugs, feature-requests, pr-reviews, test-failures, proposals

@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 15, 2017
@feiskyer
Member

/sig node
/sig storage

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels Aug 16, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 16, 2017
k8s-github-robot pushed a commit that referenced this issue Aug 22, 2017
Automatic merge from submit-queue

Redesign mount propagation

The proposal as merged won't work; it makes too many directories shared (see #648).

A different approach is needed. I've chosen 'Add an option in VolumeMount API', but I would also be fine with 'Add an option in HostPathVolumeSource'; there is only a very small difference between them to me.

The proposal also describes how it will be implemented, especially during alpha phase.

Fixes #648

@kubernetes/sig-node-proposals @kubernetes/sig-storage-proposals
MadhavJivrajani pushed a commit to MadhavJivrajani/community that referenced this issue Nov 30, 2021