ERROR: mkdir /buildkite: read-only file system #729

Closed
torbjornvatn opened this issue Apr 13, 2018 · 25 comments

Comments

@torbjornvatn

Several incidents with this error message when using the Helm chart to set up agents:

helm/charts#3526

@toolmantim
Contributor

I’ve tried various permutations of volume configurations to try to solve this, but I couldn’t seem to get it to work with volume mounts.

I did get this DIND version working fine though:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: buildkite-agent
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: buildkite-agent
    spec:
      containers:
        - name: buildkite-agent
          image: buildkite/agent
          imagePullPolicy: Always
          env:
            - name: BUILDKITE_AGENT_TOKEN
              valueFrom: {secretKeyRef: {name: buildkite-agent, key: token}}
            - name: DOCKER_HOST
              value: tcp://localhost:2375
          volumeMounts:
            - mountPath: /buildkite/builds
              name: buildkite-builds
        - name: docker-dind
          image: docker:dind
          imagePullPolicy: Always
          securityContext:
            privileged: true
          volumeMounts:
            # Allows host volume mounts from buildkite-agent → docker-dind
            - mountPath: /buildkite/builds
              name: buildkite-builds
      volumes:
        - emptyDir: {}
          name: buildkite-builds

🤔

@lox
Contributor

lox commented Apr 15, 2018

I reckon the key distinction there @toolmantim is buildkite-builds. Rather than trying to host-mount /buildkite from the host OS (a bad idea on Container-Optimized OS, where the root filesystem is read-only), the version you posted uses a named volume for builds, which is 💯.

@lox
Contributor

lox commented Apr 15, 2018

The other approach that might work is to set BUILDKITE_BUILD_PATH to a path that is writeable and persistent (see https://cloud.google.com/kubernetes-engine/docs/concepts/node-images#file_system_layout). I used /var/buildkite/builds.

@lox
Contributor

lox commented Apr 15, 2018

Figured out the issue. The path /buildkite doesn't exist on the host, it's in the buildkite-agent container. Trying to mount it causes docker to mkdir /buildkite on the host (default behaviour if sourceDir doesn't exist), and then you get the error about the host filesystem being read-only.

Solution is to use a writeable path on the host system for builds, and use exactly the same path in the buildkite-agent container.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: buildkite-agent
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: buildkite-agent
    spec:
      containers:
        - name: buildkite-agent
          image: buildkite/agent
          imagePullPolicy: Always
          securityContext:
            privileged: true
          env:
            - name: BUILDKITE_AGENT_TOKEN
              valueFrom: {secretKeyRef: {name: buildkite-agent, key: token}}
            - name: BUILDKITE_AGENT_DEBUG
              value: "true"
            - name: BUILDKITE_BUILD_PATH
              value: "/var/buildkite/builds"
          volumeMounts:
            - name: docker-binary
              mountPath: /usr/bin/docker
            - name: docker-socket
              mountPath: /var/run/docker.sock
            - name: buildkite-builds
              mountPath: /var/buildkite/builds
      volumes:
        - name: docker-binary
          hostPath: {path: /usr/bin/docker}
        - name: docker-socket
          hostPath: {path: /var/run/docker.sock}
        - name: buildkite-builds
          hostPath: {path: /var/buildkite/builds}

@rimusz

rimusz commented Apr 15, 2018

Thanks @lox, I will cut a new chart release in https://github.com/buildkite/charts in the next few days.

@lox
Contributor

lox commented Apr 15, 2018

Thanks @rimusz! Otherwise I'm happy to PR that repo! Still figuring my way around the helm stuff.

@rimusz

rimusz commented Apr 15, 2018

OK, cool, do the PR and I will review it. After that I plan to make a major chart update to follow the latest Helm best practices, and then merge it into the upstream Helm charts.

@sj26
Member

sj26 commented Apr 16, 2018

Hi folks! If you're using Google Kubernetes Engine (GKE) on Google Cloud then this is a limitation of the host instance running Google's Container-Optimized OS, which mounts the root filesystem as read-only: https://cloud.google.com/container-optimized-os/docs/concepts/security. They mention that /var is mounted as a stateful partition, so /var/buildkite might work instead. But beware that /var is mounted noexec, so files stored there cannot be executed directly.

It sounds like a more robust alternative might be to use a kubernetes persistent volume and figure out how to mount that into the builds' docker and docker-compose containers. Or use DinD, reluctantly.

Edit: sorry, I've jumped the gun on the issue. I see this is about mapping through the volumes, not the host filesystem.

@lox
Contributor

lox commented May 18, 2018

I'm going to close this for now, there isn't anything that can be done on the agent side of this, it's a kubernetes/host issue.

@lox
Contributor

lox commented May 23, 2018

They mention that /var is mounted as a stateful partition, so /var/buildkite might work instead. But beware that /var is mounted noexec, so files stored there cannot be executed directly.

Unfortunately this is going to make my above solution not viable. @jamiebuilds just pointed out that it will prevent scripts in your builds from executing as expected.

It sounds like a more robust alternative might be to use a kubernetes persistent volume and figure out how to mount that into the builds' docker and docker-compose containers

I'll look into that a bit more, it sounds like the better way to go. Any suggestions on this front @rimusz?

@emmenko

emmenko commented Jun 1, 2018

@lox have you found a viable solution? I'm just getting started with Buildkite and tried to set up a minimal test build, but bumped into this problem right away. I'm running the agents in a K8s cluster.

I'm a bit lost on how to proceed here. Many thanks! 🙏

@lox
Contributor

lox commented Jun 26, 2018

@emmenko I think the answer in Kubernetes would be to create a persistent volume for the Agent pod. The bit we haven't solved yet is how to handle various docker plugins trying to mount $PWD from the host into build containers.
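
As a rough sketch of the persistent-volume idea, a claim and an agent pod that mounts it at the build path might look like the following. All names and the storage size are placeholders rather than values from the Buildkite chart, and this does not address the open problem of plugins mounting $PWD from the host.

```yaml
# Sketch only: resource names and the 10Gi size are assumptions, not values from this thread.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: buildkite-builds
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: buildkite-agent
spec:
  containers:
    - name: buildkite-agent
      image: buildkite/agent
      env:
        - name: BUILDKITE_AGENT_TOKEN
          valueFrom: {secretKeyRef: {name: buildkite-agent, key: token}}
      volumeMounts:
        # Same mount path as the DinD manifest earlier in this thread.
        - name: buildkite-builds
          mountPath: /buildkite/builds
  volumes:
    - name: buildkite-builds
      persistentVolumeClaim:
        claimName: buildkite-builds
```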

@nuxlli

nuxlli commented Jul 23, 2018

I was able to make it work in my fork by following the DinD tip. It could be improved by making it an optional feature.

@Eun

Eun commented Aug 10, 2018

@nuxlli I was not able to get your fork running, any other ideas?

@lox any other ideas? This is a major issue for us: using persistent volumes completely defeats the purpose of Kubernetes in GCE (we cannot scale, since GCEPersistentDisk does not support ReadWriteMany; see https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes).


Maybe you should also update your documentation (https://buildkite.com/docs/agent/v3/gcloud#running-the-agent-on-google-container-engine) and tell users that it's not working with Docker on Kubernetes.

@bluemalkin

bluemalkin commented Oct 19, 2018

Hi - I've run into this issue and was wondering whether anyone has come up with a working solution?

I guess I could still use GCEPersistentDisk mounted with pod affinity, one pod per node. Not ideal, but I cannot think of anything else. Or, even better, a StatefulSet with a PVC, which makes more sense.
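
A minimal sketch of the StatefulSet idea, assuming the apps/v1 API and placeholder names and sizes: volumeClaimTemplates gives each replica its own ReadWriteOnce claim, so ReadWriteMany is not required. Builds that talk to a Docker daemon would still need that daemon to see the same volume, for example through a DinD sidecar sharing the mount.

```yaml
# Sketch only: replica count, storage size and Service name are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: buildkite-agent
spec:
  serviceName: buildkite-agent   # a headless Service with this name is assumed to exist
  replicas: 2
  selector:
    matchLabels:
      app: buildkite-agent
  template:
    metadata:
      labels:
        app: buildkite-agent
    spec:
      containers:
        - name: buildkite-agent
          image: buildkite/agent
          env:
            - name: BUILDKITE_AGENT_TOKEN
              valueFrom: {secretKeyRef: {name: buildkite-agent, key: token}}
          volumeMounts:
            - name: buildkite-builds
              mountPath: /buildkite/builds
  # One PersistentVolumeClaim is created per replica from this template.
  volumeClaimTemplates:
    - metadata:
        name: buildkite-builds
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```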

@Eun

Eun commented Oct 22, 2018

@bluemalkin Instead of using Kubernetes I built my own image and used a VM:
https://github.com/talon-one/gce-buildkite-alpine

@lox
Contributor

lox commented Oct 22, 2018

I'm sorry you weren't able to get it working @Eun and @bluemalkin. It's a complicated issue. The root issue is that Kubernetes is not designed to expose the host docker socket to containers, which makes it hard to run CI workloads that need a docker socket. This manifests as problems mapping paths inside containers to paths on the host. Our plugins frequently try to mount $PWD into containers they create, which, when the internal path is /buildkite, results in the error discussed in this ticket.

We have several customers with large Kubernetes installations; I believe they are working around the issue either with Docker-in-Docker or with PVCs. There are still problems with figuring out how to translate the $PWD in a container into the volume on the host for the docker daemon, which makes plugin use hard.

@nhooyr

nhooyr commented Feb 7, 2019

On Google Kubernetes Engine, you can use this path: /home/kubernetes/flexvolume/buildkite/builds. It's not noexec, so it should work normally.
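
A hedged sketch of how that path could be wired in, reusing the shape of the hostPath manifest earlier in this thread with only the build path swapped. This is a fragment of the pod spec, not a full manifest, and whether the path is writable and executable on a given node image is worth verifying first.

```yaml
# Fragment of spec.template.spec; the same path is used on the host and in the container.
containers:
  - name: buildkite-agent
    image: buildkite/agent
    env:
      - name: BUILDKITE_BUILD_PATH
        value: /home/kubernetes/flexvolume/buildkite/builds
    volumeMounts:
      - name: buildkite-builds
        mountPath: /home/kubernetes/flexvolume/buildkite/builds
volumes:
  - name: buildkite-builds
    hostPath:
      path: /home/kubernetes/flexvolume/buildkite/builds
      # Assumption: let the kubelet create the directory if it is missing.
      type: DirectoryOrCreate
```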

@nhooyr

nhooyr commented Feb 7, 2019

Also, this issue should be reopened as it's not fully solved.

@lox
Contributor

lox commented Feb 7, 2019

@nhooyr the thing is, I'm not sure that the agent issue tracker is a good spot for discussing this. There isn't anything we could do in the agent to fix this; it's more something that we need to work with the Kubernetes community to figure out. We definitely need a spot to discuss Kubernetes+Buildkite though. Perhaps a Forum Category would be better? https://forum.buildkite.community

@nhooyr

nhooyr commented Feb 7, 2019

I think it'd be best to make an issue on the Kubernetes repo and see what they say.

@if6was9

if6was9 commented Mar 11, 2019

Is there an option to have the plugins mount elsewhere?

https://cloud.google.com/container-optimized-os/docs/concepts/security

Google explains that / is mounted read-only. But /buildkite is on /, thus the problem.

@toolmantim
Contributor

@if6was9 sorry you had trouble there! The plugins-path agent configuration option is probably what you're after.
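
For illustration, the plugins-path option can also be set through the BUILDKITE_PLUGINS_PATH environment variable on the agent container. The path below is a hypothetical example. Note that this option controls where plugins are checked out, so it may not change where a plugin mounts the build directory.

```yaml
# Fragment of the buildkite-agent container spec; the path is a hypothetical example.
env:
  - name: BUILDKITE_PLUGINS_PATH
    value: /var/buildkite/plugins
```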

@AlistairB

AlistairB commented Aug 31, 2020

A workaround to the noexec issue with /var/lib/buildkite is to hijack /var/lib/docker/buildkite which is writeable and executable.

@bluemalkin

A workaround to the noexec issue with /var/lib/buildkite is to hijack /var/lib/docker/buildkite which is writeable and executable.

Could you please expand on this one? Thanks.
