Permission denied different steps are using a different workingDir (and one is a subdirectory of other's workdir step) #6842

Open
Sgitario opened this issue Jun 19, 2023 · 11 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

@Sgitario
Sgitario commented Jun 19, 2023

Update: the actual issue and cause is described in this comment.

I have a TaskRun that binds an emptyDir volume to the workspace, like this:

apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: example
spec:
  podTemplate:
    securityContext:
      fsGroup: 65532
  serviceAccountName: example
  taskRef:
    name: example-build-and-deploy
  workspaces:
    - emptyDir:
        medium: Memory
      name: source

Where example-build-and-deploy is:

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: spring-boot-with-tekton-example-build-and-deploy
spec:
  params:
    - default: gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/git-init:v0.40.2
      description: The image providing the git-init binary that this Task runs.
      name: gitCloneInitImage
      type: string
    - default: https://github.com/Sgitario/dekorate.git
      description: Repository URL to clone from.
      name: repoUrl
      type: string
    - default: 714d56acc3d398ec56905ba64352d9082d172536
      description: "Revision to checkout. (branch, tag, sha, ref, etc...)"
      name: revision
      type: string
  workspaces:
    - description: The workspace to hold all project sources
      name: source
      readOnly: false
  steps:
    - env:
        - name: PARAM_URL
          value: $(inputs.params.repoUrl)
        - name: PARAM_REVISION
          value: $(inputs.params.revision)
      image: $(inputs.params.gitCloneInitImage)
      name: git-clone
      script: |-
        #!/usr/bin/env sh
        set -eu
        CHECKOUT_DIR="$(workspaces.source.path)"

        git config --global --add safe.directory "$CHECKOUT_DIR"
        /ko-app/git-init \
          -url="${PARAM_URL}" \
          -revision="${PARAM_REVISION}" \
          -path="${CHECKOUT_DIR}"
      securityContext:
        runAsNonRoot: true
        runAsUser: 65532
      workingDir: $(workspaces.source.path)

When running the workflow, it fails with:

2023/06/19 11:22:27 Entrypoint initialization
2023/06/19 11:22:28 Decoded script /tekton/scripts/script-0-9fv8c
+ CHECKOUT_DIR=/workspace/source
+ git config --global --add safe.directory /workspace/source
+ /ko-app/git-init '-url=https://github.com/Sgitario/dekorate.git' '-revision=714d56acc3d398ec56905ba64352d9082d172536' '-path=/workspace/source'
{"level":"error","ts":1687173756.299481,"caller":"git/git.go:53","msg":"Error running git [checkout -f FETCH_HEAD]: exit status 128\nfatal: cannot create directory at 'examples/frameworkless-on-kubernetes-example': Permission denied\n","stacktrace":"github.com/tektoncd/pipeline/pkg/git.run\n\tgithub.com/tektoncd/pipeline/pkg/git/git.go:53\ngithub.com/tektoncd/pipeline/pkg/git.Fetch\n\tgithub.com/tektoncd/pipeline/pkg/git/git.go:164\nmain.main\n\tgithub.com/tektoncd/pipeline/cmd/git-init/main.go:53\nruntime.main\n\truntime/proc.go:250"}
{"level":"fatal","ts":1687173756.2995424,"caller":"git-init/main.go:54","msg":"Error fetching git repository: exit status 128","stacktrace":"main.main\n\tgithub.com/tektoncd/pipeline/cmd/git-init/main.go:54\nruntime.main\n\truntime/proc.go:250"}
2023/06/19 11:22:36 Skipping step because a previous step failed

I've found a similar issue, #1872. However, the proposed solution didn't work for me (see the fsGroup and runAsUser settings above).

Steps to Reproduce the Problem

I can prepare a reproducer, just let me know if you need it.

Additional Info

  • Kubernetes version:

    Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:10:43Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3", GitCommit:"9e644106593f3f4aa98f8a84b23db5fa378900bd", GitTreeState:"clean", BuildDate:"2023-03-30T06:34:50Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}

  • Tekton Pipeline version:

    Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'

v0.39.0

Note that I'm trying to migrate away from PipelineResources such as:

apiVersion: tekton.dev/v1alpha1
kind: PipelineResource
metadata:
  name: spring-boot-with-tekton-example-git
spec:
  params:
    - name: url
      value: https://github.com/Sgitario/dekorate.git
    - name: revision
      value: 283012129c043643c8363d674ba5d322f20425ce
  type: git

And it needs to be compatible when running a PipelineRun and TaskRun.
This work is part of the Dekorate Tekton extension: https://github.com/dekorateio/dekorate#tekton

@Sgitario Sgitario added the kind/bug Categorizes issue or PR as related to a bug. label Jun 19, 2023
@Sgitario
Author

Update: the same happens to me when using Tekton 0.47 and 0.48.
Also, cloning the project into a subdirectory such as $(workspaces.source.path)/repo works fine. However, we can't do this because the workingDir property seems to be somehow shared among steps. For example:

steps:
    - env: # ...
      image: $(inputs.params.gitCloneInitImage)
      name: git-clone
      script: |-
        #!/usr/bin/env sh
        set -eu
        CHECKOUT_DIR="$(workspaces.source.path)/repo"

        git config --global --add safe.directory "$CHECKOUT_DIR"
        /ko-app/git-init \
          -url="${PARAM_URL}" \
          -revision="${PARAM_REVISION}" \
          -path="${CHECKOUT_DIR}"

        cd "${CHECKOUT_DIR}"
        RESULT_SHA="$(git rev-parse HEAD)"
        EXIT_CODE="$?"
        if [ "${EXIT_CODE}" != 0 ] ; then
          exit "${EXIT_CODE}"
        fi
      workingDir: $(workspaces.source.path)
    - command:
        - $(inputs.params.projectBuilderCommand)
      image: $(inputs.params.projectBuilderImage)
      name: project-build
      workingDir: $(workspaces.source.path)

This works fine, but the second step fails because the cloned project is in a subdirectory, so this failure is expected. However:

steps:
    - env: # ...
      image: $(inputs.params.gitCloneInitImage)
      name: git-clone
      script: |-
        #!/usr/bin/env sh
        set -eu
        CHECKOUT_DIR="$(workspaces.source.path)/repo"

        git config --global --add safe.directory "$CHECKOUT_DIR"
        /ko-app/git-init \
          -url="${PARAM_URL}" \
          -revision="${PARAM_REVISION}" \
          -path="${CHECKOUT_DIR}"

        cd "${CHECKOUT_DIR}"
        RESULT_SHA="$(git rev-parse HEAD)"
        EXIT_CODE="$?"
        if [ "${EXIT_CODE}" != 0 ] ; then
          exit "${EXIT_CODE}"
        fi
      workingDir: $(workspaces.source.path)
    - command:
        - $(inputs.params.projectBuilderCommand)
      image: $(inputs.params.projectBuilderImage)
      name: project-build
      workingDir: $(workspaces.source.path)/repo # this is the only change I did.

This fails in the first step, even though I didn't change anything in it.

@Sgitario
Author

Another update: I found the root cause of the issue. My previous comment rang a bell, because the Task has another step that does more work. The full list of steps is:

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: spring-boot-with-tekton-example-build-and-deploy
spec:
  params: # ...
  steps:
    - env: # ...
      image: $(inputs.params.gitCloneInitImage)
      name: git-clone
      script: |-
        #!/usr/bin/env sh
        set -eu
        CHECKOUT_DIR="$(workspaces.source.path)"

        git config --global --add safe.directory "$CHECKOUT_DIR"
        /ko-app/git-init \
          -url="${PARAM_URL}" \
          -revision="${PARAM_REVISION}" \
          -path="${CHECKOUT_DIR}" # <1>

        cd "${CHECKOUT_DIR}"
        RESULT_SHA="$(git rev-parse HEAD)"
        EXIT_CODE="$?"
        if [ "${EXIT_CODE}" != 0 ] ; then
          exit "${EXIT_CODE}"
        fi
      workingDir: $(workspaces.source.path)
    - args: 
        - "$(inputs.params.projectBuilderArgs[*])"
        - -Duser.name=jcarvaja
        - -Ddekorate.image-pull-secrets=spring-boot-with-tekton-example-registry-credentials
        - -Ddekorate.docker.registry=quay.io
      command:
        - $(inputs.params.projectBuilderCommand)
      image: $(inputs.params.projectBuilderImage)
      name: project-build
      workingDir: $(workspaces.source.path)
    - args:
        - "$(inputs.params.imageBuilderArgs[*])"
      command:
        - $(inputs.params.imageBuilderCommand)
      env:
        - name: DOCKER_CONFIG
          value: /tekton/home/.docker
      image: $(inputs.params.imageBuilderImage)
      name: image-build
      workingDir: $(workspaces.source.path)/examples/spring-boot-with-tekton # <2>
  workspaces:
    - description: The workspace to hold all project sources
      name: source
      readOnly: false

Setting the workingDir property as in <2> triggers the permission denied error when <1> checks out a subdirectory.

Issue title updated with my findings.

@Sgitario Sgitario changed the title from "Permission denied when using an empty volume and running the first step" to "Permission denied different steps are using a different workingDir (and one is a subdirectory of other's workdir step)" Jun 20, 2023
@vdemeester
Member

@Sgitario thanks for the issue. This is kind of a Kubernetes / container runtime issue. When you set workingDir on containers in a Pod, Kubernetes (the kubelet and/or the container runtime) has to make sure that directory exists ahead of time. A Tekton Task becomes a Pod, and all of its containers are defined, prepared, and started at the same time (that's the way Kubernetes works).

What this means is that any workingDir specified in any of your steps will be created before almost anything else and, most likely (given the issue you are seeing), with root ownership. A way to verify this would be to create a Task with two steps: the second step would set its workingDir to a subdirectory of the first step's, and the first step would just run ls -la . .. (or something similar). You would then see that the subdirectory already exists and is, most likely, owned by root.
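
The check described above could be sketched as the following Task. This is only an illustration: the Task name, the step names, the `sub` directory, and the `busybox` image are assumptions and do not come from the issue.

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: workingdir-ownership-check # hypothetical name
spec:
  workspaces:
    - name: source
  steps:
    # First step: list the workspace root and its parent directory.
    - name: inspect
      image: busybox # assumption: any image that provides sh and ls
      workingDir: $(workspaces.source.path)
      script: |
        #!/bin/sh
        # -n prints numeric owner and group IDs
        ls -lan . ..
    # Second step: its workingDir is a subdirectory of the first step's.
    - name: in-subdir
      image: busybox
      workingDir: $(workspaces.source.path)/sub # hypothetical subdirectory
      script: |
        #!/bin/sh
        pwd
```

If the explanation holds, the listing printed by the first step should already contain `sub`, together with the IDs of its owner.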

All of this is Kubernetes behavior, so there is very little we could do on the Tekton side.

  • One workaround, on the user side, is to use the same workingDir for all the steps, and use script and cd commands to move around
  • One possible adjustment tektoncd/pipeline could make would be to stop setting the workingDir on the Pod and instead have the entrypoint "magic" change the working directory before executing the content of the step. This would be a bit confusing for people used to Kubernetes, but it would be less confusing overall. The only catch is that it would be hard, if not impossible, to make that change in a backward-compatible way, and it might have shortcomings. cc @tektoncd/core-maintainers
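
As a rough sketch of the first workaround, the image-build step from the Task above might keep the shared workingDir and change directory inside a script. The final `echo` is a placeholder rather than the real build command, and it is an assumption that the builder image ships a shell.

```yaml
steps:
  - name: image-build
    image: $(params.imageBuilderImage) # assumption: this image ships a shell
    # Same workingDir as the other steps.
    workingDir: $(workspaces.source.path)
    env:
      - name: DOCKER_CONFIG
        value: /tekton/home/.docker
    script: |
      #!/usr/bin/env sh
      set -eu
      # Move into the subdirectory at run time instead of via workingDir.
      cd examples/spring-boot-with-tekton
      # Placeholder: invoke the real image builder command here.
      echo "building in ${PWD}"
```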

Quick additional note: you should migrate to params.XXX instead of inputs.params.XXX. The latter only still works for backward compatibility; there is no such thing as inputs.params anymore.
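
Applied to the git-clone step from the original Task, that migration would look roughly like this (only the variable references change; the rest of the step is omitted here):

```yaml
steps:
  - name: git-clone
    image: $(params.gitCloneInitImage) # was $(inputs.params.gitCloneInitImage)
    env:
      - name: PARAM_URL
        value: $(params.repoUrl) # was $(inputs.params.repoUrl)
      - name: PARAM_REVISION
        value: $(params.revision) # was $(inputs.params.revision)
```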

@vdemeester
Member

vdemeester commented Jun 20, 2023

Actually, I stand corrected on this 😅. I dug into the tektoncd/pipeline code base, and the workingDir initialization probably happens on the Tekton side, through the workingdirinit command/image… and prior to #6515, it runs as root.

@Sgitario so there is a slight chance that the behavior gets better with 0.49 and above (0.50 would be the first LTS to have this fix), as #6515 will be part of the 0.49 release.

@Sgitario
Author

@vdemeester many thanks for your comment.

  • One workaround, on the user side, is to use the same workingDir for all the steps, and use script and cd commands to move around

I did what you suggested here and I could move forward.

Quick additional note: you should migrate to params.XXX instead of inputs.params.XXX. The latter only still works for backward compatibility; there is no such thing as inputs.params anymore.

I will. Many thanks for noticing it!

@ksingh-scogo

ksingh-scogo commented Jul 2, 2023

Actually, I stand corrected on this 😅. I dug into the tektoncd/pipeline code base, and the workingDir initialization probably happens on the Tekton side, through the workingdirinit command/image… and prior to #6515, it runs as root.

@Sgitario so there is a slight chance that the behavior gets better with 0.49 and above (0.50 would be the first LTS to have this fix), as #6515 will be part of the 0.49 release.

@vdemeester @Sgitario I am currently using v0.49 but still facing this issue :(

Any suggestions for workarounds?

  • Logs from taskrun
[clone] + '[' -d /workspace/output/ ]
[clone] + rm -rf /workspace/output//lost+found
[clone] + rm -rf '/workspace/output//.[!.]*'
[clone] + rm -rf '/workspace/output//..?*'
[clone] + test -z 
[clone] + test -z 
[clone] + test -z 
[clone] + git config --global --add safe.directory /workspace/output
[clone] + /ko-app/git-init '-url=https://github.com/brightzheng100/spring-boot-docker' '-revision=' '-refspec=' '-path=/workspace/output/' '-sslVerify=true' '-submodules=true' '-depth=1' '-sparseCheckoutDirectories='
[clone] {"level":"error","ts":1688315850.2345598,"caller":"git/git.go:53","msg":"Error running git [init /workspace/output/]: exit status 1\n/workspace/output/.git: Permission denied\n","stacktrace":"github.com/tektoncd/pipeline/pkg/git.run\n\tgithub.com/tektoncd/pipeline/pkg/git/git.go:53\ngithub.com/tektoncd/pipeline/pkg/git.Fetch\n\tgithub.com/tektoncd/pipeline/pkg/git/git.go:88\nmain.main\n\tgithub.com/tektoncd/pipeline/cmd/git-init/main.go:53\nruntime.main\n\truntime/proc.go:250"}
[clone] {"level":"fatal","ts":1688315850.2347317,"caller":"git-init/main.go:54","msg":"Error fetching git repository: exit status 1","stacktrace":"main.main\n\tgithub.com/tektoncd/pipeline/cmd/git-init/main.go:54\nruntime.main\n\truntime/proc.go:250"}
  • TaskRun
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: git-clone-
spec:
  taskRef:
    kind: Task
    name: git-clone
  params:
  - name: url
    value: https://github.com/brightzheng100/spring-boot-docker
  - name: deleteExisting
    value: "true"
  workspaces:
    - name: output
      #emptyDir: {}
      persistentVolumeClaim:
        claimName: shared-workspace

@ksingh-scogo

Just for the record, I managed to get it to work by adding a securityContext:

apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: git-clone-
spec:
  podTemplate:
    securityContext:
      fsGroup: 65532 
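
The fragment above shows only the added part. Merged with the TaskRun from the previous comment, the full manifest would presumably look like the following; this is an assumption, since the complete file was not posted.

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: git-clone-
spec:
  podTemplate:
    securityContext:
      # Same value as the runAsUser set on the git-clone step in the first comment.
      fsGroup: 65532
  taskRef:
    kind: Task
    name: git-clone
  params:
    - name: url
      value: https://github.com/brightzheng100/spring-boot-docker
    - name: deleteExisting
      value: "true"
  workspaces:
    - name: output
      persistentVolumeClaim:
        claimName: shared-workspace
```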

@cjnosal

cjnosal commented Jul 28, 2023

Could the workingdirinit inherit the runAsUser from the podTemplate.securityContext? My task pod is created with runAsUser/runAsGroup/fsGroup all set to 65534 but the working directory is still created with ownership root:root
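
For readers who want to try the setup described in this comment, a TaskRun along these lines might approximate it. The `generateName` and the Task name are hypothetical, and setting the values through `podTemplate` is an assumption about how the pod was configured.

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: example- # hypothetical
spec:
  podTemplate:
    securityContext:
      runAsUser: 65534
      runAsGroup: 65534
      fsGroup: 65534
  taskRef:
    name: example-task # hypothetical Task with a step-level workingDir
```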

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 26, 2023
@vdemeester
Member

/remove-lifecycle stale

@tekton-robot tekton-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 27, 2023
@vdemeester
Member

Could the workingdirinit inherit the runAsUser from the podTemplate.securityContext? My task pod is created with runAsUser/runAsGroup/fsGroup all set to 65534 but the working directory is still created with ownership root:root

I think that's something we should explore, yes.
