Troubleshooting Supply Chain Security Tools - Scan 2.0

This topic helps you troubleshoot Supply Chain Security Tools (SCST) - Scan 2.0.

Overview

When Scan 2.0 creates an ImageVulnerabilityScan, the following resources are also created:

  • Tekton PipelineRun with the following Tasks:
    • workspace-setup-task
    • scan-task
    • publish-task
  • Tekton TaskRun corresponding to each Task
  • Pod corresponding to each TaskRun

Viewing resources

  • To view all resources:

    kubectl get imagevulnerabilityscans,pipelineruns,taskruns,pods -n DEV-NAMESPACE

    Where DEV-NAMESPACE is the name of your developer namespace.

  • To verify which resources are failing, check the SUCCEEDED and REASON columns in the output, then proceed to the debugging sections that follow. For example:

    NAME                                                                SUCCEEDED   REASON
    imagevulnerabilityscan.app-scanning.apps.tanzu.vmware.com/my-scan   False       Failed
    
    NAME                                   SUCCEEDED   REASON      STARTTIME   COMPLETIONTIME
    pipelinerun.tekton.dev/my-scan-5kllf   False       Failed      2m10s       85s
    
    NAME                                                    SUCCEEDED   REASON      STARTTIME   COMPLETIONTIME
    taskrun.tekton.dev/my-scan-5kllf-publish-task           False       Failed      94s         85s
    taskrun.tekton.dev/my-scan-5kllf-scan-task              True        Succeeded   2m1s        94s
    taskrun.tekton.dev/my-scan-5kllf-workspace-setup-task   True        Succeeded   2m9s        2m1s
    
    NAME                                         READY   STATUS      RESTARTS   AGE
    pod/my-scan-5kllf-publish-task-pod           0/4     Completed   1          94s
    pod/my-scan-5kllf-scan-task-pod              0/4     Completed   1          2m
    pod/my-scan-5kllf-workspace-setup-task-pod   0/2     Completed   1          2m10s
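To pick out the failing resources from that output without reading every row, you can filter on the SUCCEEDED column. The following is a minimal sketch and not part of Scan 2.0: failed_resources is a hypothetical helper name, and it assumes the default column layout shown above, where SUCCEEDED is the second column.

```shell
# Hypothetical helper: print the resources whose SUCCEEDED column is "False".
# Reads `kubectl get` table output on stdin. Pod rows are skipped because their
# second column is READY (for example 0/4), not SUCCEEDED.
failed_resources() {
  awk '$2 == "False" { print $1 }'
}

# Usage (requires cluster access):
#   kubectl get imagevulnerabilityscans,pipelineruns,taskruns,pods -n DEV-NAMESPACE | failed_resources
```

Each name printed is in RESOURCE/RESOURCE-NAME form, so you can pass it directly to kubectl describe.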

Debugging commands

The following sections describe commands you run to get logs and details about scanning errors.

Debugging resources

If a resource fails or has errors, inspect the resource. If multiple resources are involved, inspecting all of them gives you a broader understanding of the failure. For example, inspect the TaskRun that corresponds to a failed Pod.

To get status conditions on a resource:

kubectl describe RESOURCE RESOURCE-NAME -n DEV-NAMESPACE

Where:

  • RESOURCE is one of the following: ImageVulnerabilityScan, PipelineRun, TaskRun, or Pod.
  • RESOURCE-NAME is the name of the RESOURCE.
  • DEV-NAMESPACE is the name of your developer namespace.

Debugging scan pods

You can use the following methods to debug scan pods:

  • To get error logs from a pod when scan pods fail:

    kubectl logs SCAN-POD-NAME -n DEV-NAMESPACE

    Where SCAN-POD-NAME is the name of the scan pod.

    For information about debugging Kubernetes pods, see the Kubernetes documentation.

    A scan run that has an error can indicate that one of the following step containers has a failure:

    • step-workspace-setup
    • step-write-certs
    • step-cred-helper
    • step-SCANNER
    • step-publisher
    • sidecar-sleep

    Where step-SCANNER is your scanner step.

  • To verify which step container had a failed exit code:

    kubectl get taskrun TASKRUN-NAME -n DEV-NAMESPACE -o json | jq .status

    Where TASKRUN-NAME is the name of the TaskRun.

  • To inspect a specific step container in a pod:

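    kubectl logs SCAN-POD-NAME -n DEV-NAMESPACE -c STEP-CONTAINER-NAME

    Where:

    • SCAN-POD-NAME is the name of the scan pod.
    • STEP-CONTAINER-NAME is the name of the step container, such as step-publisher.
    • DEV-NAMESPACE is the name of your developer namespace.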

    For information about debugging a TaskRun, see the Tekton documentation.

  • To debug inside the scan-task pod, add a step that runs a sleep command after your scanner step in the ImageVulnerabilityScan. For example:

    ...
    spec:
      ...
      steps:
      - name: SCANNER-STEP
        ...
      - name: view
        image: busybox:latest
        args:
        - -c
        - sleep 6000

    This keeps the pod in a running state so that you can exec into it. Re-run the scan and then exec into the pod:

    kubectl exec SCAN-TASK-POD-NAME -n DEV-NAMESPACE -c step-view --stdin --tty -- sh

    Where SCAN-TASK-POD-NAME is the name of your scan-task pod.
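If jq is not available, you can do the exit-code check described above with a kubectl JSONPath template and a small filter. This is a sketch under assumptions: failed_steps is a hypothetical helper name, and the .container and .terminated.exitCode fields are assumed to be present under .status.steps in your Tekton version, so confirm them against the full TaskRun status before relying on the result.

```shell
# Hypothetical helper: read "<container-name><TAB><exit-code>" lines on stdin and
# print the containers that terminated with a non-zero exit code.
# Steps without a recorded exit code (for example, still running) are ignored.
failed_steps() {
  awk -F '\t' '$2 != "" && $2 != "0" { print $1 }'
}

# Usage (requires cluster access). The JSONPath template emits one line per step:
#   kubectl get taskrun TASKRUN-NAME -n DEV-NAMESPACE \
#     -o jsonpath='{range .status.steps[*]}{.container}{"\t"}{.terminated.exitCode}{"\n"}{end}' \
#     | failed_steps
```

You can pass each container name that this prints to kubectl logs with the -c flag to read the logs for that step.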

Viewing the Scan-Controller manager logs

You can run these commands to view the Scan-Controller manager logs:

  • Retrieve scan-controller manager logs:

    kubectl logs deployment/app-scanning-controller-manager -n app-scanning-system

  • Tail scan-controller manager logs:

    kubectl logs -f deployment/app-scanning-controller-manager -n app-scanning-system

Troubleshooting issues

Volume permission error

You might encounter a permission error when accessing, opening, or writing to files inside a cluster volume, such as:

unsuccessful cred copy: ".git-credentials" from "/tekton/creds" to "/home/app-scanning": unable to open destination: open /home/app-scanning/.git-credentials: permission denied

To resolve this issue, ensure that the failing step runs with the correct user and group IDs.

Incompatible Tekton version

Tanzu Application Platform v1.7.0 or later includes app-scanning.apps.tanzu.vmware.com version 0.2.0 and Tekton Pipelines version 0.50.1. The app-scanning.apps.tanzu.vmware.com package is incompatible with earlier versions of Tekton Pipelines because the Tekton v1 CRDs were not enabled in those versions. You must upgrade Tanzu Application Platform to v1.7.0 or later before upgrading app-scanning.apps.tanzu.vmware.com.

If you did not upgrade Tanzu Application Platform before upgrading app-scanning.apps.tanzu.vmware.com, ImageVulnerabilityScans might not progress, and the SUCCEEDED and REASON columns remain empty:

NAME      SUCCEEDED   REASON
my-scan

To resolve this issue:

  1. Confirm that the issue is caused by an incompatible Tekton version. View the controller manager logs by running:

    kubectl -n app-scanning-system logs -f deployment/app-scanning-controller-manager -c manager

    If you encounter the following error, proceed to the next step:

    ERROR  controller-runtime.source.EventHandler  failed to get informer from cache  {"error": "failed to get API group resources: unable to retrieve the complete list of server APIs: tekton.dev/v1: the server could not find the requested resource"}

  2. Upgrade Tanzu Application Platform to v1.7.0 or later. See Upgrade your Tanzu Application Platform.
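Before and after the upgrade, you can also check directly whether the cluster serves the tekton.dev/v1 API group version that the error message names. The following sketch uses kubectl api-versions. has_tekton_v1 is a hypothetical helper name, not a command that Tanzu Application Platform provides.

```shell
# Hypothetical helper: succeed if the API version list on stdin contains the
# exact line "tekton.dev/v1". -x matches whole lines only, so tekton.dev/v1beta1
# does not count, and -F treats the pattern as a fixed string.
has_tekton_v1() {
  grep -qxF 'tekton.dev/v1'
}

# Usage (requires cluster access):
#   if kubectl api-versions | has_tekton_v1; then
#     echo "tekton.dev/v1 is served"
#   else
#     echo "tekton.dev/v1 is missing"
#   fi
```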

Scan results empty

The publish-task task fails if the directory specified by scan-results-path (default value of /workspace/scan-results) is empty. To confirm, view the logs of the publish-task pod:

kubectl logs PUBLISH-TASK-POD-NAME -c step-publisher -n DEV-NAMESPACE

Where PUBLISH-TASK-POD-NAME is the name of your publish-task pod.

If the results directory is empty, the log contains a message similar to the following:

2023/08/22 17:09:49 results folder /workspace/scan-results is empty

To resolve this issue, debug within the scan-task pod by following the instructions under Debugging scan pods. The image for the debugging step must contain both a shell and your scanner CLI, so that you can run the sleep command and troubleshoot your scanner commands from within the container.
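After you are in a shell inside the scan-task pod, a quick check like the following can confirm whether your scanner command wrote anything to the results directory. This is a sketch only: results_dir_is_empty is a hypothetical helper name, and the path in the usage comment is taken from the log message above, so substitute your own scan-results-path if you changed it.

```shell
# Hypothetical helper: succeed if the directory exists and contains no entries
# (including hidden files, which `ls -A` lists).
results_dir_is_empty() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Usage, after running your scanner command inside the container:
#   if results_dir_is_empty /workspace/scan-results; then
#     echo "scanner wrote no results"
#   fi
```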

Scanning in a cluster with restricted Kubernetes Pod Security Standards

To comply with the restricted profile of the Kubernetes Pod Security Standards, you must set the securityContext of containers and initContainers. When a pod does not meet the Pod Security Standards, it is not created and vulnerability scanning cannot proceed. For more information, see the Kubernetes documentation.

You might see an error message similar to the following when describing the TaskRun:

pods "trivy-ivs-abcd-scan-task-pod" is forbidden: violates PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "prepare" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "prepare" must set securityContext.capabilities.drop=["ALL"]), seccompProfile (pod or container "prepare" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost"). Maybe invalid TaskSpec. ScanPodError PodNotFound: no pod found
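Because the message packs several violations into one line, it can help to split out each required setting. The following sketch assumes the message format shown above, where each requirement starts with "must set" and ends at a closing parenthesis. psa_requirements is a hypothetical helper name.

```shell
# Hypothetical helper: print each "must set ..." requirement found in a
# PodSecurity violation message read on stdin, one requirement per line.
psa_requirements() {
  grep -o 'must set [^)]*'
}

# Usage (requires cluster access):
#   kubectl describe taskrun TASKRUN-NAME -n DEV-NAMESPACE | psa_requirements
```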

To resolve this issue:

  1. Update your Tekton Pipelines package configuration in your tap-values.yaml with the following changes:

    tekton_pipelines:
        feature_flags:
            set_security_context: "true"

    Setting the securityContext resolves the prepare initContainer violation.

  2. Update your Tanzu Application Platform installation by running:

    tanzu package installed update tap -p tap.tanzu.vmware.com -v TAP-VERSION --values-file tap-values.yaml -n tap-install

    Where TAP-VERSION is the version of Tanzu Application Platform installed.

  3. Re-run the scan.