Server CrashLoopBackOff while calling log API without container: panic: runtime error: invalid memory address or nil pointer dereference #13585
Comments
> CrashLoopBackOff while calling the API with curl - panic: runtime error: invalid memory address or nil pointer dereference

There should be a stack trace following that log pointing to the exact line of the panic. It normally doesn't; I'm suspecting there's a missing …
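Presumably something like a missing early return on the stream-setup error path. A minimal sketch of that failure pattern in Go (illustrative names only, not the actual argo-workflows code):

```go
// Package logsketch illustrates how ignoring the error from opening a
// pod log stream leads to a nil pointer dereference further down.
package logsketch

import (
	"bufio"
	"fmt"
	"io"
)

func streamPodLogs(open func() (io.ReadCloser, error)) error {
	stream, err := open()
	if err != nil {
		// Returning here is what prevents the nil dereference below
		// when the API rejects the request, e.g. with
		// "a container name must be specified for pod ...".
		return fmt.Errorf("opening log stream: %w", err)
	}
	defer stream.Close() // would panic if err had been ignored and stream were nil

	// Copy log lines to stdout until the stream ends.
	scanner := bufio.NewScanner(stream)
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
	return scanner.Err()
}
```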
> CrashLoopBackOff while calling log API without container: panic: runtime error: invalid memory address or nil pointer dereference

There was no more log concerning the error, unfortunately.
Can you run https://pkg.go.dev/cmd/addr2line with the program counter as an argument along with the binary?
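Something like the following (a sketch; the binary path is an assumption, and the address is the `pc` value from the panic that appears later in this thread):

```sh
# addr2line reads hex addresses, one per line, from standard input and
# prints the containing function and the file:line for each of them.
echo 0x1e8c8a4 | go tool addr2line ./argo
```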
Hello, can I work on this one?
I've just tested. I switched to the pre-production argo-workflows (where I'd encountered the problem), tested our real WorkflowTemplate, and hit the CrashLoopBackOff. Re-running the tests with the basic WorkflowTemplates tested in the lab, there's no problem. I put the preprod pods in debug mode and redid the crash test, but there's no more info in the log. Note that for the basic templates, the server pod spec is:

```yaml
containers:
- args:
- server
- --configmap=argo-workflows-workflow-controller-configmap
- --auth-mode=client
- --auth-mode=sso
- --secure=false
- --namespaced
- --loglevel
- debug
- --gloglevel
- "0"
- --log-format
- text
```
> workflow server pod1 log
```log
time="2024-11-18T10:16:33.744Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=GetWorkflowTemplate grpc.service=workflowtemplate.WorkflowTemplateService grpc.start_time="2024-11-18T10:16:33Z" grpc.time_ms=13.259 span.kind=server system=grpc
time="2024-11-18T10:16:33.747Z" level=info duration=17.33487ms method=GET path=/api/v1/workflow-templates/argo-workflows/workflow-controlm size=6690 status=0
time="2024-11-18T10:16:48.263Z" level=info duration="112.732µs" method=GET path=index.html size=487 status=0
time="2024-11-18T10:17:08.263Z" level=info duration="181.525µs" method=GET path=index.html size=487 status=0
time="2024-11-18T10:17:28.263Z" level=info duration="146.051µs" method=GET path=index.html size=487 status=0
time="2024-11-18T10:17:48.264Z" level=info duration="139.866µs" method=GET path=index.html size=487 status=0
time="2024-11-18T10:18:08.264Z" level=info duration="347.373µs" method=GET path=index.html size=487 status=0
time="2024-11-18T10:18:28.263Z" level=info duration="315.198µs" method=GET path=index.html size=487 status=0
time="2024-11-18T10:18:41.558Z" level=debug msg="List options" namespace=argo-workflows options="{{ } workflows.argoproj.io/workflow=workflow-controlm-44vpo false false <nil> 0 }" workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.558Z" level=debug msg="Log options" namespace=argo-workflows options="&PodLogOptions{Container:,Follow:false,Previous:false,SinceSeconds:nil,SinceTime:<nil>,Timestamps:false,TailLines:nil,LimitBytes:nil,InsecureSkipTLSVerifyBackend:false,}" workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.560Z" level=debug msg="Listing workflow pods" namespace=argo-workflows workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.570Z" level=debug msg="Ensuring pod logs stream" alreadyStreaming=false namespace=argo-workflows podName=workflow-controlm-44vpo-build-1554548131 podPhase=Succeeded workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.570Z" level=debug msg="Ensuring pod logs stream" alreadyStreaming=false namespace=argo-workflows podName=workflow-controlm-44vpo-create-job-3142162940 podPhase=Succeeded workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.570Z" level=debug msg="Ensuring pod logs stream" alreadyStreaming=false namespace=argo-workflows podName=workflow-controlm-44vpo-notify-204188948 podPhase=Succeeded workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.570Z" level=debug msg="Not starting watches" namespace=argo-workflows workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.570Z" level=debug msg="Waiting for work-group" namespace=argo-workflows workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.570Z" level=debug msg="Sorting entries" namespace=argo-workflows workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.570Z" level=debug msg="Streaming pod logs" namespace=argo-workflows podName=workflow-controlm-44vpo-create-job-3142162940 workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.570Z" level=debug msg="Streaming pod logs" namespace=argo-workflows podName=workflow-controlm-44vpo-build-1554548131 workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.570Z" level=debug msg="Streaming pod logs" namespace=argo-workflows podName=workflow-controlm-44vpo-notify-204188948 workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.575Z" level=error msg="a container name must be specified for pod workflow-controlm-44vpo-notify-204188948, choose one of: [init wait main]" namespace=argo-workflows podName=workflow-controlm-44vpo-notify-204188948 workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.575Z" level=debug msg="Pod logs stream done" namespace=argo-workflows podName=workflow-controlm-44vpo-notify-204188948 workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.576Z" level=error msg="a container name must be specified for pod workflow-controlm-44vpo-build-1554548131, choose one of: [init wait prerequisistes main]" namespace=argo-workflows podName=workflow-controlm-44vpo-build-1554548131 workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.576Z" level=debug msg="Pod logs stream done" namespace=argo-workflows podName=workflow-controlm-44vpo-build-1554548131 workflow=workflow-controlm-44vpo
time="2024-11-18T10:18:41.594Z" level=debug msg="Pod logs stream done" namespace=argo-workflows podName=workflow-controlm-44vpo-create-job-3142162940 workflow=workflow-controlm-44vpo
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0x1e8c8a4]
goroutine 373 [running]:
github.com/argoproj/argo-workflows/v3/util/logs.WorkflowLogs.func1.1({0xc0007879e0, 0x2d})
/go/src/github.com/argoproj/argo-workflows/util/logs/workflow-logger.go:167 +0x524
created by github.com/argoproj/argo-workflows/v3/util/logs.WorkflowLogs.func1 in goroutine 224
/go/src/github.com/argoproj/argo-workflows/util/logs/workflow-logger.go:126 +0x5e5
```

So I wonder if the problem isn't related to the resource types in the template.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
name: workflow-controlm
namespace: argo-workflows
spec:
ttlStrategy:
# keep completed workflows for 1d
secondsAfterCompletion: 86400
workflowMetadata:
annotations:
workflows.argoproj.io/title: "{{workflow.parameters.DISPLAY_NAME}}"
labels:
workflows.argoproj.io/title: "{{workflow.parameters.DISPLAY_NAME}}"
entrypoint: main-template
onExit: exit-handler
serviceAccountName: workflow-pods-sa
arguments:
parameters:
- name: PROJECT_URL
value: ""
- name: PROJECT_BRANCH
value: ""
- name: CONTROLM_JOB
value: ""
- name: CONTROLM_TASK
value: ""
- name: CONTROLM_CTMS
value: ""
- name: TEMPLATE_NAME
value: ""
- name: BUILD_URL
value: ""
- name: DISPLAY_NAME
value: "to-overload"
volumes:
- name: workspace
emptyDir: {}
templates:
- name: main-template
steps:
- - name: build
template: build
arguments:
parameters:
- name: PROJECT_URL
value: "{{workflow.parameters.PROJECT_URL}}"
- name: PROJECT_BRANCH
value: "{{workflow.parameters.PROJECT_BRANCH}}"
- name: CONTROLM_TASK
value: "{{workflow.parameters.CONTROLM_TASK}}"
- name: TEMPLATE_NAME
value: "{{workflow.parameters.TEMPLATE_NAME}}"
- - name: create-job
template: create-job
arguments:
artifacts:
- name: kubejob
raw:
data: |
{{workflow.outputs.parameters.KUBEJOB}}
# Steps template definition: Build
- name: build
inputs:
parameters:
- name: PROJECT_URL
- name: PROJECT_BRANCH
- name: CONTROLM_TASK
- name: TEMPLATE_NAME
outputs:
parameters:
- name: namespace
valueFrom:
path: /workspace/namespace.txt
globalName: NAMESPACE
- name: kube_job
valueFrom:
path: /workspace/kube-job.yaml
globalName: KUBEJOB
- name: backofflimit
valueFrom:
path: /workspace/backofflimit.txt
globalName: backoffLimit
containerSet:
volumeMounts:
- name: workspace
mountPath: /workspace
containers:
- name: prerequisistes
image: myregistry/tools:latest
envFrom:
- secretRef:
name: gitlab-a1963-read
command:
- sh
- '-c'
- |
set -o errexit
cd /workspace
gitUrl="{{inputs.parameters.PROJECT_URL}}"
gitBranch="{{inputs.parameters.PROJECT_BRANCH}}"
codeApp=$(basename $(dirname {{inputs.parameters.PROJECT_URL}}))
echo "${codeApp}-${gitBranch}" > namespace.txt
git clone -b ${gitBranch} ${gitUrl} chart
- name: main
dependencies:
- prerequisistes
image: myregistry/tools:latest
command:
- sh
- '-c'
- |
set -o errexit
cd /workspace/
printf "\n### Templating the Helm Chart ###\n"
taskPath="chart/{{inputs.parameters.CONTROLM_TASK}}"
templateName="{{inputs.parameters.TEMPLATE_NAME}}"
outputDir="./myfolder"
helm template ${taskPath}/${templateName}/ -f ${taskPath}/app-values.yaml -f ${taskPath}/infra-values.yaml --output-dir ${outputDir} --dependency-update
# Store data to export them as global variables
find ${outputDir} -type f -iname "job*.yaml" | xargs cat > kube-job.yaml
yq .spec.backoffLimit kube-job.yaml > backofflimit.txt
namespace=$(cat namespace.txt)
get_ns=$(kubectl get -o name --ignore-not-found ns ${namespace})
if [ -z $get_ns ]; then kubectl create ns ${namespace}; fi
printf "\n### Creating configmaps - Secrets - AVP ###\n"
find ${outputDir} -type f -iname "configmap-*.yaml" -o -iname "secret*.yaml" | while read -r line; do
cat ${line} | kubectl -n ${namespace} apply -f -
done
# Steps template definition: Argo App
- name: create-job
inputs:
artifacts:
- name: kubejob
path: /artifacts/kubejob
resource:
action: create
successCondition: status.succeeded > 0
failureCondition: status.failed > {{workflow.outputs.parameters.backoffLimit}}
flags: ["--namespace={{workflow.outputs.parameters.NAMESPACE}}"]
manifest: |
{{workflow.outputs.parameters.KUBEJOB}}
outputs:
parameters:
- name: job-status
valueFrom:
jsonPath: '{.status}'
# Exit handler templates
# After the completion of the entrypoint template
- name: exit-handler
steps:
- - name: Success-Notifier
template: notify
arguments:
parameters:
- name: exitCode
value: 0
when: "{{workflow.status}} == Succeeded && {{workflow.parameters.PROJECT_BRANCH}} != dev"
- name: Failure-Notifier
template: notify
arguments:
parameters:
- name: exitCode
value: 1
when: "{{workflow.status}} != Succeeded && {{workflow.parameters.PROJECT_BRANCH}} != dev"
# notify template definition
- name: notify
inputs:
parameters:
- name: exitCode
container:
image: alpine/curl
envFrom:
- secretRef:
name: controlm
command:
- sh
- '-c'
- |
exitCode="{{inputs.parameters.exitCode}}"
job="{{workflow.parameters.CONTROLM_JOB}}"
task="{{workflow.parameters.CONTROLM_TASK}}"
ctms="{{workflow.parameters.CONTROLM_CTMS}}"
message="xxxx"
# POST exitCode HERE
```
@tooptoop4 NO, but I will do it...
Pre-requisites

- I have tested with the `:latest` image tag (i.e. `quay.io/argoproj/workflow-controller:latest`) and can confirm the issue still exists on `:latest`. If not, I have explained why, in detail, in my description below.

What happened? What did you expect to happen?
By trying to get a workflow's logs with the API endpoint `api/workflow/namespace/name/log` using curl, I received a `502 Bad Gateway` reply. I checked the pod status and found that the server pods are in `CrashLoopBackOff`, followed by a restart. Looking at the previous logs, we can clearly see that the pod hits `panic: runtime error: invalid memory address or nil pointer dereference` because of the `container` parameter missing from the request. When I specify the container (`?logOptions.container=main`), it works fine.

However, I think that the `argo-workflows-server` pod should not `CrashLoopBackOff` for this reason. Otherwise it would be very impactful in production if simple bad API calls could crash the whole argo-workflows server.
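For example (hypothetical server URL and token; the workflow name is taken from the logs above, and the path assumes the server's `/api/v1/workflows/{namespace}/{name}/log` endpoint):

```sh
# Without logOptions.container, the server panics and the gateway
# eventually answers 502 while the pod restarts:
curl -H "Authorization: Bearer $ARGO_TOKEN" \
  "https://argo-server.example.com/api/v1/workflows/argo-workflows/workflow-controlm-44vpo/log"

# Naming the container works fine:
curl -H "Authorization: Bearer $ARGO_TOKEN" \
  "https://argo-server.example.com/api/v1/workflows/argo-workflows/workflow-controlm-44vpo/log?logOptions.container=main"
```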
Version(s)
v3.5.10
Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
Logs from the workflow controller
Logs from in your workflow's wait container