
Task was marked as running but was not present in the job queue, so it has been marked as failed. #14277

Open

deep7861 opened this issue Jul 24, 2023 · 3 comments

@deep7861

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that AWX is open source software provided for free and that I might not receive a timely response.
  • I am NOT reporting a (potential) security vulnerability. (These should be emailed to security@ansible.com instead.)

Bug Summary

One of our jobs consistently fails with this error:
Task was marked as running but was not present in the job queue, so it has been marked as failed.

[screenshot: AWX job output showing the error above]

We haven't been able to identify any resource crunch on the k8s cluster, and the AWX pods aren't running out of resources either.

AWX version

21.3.0

Select the relevant components

  • UI
  • UI (tech preview)
  • API
  • Docs
  • Collection
  • CLI
  • Other

Installation method

kubernetes

Modifications

no

Ansible version

No response

Operating system

No response

Web browser

No response

Steps to reproduce

Our setup:
AKS 1.23.8
AWX Operator: 0.24.0
AWX: 21.3.0

This job is connecting to ~30 linux VMs (inventory hosts) and from each VM, contacting ~100 network devices to get output of 3 commands.
The output is being stored in a dictionary per inventory host.

The job runs fine with fewer network devices (up to about 90), but it always fails at 100.

As the error message suggests, the issue does not appear to be with the network, device access, or anything on the target side.
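
For reference, the fan-out described above roughly follows this pattern. This is a minimal sketch; the group name, device list variable, command, and fact names are illustrative assumptions, not the actual playbook:

```yaml
# Minimal sketch of the fan-out described above. Group, variable, and
# command names are illustrative assumptions, not the actual playbook.
- hosts: jump_vms                     # ~30 inventory hosts
  gather_facts: false
  tasks:
    - name: Run a show command against each network device behind this VM
      ansible.builtin.command: ssh {{ item }} "show version"
      loop: "{{ network_devices }}"   # ~100 devices per inventory host
      register: device_output
      changed_when: false

    - name: Store the output in a dictionary keyed by device
      ansible.builtin.set_fact:
        device_results: >-
          {{ dict(device_output.results | map(attribute='item')
             | zip(device_output.results | map(attribute='stdout'))) }}
```

Note that registering ~100 results per host across ~30 hosts produces a large volume of job output, which is relevant to the log-size discussion below.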

Expected results

The play runs smoothly and the job finishes as expected.

Actual results

Job fails with error message:
Task was marked as running but was not present in the job queue, so it has been marked as failed.

Additional information

No response

@fosterseth
Copy link
Member

@deep7861 you may be running into the k8s max container log issue. How you change this max log size varies depending on your k8s cluster type, but here is a thread that explains it a bit: #11338 (comment)
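
On a cluster where you control the kubelet configuration directly, the rotation limits look roughly like this (the field names are the upstream KubeletConfiguration ones; on AKS the equivalent knobs are exposed through the node pool's custom kubelet configuration, and the values here are illustrative):

```yaml
# Sketch: raising kubelet's container log rotation limits.
# Values are illustrative; the upstream defaults are
# containerLogMaxSize: 10Mi and containerLogMaxFiles: 5.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerLogMaxSize: 100Mi
containerLogMaxFiles: 5
```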

The other thing to look into is the receptor reconnect option: ansible/receptor#683 (comment)
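
Per that thread, the reconnect support is controlled by the RECEPTor_KUBE_SUPPORT_RECONNECT environment variable on the execution environment; a rough sketch of enabling it through the AWX operator spec follows (this assumes a receptor/operator version that ships the feature from ansible/receptor#683):

```yaml
# Sketch: enabling receptor's kube API reconnect support via the AWX
# operator spec. Assumes a receptor/operator version with the feature
# from ansible/receptor#683.
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  ee_extra_env: |
    - name: RECEPTOR_KUBE_SUPPORT_RECONNECT
      value: enabled
```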

@deep7861
Author

deep7861 commented Aug 8, 2023

@fosterseth Thank you for looking into this issue.
While trying to track down the log size relation, I happened to notice some strange behavior.
In some of the posts you mentioned, I saw a suggestion to check the 'result_traceback' value from /api/v2/jobs/job_id for the failed job.
Now, when I try to do that, the page doesn't load. Here is what I get:
[screenshot: /api/v2/jobs/job_id page failing to load]

When I try to look up that job in the regular AWX UI, it fails as well:
[screenshot: AWX UI error when opening the same job]

When this error occurs, I see the following log from the web container:

```
2023/08/08 15:28:49 [error] 33#33: *189 upstream prematurely closed connection while reading response header from upstream, client: 10.244.7.25, server: _, req…
10.244.7.25 - - [08/Aug/2023:15:28:49 +0000] "GET /api/v2/unified_jobs/?name__icontains=ine_lm&not__launch_type=sync&order_by=-finished&page=1&page_size=20 HTT…
DAMN ! worker 5 (pid: 38) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 5 (new pid: 70)
mounting awx.wsgi:application on /
WSGI app 0 (mountpoint='/') ready in 1 seconds on interpreter 0x7636d0 pid: 70 (default app)
```

Do we know why this is happening?

@bpedersen2
Contributor

#9594
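
For what it's worth, a uWSGI worker "killed by signal 9" is typically the kernel OOM killer. If the web container is hitting its memory limit, the operator exposes that limit as web_resource_requirements; the sketch below uses assumed placeholder sizes, not tuned recommendations or a confirmed fix:

```yaml
# Sketch: raising the web container's memory limit via the AWX operator
# spec, on the assumption that the SIGKILL came from the OOM killer.
# The sizes below are placeholders, not tuned recommendations.
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
spec:
  web_resource_requirements:
    requests:
      memory: 2Gi
    limits:
      memory: 4Gi
```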
