Airflow Celery Worker logs inaccessible #18239
Comments
Thanks for opening your first issue here! Be sure to follow the issue template!
In airflow.cfg set this
If it still doesn't work, ensure the worker containers expose port 8793 in the Kubernetes template. The log existence check failed because the webserver first tries to find the logs locally, but they are stored on the worker; it then falls back to retrieving them from the worker containers over REST.
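For what it's worth, one way to test that fallback path by hand is to issue the same request the webserver would make, from inside the webserver pod. This is only a connectivity sketch; the hostname comes from the error message in this issue and the log path is a placeholder to substitute with your own values.

```python
# Minimal connectivity check, mimicking the webserver's REST fallback fetch.
# Replace the placeholders with values from your own failing log URL.
import requests

worker_host = "tsc-aflow-orca"  # the host the webserver resolved for the task
log_path = "<dag_id>/<task_id>/2021-09-14T12:13:32.383510+00:00/1.log"  # placeholder

url = f"http://{worker_host}:8793/log/{log_path}"
resp = requests.get(url, timeout=10)
print(resp.status_code)  # 503 here means the request is not reaching a serve_logs process
print(resp.text[:500])
```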
Hello @dimon222 - The workers are sitting behind a Kubernetes service, so the logs need to be accessible via the service name. To support this, I used a different hostname callable and gave it the DNS name of the service. I do not see how that could be a problem, though. You can picture the setup like this: master.abc.com -> Airflow master, and I need the logs to be accessible behind worker.abc.com rather than via an IP address.
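For reference, the hostname callable described above would look roughly like the sketch below. The module path, function name, and DNS name are placeholders, not the actual code from this deployment; the callable is wired up via the [core] hostname_callable option in airflow.cfg and must return the host the webserver will later use to build the log-fetch URL.

```python
# my_company/airflow_hostname.py  (hypothetical module)
#
# Referenced from airflow.cfg, e.g.:
#   [core]
#   hostname_callable = my_company.airflow_hostname.get_service_hostname
#
# Every task instance then records this value as its hostname, and the
# webserver fetches logs from http://worker.abc.com:8793/log/...

def get_service_hostname() -> str:
    # Assumption: a single Kubernetes service fronting the Celery workers.
    return "worker.abc.com"
```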
I'm using the Helm chart and experiencing something similar. When using …
I don't think this is going to work, since each Airflow worker runs a background Flask application which serves the logs for tasks run on that worker at the specified logging port (in this case 8793); see airflow/utils/serve_logs.py, lines 70 to 72 at commit 9b3ed1f.
By running the workers behind a load balancer, you're removing the webserver's ability to specify which server the logs are stored on. I suspect that if you reload the page enough times it may eventually work, when your request happens to be routed to the correct worker by the load balancer.
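To make the point above concrete, each worker's log server behaves roughly like the simplified sketch below (an illustration, not the actual serve_logs.py code): it serves files from that worker's local log directory, so only the worker that ran the task can answer the request.

```python
# Simplified illustration of a per-worker log server (not Airflow's actual code).
# It can only return log files that exist on this worker's own filesystem.
from flask import Flask, send_from_directory

LOG_DIR = "/opt/airflow/logs"  # assumption: the worker's base_log_folder

app = Flask(__name__)

@app.route("/log/<path:filename>")
def serve_log(filename):
    # A request routed (e.g. by a load balancer) to a worker that did not run
    # the task will not find the file here and will fail.
    return send_from_directory(LOG_DIR, filename)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8793)
```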
I would assume "shared logs" implies that the storage is shared (a mounted volume, a PVC in Kubernetes, etc.). Unless something on the worker itself restricts it from serving anything beyond what was allocated to that specific worker?
Hi @dimon222, I set …
@changxiaoju |
Thank you, but I am not using CeleryExecutor; I use LocalExecutor instead. What may cause the error then?
If the error still includes the port in the same way as in the first message, you could try exposing that port on your scheduler. If not, I suspect you have some unrelated exception and should probably open a separate ticket for that.
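As a quick diagnostic for the suggestion above: the host the webserver tries to fetch logs from is whatever airflow.utils.net.get_hostname() returns on the machine that ran the task (it honours hostname_callable). Running it inside the scheduler container (LocalExecutor) or a worker container (CeleryExecutor) shows which hostname task instances will record; this is a sketch of that check.

```python
# Run inside the scheduler (LocalExecutor) or a worker (CeleryExecutor) container.
# The webserver builds its log URL as
#   http://<this value>:<worker_log_server_port>/log/...
from airflow.utils.net import get_hostname

print(get_hostname())
```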
Please check whether the issue still happens on the latest Airflow version (there has been some work related to this).
This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.
This issue has been closed because it has not received a response from the issue author.
Apache Airflow version
2.1.3 (latest released)
Operating System
Python 3.6 Apache Airflow Docker
Versions of Apache Airflow Providers
2.1.3
Deployment
Other Docker-based deployment
Deployment details
Kubernetes-based deployment: workers and master run in Kubernetes as pods. Logs are accessed via a NodePort Service.
What happened
*** Log file does not exist: /xxx/airflow/home/logs/xxxx/2021-09-14T12:13:32.383510+00:00/1.log
*** Fetching from: http://tsc-aflow-orca:8793/log/xxxx/2021-09-14T12:13:32.383510+00:00/1.log
*** Failed to fetch log file from worker. 503 Server Error: Service Unavailable for url: http://tsc-aflow-orca:8793/log/xxxx/2021-09-14T12:13:32.383510+00:00/1.log
For more information check: https://httpstatuses.com/503
Checked the worker pod: the logs exist. However, it seems the worker's web service is unable to access the logs.
What you expected to happen
Worker logs worked fine with the same setup on the older version, v1.10.12.
How to reproduce
Run a sample DAG with the webserver, scheduler, and one worker running in Kubernetes; no custom setup.
Anything else
No response
Are you willing to submit PR?
Code of Conduct