-
Notifications
You must be signed in to change notification settings - Fork 14.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use gunicorn to serve logs generated by worker #17591
Conversation
from flask import Flask, abort, request, send_from_directory | ||
from itsdangerous import TimedJSONWebSignatureSerializer | ||
from setproctitle import setproctitle | ||
|
||
from airflow.configuration import conf | ||
|
||
|
||
def flask_app(): | ||
flask_app = Flask(__name__) | ||
def create_app(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow convention supported by flask. https://github.com/pallets/flask/blob/afc13b9390ae2e40f4731e815b49edc9ef52ed4b/src/flask/cli.py#L63
def flask_app(): | ||
flask_app = Flask(__name__) | ||
def create_app(): | ||
flask_app = Flask(__name__, static_folder=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't support this feature, so we can safely turn it off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain a bit more about this change, why do we need it?
Yeah. Agree with @kaxil I figured it is after seeing this slack conversation: https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1626866127472700 - but commit message why we are doing it and what problem are we solving (what was the effect of running the dev server rather than gunicorn). |
app.run(host='0.0.0.0', port=worker_log_server_port) | ||
options = { | ||
'bind': f"0.0.0.0:{worker_log_server_port}", | ||
'workers': 2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need 2 workers or do away with just 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 should work too, but I configured 2 to failover in case one worker had problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah 2 is better just in case of memory errors/crashes. Just one small nit (in case somene has problems with memory usage etc.) i think it would be great to mention in the docs of worker that we are using Gunicorm and that the configuration options can be overridden by GUNiCORN_CMD_ARGS env variable https://docs.gunicorn.org/en/latest/settings.html#settings
It's not at all obvious from docs without looking at the code now that wlwe have separate Gunicorm processes forked and that you can configure their behaviour via ENV vars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Added docs.
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease. |
@@ -107,3 +107,16 @@ External Links | |||
When using remote logging, users can configure Airflow to show a link to an external UI within the Airflow Web UI. Clicking the link redirects a user to the external UI. | |||
|
|||
Some external systems require specific configuration in Airflow for redirection to work but others do not. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Flask build-in webserver is not intended to meet security and performance requirements for a production server. It should be used only for development.
We even see warnings when we start worker.
For production deployment, we should production-ready WSGI server. In this contribution, I used Gunicorn because it is already used by the
airflow webserver
command, so we don't need to add a dependency.^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.