-
Notifications
You must be signed in to change notification settings - Fork 16.4k
use 1 worker for log server #54510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use 1 worker for log server #54510
Conversation
fixes stacktraces of the form in standalone & running directly for the components that launch logserver under a multiprocessing process: scheduler | Process SpawnProcess-1:1: scheduler | Traceback (most recent call last): scheduler | File "/opt/bb/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap scheduler | self.run() scheduler | File "/opt/bb/lib/python3.10/multiprocessing/process.py", line 108, in run scheduler | self._target(*self._args, **self._kwargs) scheduler | File "/mnt/dev/repos/airflow/.venv/lib/python3.10/site-packages/uvicorn/_subprocess.py", line 73, in subprocess_started scheduler | sys.stdin = os.fdopen(stdin_fileno) # pragma: full coverage scheduler | File "/opt/bb/lib/python3.10/os.py", line 1030, in fdopen scheduler | return io.open(fd, mode, buffering, encoding, *args, **kwargs) scheduler | OSError: [Errno 9] Bad file descriptor there is a bug in uvicorn that causes a crash when uvicorn.run is being spawned from a process that itself was spawned by multiprocessing, as we do with serve_logs: Kludex/uvicorn#2679 until uvicorn itself fixes what it tries to do with stdin, just use 1 worker instead of 2 to avoid the codepath that causes the break.
|
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
|
|
@jason810496 curious - any reason you chose 2 workers specifically for this when you refactored? |
Airflow use 2 workers before I refactor from Flask to FastAPI with uvicorn. |
|
By the way, how can I reproduce this error? |
| # This is why we split the instantiation of log server to a separate module | ||
| # | ||
| # https://github.com/encode/uvicorn/blob/374bb6764e8d7f34abab0746857db5e3d68ecfdd/docs/deployment/index.md?plain=1#L50-L63 | ||
| uvicorn.run("airflow.utils.serve_logs.log_server:app", host=host, port=port, workers=1, log_level="info") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe which version of uvicorn will cause the problem you described above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah gotcha. i put a simple reproducer in Kludex/uvicorn#2679 which is just straight uvicorn
running airflow standalone on main right now also repros for me on rhel 9 with uvicorn 0.35.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
running airflow standalone on main
Thanks for the reply.
Will it raise the same error for just released Airflow 3.0.4 standalone as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just checked, no problem on 3.0.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks for the confirmation.
It would be great to raise another issue to track this behavior.
This means there might not be uvicorn issue but other changes in Airflow that cause this error.
Since there is not problem on 3.0.4 but there is error on latest main branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mphillips81@mmp-pw-dev /tmp/af3/.venv/lib/python3.10/site-packages/airflow % grep -rsi 'uvicorn'
cli/commands/api_server_command.py:import uvicorn
cli/commands/api_server_command.py: log.warning("Running in dev mode, ignoring uvicorn args")
cli/commands/api_server_command.py: Running the uvicorn with:
cli/commands/api_server_command.py: uvicorn.run(
Binary file cli/commands/__pycache__/api_server_command.cpython-310.pyc matches
i think uvicorn is only being hit from api-server in the released 3.0.4 - the problem in main is the uvicorn.run is happening wrapped by a multiprocessing.Process when log-server is spawned - those changes aren't in 3.0.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mattp- for helping out!
Sorry for the delay response, and this issue was just addressed in Fix scheduler serve_logs subprocess file descriptor errors #55443 so I will close your PR.
Thanks for your understanding.
fixes stacktraces of the form in standalone & running directly for the components that launch logserver under a multiprocessing process:
there is a bug in uvicorn that causes a crash when uvicorn.run is being spawned from a process that itself was spawned by multiprocessing, as we do with serve_logs: Kludex/uvicorn#2679
until uvicorn itself fixes what it tries to do with stdin, just use 1 worker instead of 2 to avoid the codepath that causes the break.