Migrate to connexion v3 #37638
Hi @vincbeck, we have defined the scope of Task 1 in the PR description. Thank you for your previous comments and suggestions on completing this task. One part of the proposed solution is still unclear to us.
In a recent comment, RobbeSneyders suggested
So, should we still consider moving
Yes
Refinement: Returning

    app = create_app(testing=conf.getboolean("core", "unit_test_mode"))
    app.run(
        debug=True,

Usage 2: airflow/airflow/cli/commands/internal_api_command.py, lines 74 to 76 in 0f4babe

    app = create_app(testing=conf.getboolean("core", "unit_test_mode"))
    app.run(
        debug=True,  # nosec
Problem
Due to the upgrade to Connexion v3, we cannot access blueprints (they moved the blueprint registration code inside their codebase). We used the returned blueprint to make exemptions so that HTTP(S) requests without a CSRF token in the header are accepted. When the auth token is in the header, the client doesn't include a CSRF token. That's why we get a "csrf token missing" error. @RobbeSneyders suggested utilizing the middleware library asgi-csrf to do the same without using blueprints.
This is sample code to make the CSRF-token exemption.
I assume the scope :)? I believe the scope is the base URL of the Airflow webserver (not 100% sure how asgi-csrf does it, but that's what I understand it should be). The CSRF tokens we have are generated in the webserver views, at the "base URL" (and anything deeper in the path). Those CSRF tokens are then used by the browser to make the calls to the API.
We should use https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#secret-key. This is usually done with: conf.get_mandatory_value("webserver", "secret_key")
Yes, it's scope. I have this now.
After using asgi_csrf, I get a different error. I hope I'm heading in the right direction to solve the issue. Here is my pull request to Sudipto's forked repo.
It is possible to avoid this error using the following tweak for the time being. asgi-csrf looks for

    # asgi-csrf skip_if_scope
    flask_app.config['SECRET_KEY'] = conf.get_mandatory_value("webserver", "secret_key")

    def skip_api_paths(scope):
        # Skip CSRF checks for REST API requests.
        return scope["path"].startswith("/api/v1")

    asgi_csrf(
        flask_app,
        signing_secret=conf.get_mandatory_value("webserver", "secret_key"),
        skip_if_scope=skip_api_paths,
    )

@potiuk
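The `skip_if_scope` callback above is a plain predicate over the ASGI scope dictionary, so it can be tried out without running a server. The following is a minimal sketch, not the code from the PR: `make_skip_predicate` is a hypothetical helper name, and the stricter prefix match and the tolerance for scopes without a `"path"` key are assumptions added here for illustration.

```python
from typing import Any, Callable, Dict


def make_skip_predicate(prefix: str) -> Callable[[Dict[str, Any]], bool]:
    """Build a predicate usable as a skip_if_scope-style callback (hypothetical helper)."""
    normalized = prefix.rstrip("/")

    def skip(scope: Dict[str, Any]) -> bool:
        # Scopes such as "lifespan" carry no "path" key, so fall back to "".
        path = scope.get("path", "")
        # Match the prefix itself or anything below it, but not "/api/v10".
        return path == normalized or path.startswith(normalized + "/")

    return skip


skip_api_paths = make_skip_predicate("/api/v1")
```

Compared with a bare `startswith("/api/v1")`, this variant does not exempt a sibling path such as `/api/v10/...`; whether that distinction matters depends on which routes the application actually serves.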
We might be able to handle this with a before_request hook:

    @connexion_app.app.before_request
    def before_request():
        """Exempts the view function associated with '/api/v1' requests from CSRF protection."""
        if request.path.startswith("/api/v1"):  # TODO: make sure this path is correct
            view_function = flask_app.view_functions.get(request.endpoint)
            if view_function:
                # Exempt the view function from CSRF protection
                connexion_app.app.extensions["csrf"].exempt(view_function)

I implemented it here, asking for a review @vincbeck @potiuk @Satoshi-Sh
Nicely done, @sudiptob2. I checked it with
Once the TODO about the scope is done, we can move on to the second bug.
cc: @VladaZakharova - just adding you for awareness :)
Once I do a bit of homework on it myself :)
Hi @Satoshi-Sh,

    breeze start-airflow --dev-mode --load-example-dags --backend postgres

This has to be handled in subtask 1 so that reviewers can easily review it.
Hey @RobbeSneyders, thanks for the offer. We finally got the PR green (HURRAY!). What I think you could help with is validating some of our assumptions. Due to the length of this PR, its comments, and the number of commits in it, I will open a new PR, ask some concrete questions and explain our decisions there, and ask you for comments. I will involve other maintainers as well.
Continued in #39055
This is a huge PR, the result of over 100 commits made by a number of people in apache#36052 and apache#37638. It switches to Connexion 3 as the driving backend implementation for both the Airflow REST APIs and the Flask app that powers the Airflow UI.

It should be largely backwards compatible in the behaviour of both the APIs and the Airflow webserver views. However, due to decisions made by the Connexion 3 maintainers, it heavily changes the technology stack used under the hood:

1) The Connexion app is an ASGI-compatible, OpenAPI spec-first framework that uses ASGI as the interface between the webserver and the Python web application. ASGI is an asynchronous successor of WSGI.
2) Connexion itself uses Starlette to run asynchronous web services in Python.
3) We continue using the Gunicorn application server, which still uses the WSGI standard, which means that we can continue using Flask, and we are using the standard Uvicorn ASGI webserver that converts the ASGI interface to the WSGI interface of Gunicorn.

Some of the problems handled in this PR

There were two problems with session handling:

* get_session_cookie did not get the right cookie; it returned the "session" string. The right fix was to change cookie_jar into cookie.jar, because this is apparently where Starlette's TestClient holds the cookies (visible when you debug).
* The client does not accept the "set_cookie" method; it accepts cookies passed via the "cookies" dictionary. This is the usual httpx client (see https://www.starlette.io/testclient/), so we have to set the cookie directly in the get method to try it out.

Add "flask_client_with_login" for tests that need flask client

Some tests require functionality not available in the Starlette test client because they use features specific to the Flask test client. For those we have an option to get the Flask test client instead of the Starlette one.

Fix error handling for new Connexion 3 approach

Error handling for the Connexion 3 integration needed to be reworked.
The way it behaves is much the same as it works in main:

* for API errors, we get application/problem+json responses
* for UI errors, we have rendered views
* for redirection, we have the correct location header (it had been missing)
* the API error handler was not added as available middleware in the www tests

It should fix all test_views_base.py tests that were failing for lack of a location header on redirection.

Fix wrong response in tests_view_cluster_activity

The problem in the test was that the Starlette test client opens a new connection and starts a new session, while the Flask test client uses the same database session. The test did not show data because the data was not committed and the session was not closed, which also failed local SQLite tests with a "database is locked" error.

Fix test_extra_links

The tests were failing again because the dagrun created was not committed and the session was not closed. This worked with the Flask client, which accidentally used the same session, but did not work with the test client from Starlette. It also caused "database locked" in SQLite / local tests.

Switch to non-deprecated auth manager

Fix to test_views_log.py

This PR partially fixes sessions and the request parameter for test_views_log. Some tests are still failing, but for different reasons that are to be investigated.

Fix views_custom_user_views tests

The problem in those tests was that the check in the security manager was based on the assumption that the security manager was shared between the client and the test Flask application, because they were coming from the same Flask app. But when we use Starlette, the call goes to a newly started process and the user is deleted in the database, so the shortcut of checking the security manager did not work. The change is that we now check whether the user is deleted by calling /users/show (we need a new users READ permission for that). This way we go to the database and check that the user was indeed deleted.
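The "data was not committed" failures described above come down to transaction visibility between separate database connections. The following is a minimal sketch using only the standard library, assuming a throwaway SQLite file and a hypothetical `dag_run` table (not Airflow's real schema); it shows that a second connection does not see a row until the writing connection commits.

```python
import os
import sqlite3
import tempfile
from typing import Tuple


def visible_rows(db_path: str) -> int:
    """Count rows as seen from a brand-new connection (a separate session)."""
    reader = sqlite3.connect(db_path)
    try:
        return reader.execute("SELECT COUNT(*) FROM dag_run").fetchone()[0]
    finally:
        reader.close()


def demo() -> Tuple[int, int]:
    """Return the row count seen by a second connection before and after commit."""
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    writer = sqlite3.connect(db_path)
    try:
        writer.execute("CREATE TABLE dag_run (run_id TEXT)")  # hypothetical table
        writer.commit()
        # The INSERT opens a transaction that stays open until commit().
        writer.execute("INSERT INTO dag_run VALUES ('manual_1')")
        before = visible_rows(db_path)
        writer.commit()
        after = visible_rows(db_path)
        return before, after
    finally:
        writer.close()
        os.remove(db_path)
```

A client that shares the writer's session would see the row either way, which is consistent with the description above of tests passing with the Flask client and failing with the Starlette one.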
Fix test_task_instance_endpoint tests

There were two reasons the tests failed:

* when the Job was added to the task instance, the task instance was not merged in the session, which means that the commit did not store the added Job
* some of the tests were expecting a call with a specific session, and they failed because the session was different. Replacing the session with mock.ANY tells the mock assertion that this parameter can be anything; we will have a different session when the call is made with ASGI/Starlette

Fix parameter validation

* added a default value for the limit parameter across the board. Connexion 3 does not like it if the parameter has no default and we have not provided one, even if our custom decorator was adding it. Adding a default value and updating our decorator to treat None as `default` fixed a number of problems where limits were not passed
* swapped the OpenAPI specification for /datasets/{uri} and /datasets/events. Since `{uri}` was defined first, Connexion matched `events` with `{uri}` and chose parameter definitions from `{uri}`, not events

Fix test_log_endpoint tests

The problem here was that some sessions should be committed/closed, but also, in order to run it standalone, we wanted to create log templates in the database, as it relied implicitly on log templates created by other tests.

Fix test_views_dagrun, test_views_tasks and test_views_log

Fixed by switching to the Flask client for testing rather than Starlette. The Starlette client in this case has some side effects that also impact SQLite's session being created in a different thread and deleted with the close_all_sessions fixture.

Fix test_views_dagrun

Fixed by switching to the Flask client for testing rather than Starlette. The Starlette client in this case has some side effects that also impact SQLite's session being created in a different thread and deleted with the close_all_sessions fixture.
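The route-ordering problem described above (a literal `events` segment being captured by `{uri}`) is typical of any router that returns the first matching template. The sketch below is a generic first-match router written for illustration; `compile_route` and `first_match` are hypothetical names, and this is not Connexion's actual resolution code.

```python
import re
from typing import List, Optional


def compile_route(template: str) -> "re.Pattern[str]":
    """Turn a template such as /datasets/{uri} into an anchored regex."""
    # re.split with a capturing group alternates literal text and parameter names.
    parts = re.split(r"\{(\w+)\}", template)
    regex = "".join(
        re.escape(part) if index % 2 == 0 else f"(?P<{part}>[^/]+)"
        for index, part in enumerate(parts)
    )
    return re.compile(f"^{regex}$")


def first_match(routes: List[str], path: str) -> Optional[str]:
    """Return the first template, in definition order, that matches the path."""
    for template in routes:
        if compile_route(template).match(path):
            return template
    return None


# With the parameterized route first, the literal path is shadowed.
shadowed = first_match(["/datasets/{uri}", "/datasets/events"], "/datasets/events")
# With the literal route first, both paths resolve as intended.
resolved = first_match(["/datasets/events", "/datasets/{uri}"], "/datasets/events")
```

Under this model, listing the literal path before the parameterized one is enough to resolve the ambiguity, which matches the swap described in the commit message.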
Co-authored-by: sudipto baral <sudiptobaral.me@gmail.com>
Co-authored-by: satoshi-sh <satoss1108@gmail.com>
Co-authored-by: Maksim Yermakou <maksimy@google.com>
Co-authored-by: Ulada Zakharava <Vlada_Zakharava@epam.com>
Better API initialization, including vending of the API specification. The way paths are added and initialized is better (for example, FAB contributes its path via a new method in the Auth Manager). This also adds back-compatibility to the FAB auth manager so that it continues working on Airflow 2.9.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
This PR is created based on #36052

Todo

- `blueprint` in connexion v3

Task 1 - Refactor get_api_endpoints()

Problem Definition

Ref: GitHub Pull Request #36052, VladaZakharova commented on Jan 18

In the `init_api_auth_provider` method, we update the base path. However, the `blueprint` object obtained from `auth_mgr.get_api_endpoints(connexion_app)` will always be `None` if we are using Connexion v3.

Proposed solution

Ref: vincbeck commented on Jan 18

- Rename `get_api_endpoints` to `set_api_endpoints`. The return type should be updated to `None`. Documentation should be updated as well to something like "Set API endpoint(s) definition for the auth manager.". This is a breaking change, but nobody uses this interface yet, so it is a good time to do it.
- This piece of code (Ref: Migrate to connexion v3 #37638 (comment)), `flask_app.extensions["csrf"].exempt(blueprint)`, should be moved into the `set_api_endpoints` method using `appbuilder.app.extensions["csrf"].exempt(api.blueprint)`.
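The proposed interface change can be pictured with a small stand-alone sketch. Everything here is illustrative: `RecordingConnexionApp`, `BaseAuthManagerSketch`, `ExampleAuthManager`, the spec file name and the base path are hypothetical, and the real auth manager classes in Airflow may look different. The point is only that `set_api_endpoints` registers the API on the app it is given and returns `None`, so callers no longer depend on a returned blueprint.

```python
from typing import List, Optional, Tuple


class RecordingConnexionApp:
    """Hypothetical stand-in for a Connexion app; it only records add_api calls."""

    def __init__(self) -> None:
        self.registered: List[Tuple[str, Optional[str]]] = []

    def add_api(self, specification: str, base_path: Optional[str] = None) -> None:
        self.registered.append((specification, base_path))


class BaseAuthManagerSketch:
    """Hypothetical base class showing the renamed hook."""

    def set_api_endpoints(self, connexion_app: RecordingConnexionApp) -> None:
        """Set API endpoint(s) definition for the auth manager."""
        return None


class ExampleAuthManager(BaseAuthManagerSketch):
    """Hypothetical auth manager that contributes one API."""

    def set_api_endpoints(self, connexion_app: RecordingConnexionApp) -> None:
        # Register on the app directly instead of returning a blueprint.
        connexion_app.add_api("example_auth_api.yaml", base_path="/auth/example/v1")
```

Because nothing is returned, any per-API setup such as a CSRF exemption would have to happen inside the method or be handled at the middleware level, which is what the discussion in this PR is about.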
How to test

    python ./clients/python/test_python_client.py

Subtasks

- `base_paths.append(blueprint.url_prefix if blueprint.url_prefix else "")`
- `CSRF` exemption is correct. Ref: Migrate to connexion v3 #37638 (comment)
- `swagger-ui` installation. Ref: Migrate to connexion v3 #37638 (comment)
- `favicon.ico` problem in the swagger UI.

Task 2 - Replace environ_overrides argument

Problem Definition

Ref: GitHub Pull Request #36052, commented on Feb 6

Solution

- `{"REMOTE_USER": "user"}` instead of using the `environ_overrides` argument inside `testclient.method`. Accordingly, update the authentication part in `tests/test_utils/remote_user_api_auth_backend.py` to `user_id = request.headers.get("REMOTE_USER")`.
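The header-based lookup from Task 2 can be sketched without any web framework. This is a minimal illustration, assuming the headers arrive as a plain mapping; `get_remote_user` is a hypothetical helper, not the code in `remote_user_api_auth_backend.py`, and the case-insensitive comparison is an assumption about how the test client may normalize header names.

```python
from typing import Mapping, Optional


def get_remote_user(headers: Mapping[str, str]) -> Optional[str]:
    """Return the REMOTE_USER header value, or None when it is absent or empty."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in one case.
        if name.upper() == "REMOTE_USER":
            return value or None
    return None


# Test-side usage: pass the user as a request header.
user_id = get_remote_user({"REMOTE_USER": "user"})
```

Trusting a client-supplied header like this is only appropriate for a test backend, since any caller can set it.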