Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dag warning endpoint permissions #34355

Merged
merged 4 commits into from
Sep 15, 2023

Conversation

hussein-awala
Copy link
Member

No description provided.

@hussein-awala hussein-awala added this to the Airflow 2.7.2 milestone Sep 13, 2023
@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Sep 13, 2023
query = query.where(DagWarningModel.dag_id == dag_id)
if warning_type:
query = query.where(DagWarningModel.warning_type == warning_type)
total_entries = get_query_count(query, session=session)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

useless query, the count could be calculated from the list

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually not useless, because the is limit in the query, I will fix it

@hussein-awala hussein-awala marked this pull request as draft September 13, 2023 22:55
@hussein-awala
Copy link
Member Author

I thought about using IN (list of accessible dags), but after some search, it looks like IN has limits in the different engines:

  1. MySQL:

    • The maximum number of values in an IN list is determined by the value of the max_allowed_packet server variable. This variable specifies the maximum size of a single packet or any generated/intermediate string.
    • In practical terms, you can typically have several thousand values in an IN list.
  2. PostgreSQL:

    • There is no strict limit on the number of values you can use in an IN list. However, very long lists may be subject to performance issues.
  3. SQLite:

    • The maximum number of parameters in an IN clause is determined by the maximum number of host parameters in a single SQL statement, which is typically limited to several hundred.
  4. SQL Server:

    • The maximum number of values in an IN list is determined by the maximum number of expressions that can be included in a list (which is 65,536).
  5. Oracle:

    • The maximum number of expressions in an IN list is 1,000.

I will try to find a way to join the table with the dag table.

@Taragolis Taragolis marked this pull request as ready for review September 13, 2023 23:03
@Taragolis Taragolis marked this pull request as draft September 13, 2023 23:05
@hussein-awala
Copy link
Member Author

Looks like there is no other solution, we do the same thing for dag endpoint:

readable_dags = get_airflow_app().appbuilder.sm.get_accessible_dag_ids(g.user)
dags_query = dags_query.where(DagModel.dag_id.in_(readable_dags))

for dag_warning in dag_warnings
if get_airflow_app().appbuilder.sm.can_read_dag(dag_warning.dag_id, g.user)
]
total_entries = len(dag_warnings)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is supposed to show the actual total, not the paginated value

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. This is the total not paginated

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, I changed it but I forgot to push my commit. Just pushed it

Comment on lines 70 to 74
dag_warnings = [
dag_warning
for dag_warning in dag_warnings
if get_airflow_app().appbuilder.sm.can_read_dag(dag_warning.dag_id, g.user)
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filtering should happen before limit, otherwise this would show a weird number of entries if some of the warnings are not viewable by the current user, even possibly an empty list.

@@ -57,7 +60,12 @@ def get_dag_warnings(
allowed_filter_attrs = ["dag_id", "warning_type", "message", "timestamp"]
query = select(DagWarningModel)
if dag_id:
if not get_airflow_app().appbuilder.sm.can_read_dag(dag_id, g.user):
Copy link
Contributor

@ephraimbuddy ephraimbuddy Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we should have the DAG permission in the decorator instead of handling this by ourselves. My vote would go to just adding (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG) to the list of permissions.
That should solve this. Assuming it's dag_ids instead of dag_id, then we can do as you're doing now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see that if dag_id is not provided, then it gets all dag warning, my bad.

@hussein-awala hussein-awala marked this pull request as ready for review September 14, 2023 19:45
Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
@hussein-awala hussein-awala merged commit 3570bbf into apache:main Sep 15, 2023
43 checks passed
@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Oct 3, 2023
ephraimbuddy pushed a commit that referenced this pull request Oct 5, 2023
* Fix dag warning endpoint permissions

* update the query to have an accurate result for total entries and pagination

* add unit tests

* Update test_dag_warning_endpoint.py

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>

---------

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
(cherry picked from commit 3570bbf)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants