Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use async db calls in WorkflowTrigger #38689

Merged

Conversation

stevenschaerer
Copy link
Contributor

This closes: #38672

I opted to slightly extend some existing unit tests rather than write mostly identical new ones. If the @sync_to_async decorator and the awaits before self._get_count are removed the tests will fail because the trigger and dummy tasks are executed sequentially.

Should I also remove TaskStateTrigger in this PR? It is not used anymore outside of tests.

Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, LGTM

@uranusjr
Copy link
Member

uranusjr commented Apr 2, 2024

I don’t think we can just remove TaskStateTrigger, but let’s deprecate it for future removal.

It would also be a good idea to implement an version of _get_count to use the async SQLAlchemy API if available, and fallback to the sync_to_async version otherwise. Most people likely won’t have the async component (pip install sqlalchemy[asyncio]) but we should start to slowly add support to it in triggerers.

@hussein-awala
Copy link
Member

I don’t think we can just remove TaskStateTrigger, but let’s deprecate it for future removal.

It would also be a good idea to implement an version of _get_count to use the async SQLAlchemy API if available, and fallback to the sync_to_async version otherwise. Most people likely won’t have the async component (pip install sqlalchemy[asyncio]) but we should start to slowly add support to it in triggerers.

I'm working on a draft AIP to add this support, I tested it a few months ago and the results were promising: #36504

@hussein-awala
Copy link
Member

Also some async features are only available in sqlalchemy 2.0, so maybe we need to focus on this migration first.

tests/triggers/test_external_task.py Outdated Show resolved Hide resolved
tests/triggers/test_external_task.py Outdated Show resolved Hide resolved
@stevenschaerer
Copy link
Contributor Author

I don’t think we can just remove TaskStateTrigger, but let’s deprecate it for future removal.

It would also be a good idea to implement an version of _get_count to use the async SQLAlchemy API if available, and fallback to the sync_to_async version otherwise. Most people likely won’t have the async component (pip install sqlalchemy[asyncio]) but we should start to slowly add support to it in triggerers.

I have deprecated TaskStateTrigger. Following the other comments I'll leave using async SQLAlchemy for a separate PR.

@Taragolis Taragolis merged commit e6eec0c into apache:main Apr 4, 2024
41 checks passed
@Taragolis Taragolis added the type:bug-fix Changelog: Bug Fixes label Apr 4, 2024
@Taragolis Taragolis added this to the Airflow 2.9.1 milestone Apr 4, 2024
jedcunningham pushed a commit that referenced this pull request Apr 26, 2024
* Use async db calls in WorkflowTrigger

* address PR comments

* deprecate TaskStateTrigger with proper category

(cherry picked from commit e6eec0c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:Triggerer type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid to use sync (with blocking io) functions into the WorkflowTrigger
4 participants