@amoghrajesh
Contributor

@amoghrajesh amoghrajesh commented Jan 7, 2026

closes: #60403

Motivation

The Task SDK imports ProvidersManager from Airflow core in a few places, which blocks client-server separation.

The bigger problem, and the better way to solve this issue, is that ProvidersManager is a monolithic structure that forces workers to load infrastructure components they will never use (executors, auth managers). Workers end up holding data they do not need, which also increases instance size for no benefit.

Simple example:

# Workers need:
pm.hooks  # For task execution
pm.taskflow_decorators  # For @task.docker etc
pm.filesystem_module_names  # For filesystem I/O

# They certainly don't need
pm.executor_class_names
pm.auth_managers  # Only the API server needs this

So my goal here is to enable client-server separation by splitting the runtime resources of providers (hooks, taskflow_decorators) from the others that only airflow-core needs.

Approach

A detailed table of what I think should belong where:

| Property | ProvidersManagerRuntime (SDK) | ProvidersManager (Core) | Purpose |
|---|---|---|---|
| hooks | Primary | Deprecated | Connection hooks for external systems |
| taskflow_decorators | Primary | Deprecated | Task decorators (@task.docker, etc.) |
| filesystem_module_names | Primary | Deprecated | File I/O implementations |
| asset_factories | Primary | Deprecated | Asset creation functions |
| asset_uri_handlers | Primary | Deprecated | URI normalization |
| asset_to_openlineage_converters | Primary | Deprecated | OpenLineage integration |
| auth_managers | No | Yes | Authentication managers |
| executor_class_names | No | Yes | Executor discovery |
| secrets_backend_class_names | No | Yes | Secrets backends |
| logging_class_names | No | Yes | Log handlers |
| queue_class_names | No | Yes | Queue managers |
| cli_command_functions | No | Yes | CLI commands |
| cli_command_providers | No | Yes | CLI command providers |
| providers | Yes | Yes | Provider metadata (both need it) |
| notification | No | Yes | Notifiers |
| trigger | No | Yes | Deferrable triggers |
| dialects | No | Yes | Database dialects |
| plugins | No | Yes | Provider plugins |
| extra_links_class_names | No | Yes | UI extra links |
| connection_form_widgets | No | Yes | Connection form UI |
| field_behaviours | No | Yes | Connection field behaviors |
| provider_configs | No | Yes | Provider configuration |
| already_initialized_provider_configs | No | Yes | Already initialized configs |

I decided to split the current ProvidersManager into two parts: one that handles runtime resources and one that handles what core needs.

What's done:

  • Split the providers manager: ProvidersManagerRuntime in task-sdk serves runtime resources, and ProvidersManager as we know it today stays to serve server components
  • Extracted ~600 lines of common code used by both into a shared library
  • Hacky deprecation delegation in ProvidersManager (will make it better)
  • Basic testing confirms everything works
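
The deprecation delegation mentioned in the list above could look roughly like the following. This is a minimal, self-contained sketch and not the code in this PR: `RuntimeManagerSketch`, `CoreManagerSketch`, and the dictionary contents are hypothetical stand-ins, and only the standard-library `warnings` module is used.

```python
import warnings


class RuntimeManagerSketch:
    """Hypothetical stand-in for ProvidersManagerRuntime (not the real class)."""

    def __init__(self):
        # Placeholder data; a real manager would discover hooks from installed providers.
        self._hooks = {"postgres": "example.hooks.PostgresHook"}

    @property
    def hooks(self):
        return self._hooks


class CoreManagerSketch:
    """Hypothetical stand-in for the core ProvidersManager."""

    def __init__(self):
        self._runtime = RuntimeManagerSketch()

    @property
    def hooks(self):
        # Warn the caller, then delegate to the runtime manager.
        warnings.warn(
            "CoreManagerSketch.hooks is deprecated. "
            "Use RuntimeManagerSketch.hooks instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the calling code
        )
        return self._runtime.hooks
```

With `stacklevel=2` the warning is reported at the caller's line, which matches the style of the IPython output shown in the backward compatibility section, where the warning points at the input cell.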

The Task SDK will do this from now on:

from airflow.sdk.providers_manager_runtime import ProvidersManagerRuntime

pm = ProvidersManagerRuntime()
pm.hooks  # connection handling
pm.taskflow_decorators  # @task decorators
pm.filesystem_module_names  # file I/O
pm.asset_uri_handlers  # Asset scheme handlers

Testing

Backward compatibility

Existing code continues to work with deprecation warnings:

from airflow.providers_manager import ProvidersManager
pm = ProvidersManager()
pm.hooks
<ipython-input-3-95b9e89432f5>:1 DeprecationWarning: ProvidersManager.hooks is deprecated. Use ProvidersManagerRuntime.hooks from task-sdk instead.
Out[3]: <airflow._shared.providers_discovery.LazyDictWithCache at 0x1113a8040>
pm.hooks.get("postgres")
<ipython-input-4-9891a5daf791>:1 DeprecationWarning: ProvidersManager.hooks is deprecated. Use ProvidersManagerRuntime.hooks from task-sdk instead.
Out[4]: HookInfo(hook_class_name='airflow.providers.postgres.hooks.postgres.PostgresHook', connection_id_attribute_name='postgres_conn_id', package_name='apache-airflow-providers-postgres', hook_name='Postgres', connection_type='postgres', connection_testable=True, dialects=[])

CLI

I created a simple script and ran it; output: providers-output.txt

Impact on consumers

For Providers

Nothing for now. Watch out for deprecation warnings and migrate to the runtime properties from the Task SDK.

For Airflow developers / DAG authors

If you are using ProvidersManager in your DAG code, you shouldn't be in the first place. If you really are, watch for the deprecation warnings on runtime properties and migrate:

from airflow.sdk.providers_manager_runtime import ProvidersManagerRuntime
pm = ProvidersManagerRuntime()
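
One way to find remaining uses of the deprecated properties is to escalate `DeprecationWarning` to an error while running tests. The sketch below uses only the standard-library `warnings` module; `legacy_hooks_access` and `find_deprecated_use` are hypothetical helpers written for illustration, with the first standing in for code that still reads `ProvidersManager().hooks`.

```python
import warnings


def legacy_hooks_access():
    # Hypothetical placeholder for code that still touches a deprecated property.
    warnings.warn("ProvidersManager.hooks is deprecated.", DeprecationWarning, stacklevel=2)
    return {}


def find_deprecated_use(func):
    """Return the warning message if func triggers a DeprecationWarning, else None."""
    with warnings.catch_warnings():
        # Escalate deprecation warnings to exceptions inside this block only.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning as exc:
            return str(exc)
    return None
```

Because the filter is set inside `warnings.catch_warnings()`, the stricter behavior is scoped to the check and the previous filters are restored afterwards.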

What's next

Next phase:

  • Move UI metadata to yaml format
  • Extract connection form widgets from hook methods to provider.yaml
  • Migrate all 82 providers
  • Update API server to read from YAML (eliminate hook imports)

Benefits:

  • API server starts without importing provider code
  • UI metadata purely declarative (YAML)

@amoghrajesh amoghrajesh marked this pull request as ready for review January 8, 2026 14:28
@amoghrajesh amoghrajesh self-assigned this Jan 12, 2026
@kaxil
Member

kaxil commented Jan 14, 2026

cc @potiuk Would be good to get your review on this one, you'd know the ProvidersManager best

Member

@kaxil kaxil left a comment

Close enough but I'd wait for @potiuk to do a review too

@amoghrajesh
Contributor Author

Agreed, I would wait for a review from @potiuk too

Member

@jason810496 jason810496 left a comment

Thanks for the PR. LGTM overall!

Member

@potiuk potiuk left a comment

Very nice and clean - just one question and one suggestion. Great Job @amoghrajesh on making it closer to task-isolation and provider split "north star".

@amoghrajesh
Contributor Author

Thanks for the review folks. Merging this one.

@amoghrajesh amoghrajesh merged commit c93cb32 into apache:main Jan 19, 2026
228 checks passed
@amoghrajesh amoghrajesh deleted the split-providers-manager branch January 19, 2026 11:49
jason810496 pushed a commit to jason810496/airflow that referenced this pull request Jan 22, 2026
…pache#60218)

Splitting providers manager: ProvidersManagerRuntime in task-sdk for runtime resources and ProvidersManager as we know today stays to serve server components
suii2210 pushed a commit to suii2210/airflow that referenced this pull request Jan 26, 2026
…pache#60218)

Splitting providers manager: ProvidersManagerRuntime in task-sdk for runtime resources and ProvidersManager as we know today stays to serve server components

Development

Successfully merging this pull request may close these issues.

Break ProvidersManager into infrastructure and runtime managers

6 participants