
Conversation

@ashb
Member

@ashb ashb commented Jul 24, 2025

Often remote logging is done using automatic instance profiles, but not
always. If you tried to configure a logger with a connection defined in the
metadata DB it would not have worked (it either caused the supervise job to
fail early, or just behaved as if the connection didn't exist, depending on
the hook's behaviour).

Unfortunately, the default connection ID that the various hooks use is not
easily discoverable, at least not from the outside (we can't look at
`remote.hook`, as for most log providers that would try to load the
connection, failing in exactly the way we are trying to fix), so I updated
the log config module to keep track of the default conn ID for the modern
log providers.

Once we have the connection ID (or at least a good idea that we've got the
right one), we pre-emptively check the secrets backends for it, and if it is
not found there we load it from the API server. Either way, if we find a
connection we put it in the env variable so that it is available.

The reason we use this approach is that we are running in the supervisor
process itself, so SUPERVISOR_COMMS is not, and cannot be, set yet.
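The lookup order described above can be sketched roughly as follows. This is a hedged illustration, not the code in this PR: the helper names (`lookup_in_secrets_backends`, `fetch_from_api_server`) and the stub data are assumptions; only the secrets-backends-then-API-server order and the `AIRFLOW_CONN_<ID>` env-var convention come from the description.

```python
import os
from typing import Optional


def lookup_in_secrets_backends(conn_id: str) -> Optional[str]:
    """Stand-in for checking the configured secrets backends (assumed name)."""
    _fake_backend: dict = {}  # pretend no secrets backend has this connection
    return _fake_backend.get(conn_id)


def fetch_from_api_server(conn_id: str) -> Optional[str]:
    """Stand-in for asking the API server for the connection (assumed name)."""
    _fake_api = {"wasb_default": "wasb://myaccount@"}
    return _fake_api.get(conn_id)


def preload_remote_logging_conn(conn_id: str) -> None:
    env_key = f"AIRFLOW_CONN_{conn_id.upper()}"
    if env_key in os.environ:
        return  # an explicitly-set env var wins; nothing to do
    # 1. check secrets backends first, 2. fall back to the API server
    uri = lookup_in_secrets_backends(conn_id) or fetch_from_api_server(conn_id)
    if uri is not None:
        # Expose the connection via the env var so the hook can resolve it
        # later, since SUPERVISOR_COMMS is not available in this process yet.
        os.environ[env_key] = uri


preload_remote_logging_conn("wasb_default")
```

The key design point is that this all happens in the supervisor before any hook tries to resolve the connection, so a failing lookup degrades gracefully instead of crashing the supervise job.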

Discovered when digging into #52501 -- it might fix the problem


@ashb
Member Author

ashb commented Jul 24, 2025

This is messier and more complex than I would like -- any ideas on a way of cleaning this up are greatly appreciated.

@ashb ashb added backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch and removed area:ConfigTemplates labels Jul 24, 2025
@potiuk
Member

potiuk commented Jul 24, 2025

This is messier and more complex than I would like -- any ideas on a way of cleaning this up are greatly appreciated.

We should change remote logging templates for loggers to be part of provider packages, as separate files dropped into the same directory where the "generic" Airflow config lives; our logging configuration should then discover the remote logging file dropped there and read it from there. Such a remote logging config, copy-pasteable from the provider's sources (and docs), should have everything the core needs to configure it in a generic way.

That should remove the coupling of airflow core to what is essentially a provider feature and make it a bit less messy.
I am assuming your "messy" comment here is about precisely this: airflow-core needing to contain some provider-specific parts. Maybe there are other "messy" parts that I do not realize :)

Contributor

@amoghrajesh amoghrajesh left a comment


I would approach solving this problem in multiple phases. As a short-/mid-term fix to unblock users who need remote logging to load connections right away, I see this solution as OK.

Over the longer term, I would like to second @potiuk's suggestion here, which is asking every provider to ship its own logging config files, decoupling core from knowing what should be present in a provider's logging config; core would just know how to discover it.

For example, google could define a providers/google/logging/gcs_remote_logging.yaml with details such as:

logging_schemes:
  - scheme: "gs://"
    handler_class: "airflow.providers.google.cloud.log.gcs_task_handler.GCSRemoteLogIO"
    default_connection_id: "google_cloud_default"
    required_config:
      - "logging.remote_base_log_folder"
      - "logging.google_key_path"

and be done with it. Core can then define a discovery mechanism to consume it.

For now, I am ok with this PR.
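To make the discovery idea above concrete, here is a rough, dependency-free sketch under stated assumptions: the file naming pattern (`*_remote_logging.json`), the directory scan, and the scheme-matching logic are all illustrative inventions, and JSON stands in for the YAML above purely to keep the sketch stdlib-only.

```python
import json
import tempfile
from pathlib import Path


def discover_logging_configs(config_dir: Path) -> list:
    """Collect every logging-scheme entry from provider-dropped config files."""
    configs = []
    for path in sorted(config_dir.glob("*_remote_logging.json")):
        configs.extend(json.loads(path.read_text())["logging_schemes"])
    return configs


def resolve_handler(remote_base: str, configs: list):
    """Pick the first provider config whose scheme matches the log folder URL."""
    for cfg in configs:
        if remote_base.startswith(cfg["scheme"]):
            return cfg
    return None


# Demo: pretend the google provider dropped its config file into the shared
# config directory (contents mirror the YAML example above).
with tempfile.TemporaryDirectory() as d:
    Path(d, "gcs_remote_logging.json").write_text(json.dumps({
        "logging_schemes": [{
            "scheme": "gs://",
            "handler_class": "airflow.providers.google.cloud.log.gcs_task_handler.GCSRemoteLogIO",
            "default_connection_id": "google_cloud_default",
        }]
    }))
    cfg = resolve_handler("gs://my-bucket/logs", discover_logging_configs(Path(d)))
    print(cfg["default_connection_id"])
```

With something like this, core never hard-codes any provider's handler class or default connection ID; it only knows where to look and how to match schemes.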

@amoghrajesh
Contributor

@potiuk the suggestions from you are, as you said, more relevant to the process. This PR addresses the fact that the connection to be used still comes from configuration, and even with better discovery etc., nothing changes on that part.

@potiuk
Member

potiuk commented Jul 25, 2025

Yep. I was just responding to @ashb's call for suggestions :) "any ideas on a way of cleaning this up greatly appreciated". Generally speaking the PR looks good to me as well, besides (as Ash mentioned) being messy :)

@potiuk
Member

potiuk commented Jul 25, 2025

And the messiness is because of legacy "embedding" of provider things in core - not because the PR is messy on its own :D

@amoghrajesh amoghrajesh merged commit e4fb686 into apache:main Jul 25, 2025
82 checks passed
@amoghrajesh amoghrajesh deleted the load-remote-logg-conn-from-apiserver branch July 25, 2025 14:26
github-actions bot pushed a commit that referenced this pull request Jul 25, 2025
…he API Server (#53719)

(cherry picked from commit e4fb686)

Co-authored-by: Ash Berlin-Taylor <ash@apache.org>
@github-actions

Backport successfully created: v3-0-test

Status Branch Result
v3-0-test PR Link

ashb added a commit that referenced this pull request Jul 29, 2025
…he API Server (#53719)

(cherry picked from commit e4fb686)

Co-authored-by: Ash Berlin-Taylor <ash@apache.org>
ashb added a commit that referenced this pull request Jul 29, 2025
…he API Server (#53719) (#53761)

(cherry picked from commit e4fb686)

Co-authored-by: Ash Berlin-Taylor <ash@apache.org>
@potiuk potiuk linked an issue Aug 7, 2025 that may be closed by this pull request
2 tasks
ferruzzi pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Aug 7, 2025
apache#53719)

fweilun pushed a commit to fweilun/airflow that referenced this pull request Aug 11, 2025
apache#53719)

@ashb ashb added this to the Airflow 3.0.4 milestone Aug 14, 2025

Labels

area:logging area:task-sdk backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remote Logging to Azure Blob Storage Broke in Airflow 3.0

3 participants