Releases: apache/airflow
Apache Airflow 2.10.3
Significant Changes
No significant changes.
Bug Fixes
- Improves the handling of value masking when setting Airflow variables for enhanced security. (#43123) (#43278)
- Adds support for task_instance_mutation_hook to handle mapped operators with index 0. (#42661) (#43089)
- Fixes executor cleanup to properly handle zombie tasks when task instances are terminated. (#43065)
- Adds retry logic for HTTP 502 and 504 errors in internal API calls to handle webserver startup issues. (#42994) (#43044)
- Restores the use of separate sessions for writing and deleting RTIF data to prevent StaleDataError. (#42928) (#43012)
- Fixes PythonOperator error by replacing hyphens with underscores in DAG names. (#42993)
- Improving validation of task retries to handle None values (#42532) (#42915)
- Fixes error handling in dataset managers when resolving dataset aliases into new datasets (#42733)
- Enables clicking on task names in the DAG Graph View to correctly select the corresponding task. (#38782) (#42697)
- Prevent redirect loop on /home with tags/last run filters (#42607) (#42609) (#42628)
- Support of host.name in OTEL metrics and usage of OTEL_RESOURCE_ATTRIBUTES in metrics (#42428) (#42604)
- Reduce eyestrain in dark mode with reduced contrast and saturation (#42567) (#42583)
- Handle ENTER key correctly in trigger form and allow manual JSON (#42525) (#42535)
- Ensure DAG trigger form submits with updated parameters upon keyboard submit (#42487) (#42499)
- Do not attempt to provide not stringified objects to UI via xcom if pickling is active (#42388) (#42486)
- Fix the span link of task instance to point to the correct span in the scheduler_job_loop (#42430) (#42480)
- Bugfix task execution from runner in Windows (#42426) (#42478)
- Allows overriding the hardcoded OTEL_SERVICE_NAME with an environment variable (#42242) (#42441)
- Improves trigger performance by using selectinload instead of joinedload (#40487) (#42351)
- Suppress warnings when masking sensitive configs (#43335) (#43337)
- Masking configuration values irrelevant to DAG author (#43040) (#43336)
- Execute templated bash script as file in BashOperator (#43191)
- Fixes schedule_downstream_tasks to include upstream tasks for one_success trigger rule (#42582) (#43299)
- Add retry logic in the scheduler for updating trigger timeouts in case of deadlocks. (#41429) (#42651)
- Mark all tasks as skipped when failing a dag_run manually (#43572)
- Fix TrySelector for Mapped Tasks in Logs and Details Grid Panel (#43566)
- Conditionally add OTEL events when processing executor events (#43558) (#43567)
- Fix broken stat scheduler_loop_duration (#42886) (#43544)
- Ensure total_entries in /api/v1/dags (#43377) (#43429)
- Include limit and offset in request body schema for List task instances (batch) endpoint (#43479)
- Don't raise a warning in ExecutorSafeguard when execute is called from an extended operator (#42849) (#43577)
Miscellaneous
- Deprecate session auth backend (#42911)
- Removed unicodecsv dependency for providers with Airflow version 2.8.0 and above (#42765) (#42970)
- Remove the referrer from Webserver to Scarf (#42901) (#42942)
- Bump dompurify from 2.2.9 to 2.5.6 in /airflow/www (#42263) (#42270)
- Correct docstring format in _get_template_context (#42244) (#42272)
- Backport: Bump Flask-AppBuilder to 4.5.2 (#43309) (#43318)
- Check python version that was used to install pre-commit venvs (#43282) (#43310)
- Resolve warning in Dataset Alias migration (#43425)
Doc Only Changes
- Clarifying PLUGINS_FOLDER permissions by DAG authors (#43022) (#43029)
- Add templating info to TaskFlow tutorial (#42992)
- Airflow local settings no longer importable from dags folder (#42231) (#42603)
- Fix documentation for cpu and memory usage (#42147) (#42256)
- Fix instruction for docker compose (#43119) (#43321)
- Updates documentation to reflect that dag_warnings is returned instead of import_errors. (#42858) (#42888)
Apache Airflow 2.10.2
Significant Changes
No significant changes.
Bug Fixes
- Revert "Fix: DAGs are not marked as stale if the dags folder change" (#42220, #42217)
- Add missing open telemetry span and correct scheduled slots documentation (#41985)
- Fix require_confirmation_dag_change (#42063) (#42211)
- Only treat null/undefined as falsy when rendering XComEntry (#42199) (#42213)
- Add extra and renderedTemplates as keys to skip camelCasing (#42206) (#42208)
- Do not camelcase xcom entries (#42182) (#42187)
- Fix task_instance and dag_run links from list views (#42138) (#42143)
- Support multi-line input for Params of type string in trigger UI form (#40414) (#42139)
- Fix details tab log url detection (#42104) (#42114)
- Add new type of exception to catch timeout (#42064) (#42078)
- Rewrite how DAG to dataset / dataset alias are stored (#41987) (#42055)
- Allow dataset alias to add more than one dataset events (#42189) (#42247)
Miscellaneous
- Limit universal-pathlib below 0.2.4 as it breaks our integration (#42101)
- Auto-fix default deferrable with LibCST (#42089)
- Deprecate --tree flag for tasks list cli command (#41965)
Doc Only Changes
Apache Airflow 2.10.1
Significant Changes
No significant changes.
Bug Fixes
- Handle Example dags case when checking for missing files (#41874)
- Fix logout link in "no roles" error page (#41845)
- Set end_date and duration for triggers completed with end_from_trigger as True. (#41834)
- DAGs are not marked as stale if the dags folder change (#41829)
- Fix compatibility with FAB provider versions <1.3.0 (#41809)
- Don't Fail LocalTaskJob on heartbeat (#41810)
- Remove deprecation warning for cgitb in Plugins Manager (#41793)
- Fix log for notifier(instance) without name (#41699)
- Splitting syspath preparation into stages (#41694)
- Adding url sanitization for extra links (#41680)
- Fix InletEventsAccessors type stub (#41607)
- Fix UI rendering when XCom is INT, FLOAT, BOOL or NULL (#41605)
- Fix try selector refresh (#41503)
- Incorrect try number subtraction producing invalid span id for OTEL airflow (#41535)
- Add WebEncoder for trigger page rendering to avoid render failure (#41485)
- Adding tojson filter to example_inlet_event_extra example dag (#41890)
- Add backward compatibility check for executors that don't inherit BaseExecutor (#41927)
Miscellaneous
- Bump webpack from 5.76.0 to 5.94.0 in /airflow/www (#41879)
- Adding rel property to hyperlinks in logs (#41783)
- Field Deletion Warning when editing Connections (#41504)
- Make Scarf usage reporting in major+minor versions and counters in buckets (#41900)
- Lower down universal-pathlib minimum to 0.2.2 (#41943)
- Protect against None components of universal pathlib xcom backend (#41938)
Doc Only Changes
Apache Airflow 2.10.0
Significant Changes
Datasets no longer trigger inactive DAGs (#38891)
Previously, when a DAG was paused or removed, incoming dataset events would still
trigger it, and the DAG would run when it was unpaused or added back in a DAG
file. This has been changed; a DAG's dataset schedule can now only be satisfied
by events that occur when the DAG is active. While this is a breaking change,
the previous behavior is considered a bug.
The behavior of time-based scheduling is unchanged, including the timetable part
of DatasetOrTimeSchedule.
try_number is no longer incremented during task execution (#39336)
Previously, the try number (try_number) was incremented at the beginning of task execution on the worker. This was problematic for many reasons.
For one, it meant that the try number was incremented when it was not supposed to be, namely when resuming from reschedule or deferral. It also resulted in
the try number being "wrong" when the task had not yet started. The workarounds for these two issues caused a lot of confusion.
Now, instead, the try number for a task run is determined at the time the task is scheduled, and does not change in flight, and it is never decremented.
So after the task runs, the observed try number remains the same as it was when the task was running; only when there is a "new try" will the try number be incremented again.
One consequence of this change is that if users were "manually" running tasks (e.g. by calling ti.run() directly, or with the command line airflow tasks run), the
try number will no longer be incremented. Airflow assumes that tasks are always run after being scheduled by the scheduler, so we do not regard this as a breaking change.
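The new semantics can be illustrated with a small stand-alone simulation. This is only a sketch and does not use Airflow itself: FakeTaskInstance and its methods are hypothetical names chosen for illustration. It shows that the try number is fixed when a try is scheduled and is not changed by running or by resuming from deferral.

```python
from dataclasses import dataclass


@dataclass
class FakeTaskInstance:
    """Hypothetical stand-in for a task instance; not an Airflow class."""

    try_number: int = 0

    def schedule_new_try(self) -> None:
        # Scheduling a new try is the only event that increments the number.
        self.try_number += 1

    def resume_after_deferral(self) -> None:
        # Resuming from deferral or reschedule continues the same try.
        pass

    def run(self) -> int:
        # Running only reports the try number; it never changes it.
        return self.try_number


ti = FakeTaskInstance()
ti.schedule_new_try()       # the first try is scheduled
during = ti.run()           # observed while the task runs
ti.resume_after_deferral()  # same try continues after a deferral
after = ti.run()            # unchanged after resuming
ti.schedule_new_try()       # a retry is scheduled
retry = ti.run()            # only now does the number go up
```

In this sketch, calling run() without a preceding schedule_new_try() leaves the number untouched, which mirrors the "manual run" consequence described above.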
/logout endpoint in FAB Auth Manager is now CSRF protected (#40145)
The /logout endpoint's method in FAB Auth Manager has been changed from GET to POST in all existing
AuthViews (AuthDBView, AuthLDAPView, AuthOAuthView, AuthOIDView, AuthRemoteUserView), and
now includes CSRF protection to enhance security and prevent unauthorized logouts.
OpenTelemetry Traces for Apache Airflow (#37948).
This new feature adds the capability for Apache Airflow to emit 1) Airflow system traces of the scheduler,
triggerer, executor and processor, and 2) DAG run traces for deployed DAG runs, in OpenTelemetry format. Previously, only metrics could be emitted in OpenTelemetry format.
This new feature gives users richer data and lets them use the OpenTelemetry standard to emit and send their trace data to OTLP compatible endpoints.
Decorator for Task Flow (@skip_if, @run_if) to make it simple to apply whether or not to skip a Task. (#41116)
This feature adds a decorator to make it simple to skip a Task.
Using Multiple Executors Concurrently (#40701)
Previously known as hybrid executors, this new feature allows Airflow to use multiple executors concurrently. DAGs, or even individual tasks, can be configured
to use the specific executor that best suits their needs. A single DAG can contain tasks all using different executors. Please see the Airflow documentation for
more details. Note: This feature is still experimental. See documentation on Executor <https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/executor/index.html#using-multiple-executors-concurrently> for a more detailed description.
Scarf based telemetry: Does Airflow collect any telemetry data? (#39510)
Airflow integrates Scarf to collect basic usage data during operation. Deployments can opt out of data collection by setting the [usage_data_collection]enabled option to False, or the SCARF_ANALYTICS=false environment variable.
See FAQ on this <https://airflow.apache.org/docs/apache-airflow/stable/faq.html#does-airflow-collect-any-telemetry-data> for more information.
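As a minimal sketch of the opt-out, the snippet below builds an environment for an Airflow process with both switches set. The helper name telemetry_opt_out_env is hypothetical. AIRFLOW__USAGE_DATA_COLLECTION__ENABLED is an assumption based on Airflow's general AIRFLOW__{SECTION}__{KEY} convention for overriding config options through the environment; either switch on its own should be enough according to the text above.

```python
import os
from typing import Mapping, Optional


def telemetry_opt_out_env(base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with usage data collection disabled.

    Hypothetical helper for illustration; it does not start Airflow.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Environment variable named in the release note.
    env["SCARF_ANALYTICS"] = "false"
    # Assumed env-var form of the [usage_data_collection]enabled option.
    env["AIRFLOW__USAGE_DATA_COLLECTION__ENABLED"] = "False"
    return env


env = telemetry_opt_out_env({"PATH": "/usr/bin"})
```

The resulting mapping could be passed as the env argument of subprocess.run when launching Airflow components.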
New Features
- AIP-61 Hybrid Execution (AIP-61)
- AIP-62 Getting Lineage from Hook Instrumentation (AIP-62)
- AIP-64 TaskInstance Try History (AIP-64)
- AIP-44 Internal API (AIP-44)
- Enable ending the task directly from the triggerer without going into the worker. (#40084)
- Extend dataset dependencies (#40868)
- Feature/add token authentication to internal api (#40899)
- Add DatasetAlias to support dynamic Dataset Event Emission and Dataset Creation (#40478)
- Add example DAGs for inlet_events (#39893)
- Implement accessors to read dataset events defined as inlet (#39367)
- Decorator for Task Flow, to make it simple to apply whether or not to skip a Task. (#41116)
- Add start execution from triggerer support to dynamic task mapping (#39912)
- Add try_number to log table (#40739)
- Added ds_format_locale method in macros which allows localizing datetime formatting using Babel (#40746)
- Add DatasetAlias to support dynamic Dataset Event Emission and Dataset Creation (#40478, #40723, #40809, #41264, #40830, #40693, #41302)
- Use sentinel to mark dag as removed on re-serialization (#39825)
- Add parameter for the last number of queries to the DB in DAG file processing stats (#40323)
- Add prototype version dark mode for Airflow UI (#39355)
- Add ability to mark some tasks as successful in dag test (#40010)
- Allow use of callable for template_fields (#37028)
- Filter running/failed and active/paused dags on the home page (#39701)
- Add metrics about task CPU and memory usage (#39650)
- UI changes for DAG Re-parsing feature (#39636)
- Add Scarf based telemetry (#39510, #41318)
- Add dag re-parsing request endpoint (#39138)
- Redirect to new DAGRun after trigger from Grid view (#39569)
- Display endDate in task instance tooltip. (#39547)
- Implement accessors to read dataset events defined as inlet (#39367, #39893)
- Add color to log lines in UI for error and warnings based on keywords (#39006)
- Add Rendered k8s pod spec tab to ti details view (#39141)
- Make audit log before/after filterable (#39120)
- Consolidate grid collapse actions to a single full screen toggle (#39070)
- Implement Metadata to emit runtime extra (#38650)
- Add executor field to the DB and parameter to the operators (#38474)
- Implement context accessor for DatasetEvent extra (#38481)
- Add dataset event info to dag graph (#41012)
- Add button to toggle datasets on/off in dag graph (#41200)
- Add run_if & skip_if decorators (#41116)
- Add dag_stats rest api endpoint (#41017)
- Add listeners for Dag import errors (#39739)
- Allowing DateTimeSensorAsync, FileSensor and TimeSensorAsync to start execution from trigger during dynamic task mapping (#41182)
Improvements
- Allow set Dag Run resource into Dag Level permission: extends Dag's access_control feature to allow Dag Run resource permissions. (#40703)
- Improve security and error handling for the internal API (#40999)
- Datasets UI Improvements (#40871)
- Change DAG Audit log tab to Event Log (#40967)
- Make standalone dag file processor work in DB isolation mode (#40916)
- Show only the source on the consumer DAG page and only triggered DAG run in the producer DAG page (#41300)
- Update metrics names to allow multiple executors to report metrics (#40778)
- Format DAG run count (#39684)
- Update styles for renderedjson component (#40964)
- Improve ATTRIBUTE_REMOVED sentinel to use class and more context (#40920)
- Make XCom display as react json (#40640)
- Replace usages of task context logger with the log table (#40867)
- Rollback for all retry exceptions (#40882) (#40883)
- Support rendering ObjectStoragePath value (#40638)
- Add try_number and map_index as params for log event endpoint (#40845)
- Rotate fernet key in batches to limit memory usage (#40786)
- Add gauge metric for 'last_num_of_db_queries' parameter (#40833)
- Set parallelism log messages to warning level for better visibility (#39298)
- Add error handling for encoding the dag runs (#40222)
- Use params instead of dag_run.conf in example DAG (#40759)
- Load Example Plugins with Example DAGs (#39999)
- Stop deferring TimeDeltaSensorAsync task when the target_dttm is in the past (#40719)
- Send important executor logs to task logs (#40468)
- Open external links in new tabs (#40635)
- Attempt to add ReactJSON view to rendered templates (#40639)
- Speeding up regex match time for custom warnings (#40513)
- Refactor DAG.dataset_triggers into the timetable class (#39321)
- add next_kwargs to StartTriggerArgs (#40376)
- Improve UI error handling (#40350)
- Remove double warning in CLI when config value is deprecated (#40319)
- Implement XComArg concat() (#40172)
- Added get_extra_dejson method with nested parameter which allows you to specify if you want the nested json as string to be also deserialized (#39811)
- Add executor field to the task instance API (#40034)
- Support checking for db path absoluteness on Windows (#40069)
- Introduce StartTriggerArgs and prevent start trigger initialization in scheduler (#39585)
- Add task documentation to details tab in grid view (#39899)
- Allow executors to be specified with only the class name of the Executor (#40131)
- Remove obsolete conditional logic related to try_number (#40104)
- Allow Task Group Ids to be passed as branches in BranchMixIn (#38883)
- Javascript connection form will apply CodeMirror to all textarea's dynamically (#39812)
- Determine needs_expansion at time of serialization (#39604)
- Add indexes on dag_id column in referencing tables to speed up deletion of dag records (#39638)
- ...
Apache Airflow Helm Chart 1.15.0
Significant Changes
Default Airflow image is updated to 2.9.3 (#40816)
The default Airflow image that is used with the Chart is now 2.9.3, previously it was 2.9.2.
Default PgBouncer Exporter image has been updated (#40318)
The PgBouncer Exporter image has been updated to airflow-pgbouncer-exporter-2024.06.18-0.17.0, which addresses CVE-2024-24786.
New Features
- Add git-sync container lifecycle hooks (#40369)
- Add init containers for jobs (#40454)
- Add persistent volume claim retention policy (#40271)
- Add annotations for Redis StatefulSet (#40281)
- Add dags.gitSync.sshKey, which allows the git-sync private key to be configured in the values file directly (#39936)
- Add extraEnvFrom to git-sync containers (#39031)
Improvements
- Link in UIAlert to production guide when a dynamic webserver secret is used now opens in a new tab (#40635)
- Support disabling helm hooks on extraConfigMaps and extraSecrets (#40294)
Bug Fixes
- Add git-sync ssh secret to DAG processor (#40691)
- Fix duplicated safeToEvict annotations (#40554)
- Add missing triggerer.keda.usePgbouncer to values.yaml (#40614)
- Trim leading // character using mysql backend (#40401)
Doc only changes
- Updating chart download link to use the Apache download CDN (#40618)
Misc
Apache Airflow 2.9.3
Significant Changes
Time unit for scheduled_duration and queued_duration changed (#37936)
scheduled_duration and queued_duration metrics are now emitted in milliseconds instead of seconds.
By convention, all statsd metrics should be emitted in milliseconds; downstream tools such as the Prometheus
statsd-exporter expect this.
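The effect on dashboards and alerts can be sketched with a short calculation. The function emitted_value below is a hypothetical illustration, not Airflow's metrics code: it shows the value that would be reported for the same duration under the old (seconds) and new (milliseconds) unit.

```python
from datetime import timedelta


def emitted_value(duration: timedelta, in_milliseconds: bool) -> float:
    """Return the number a timing metric would carry for this duration.

    Hypothetical helper for illustration only.
    """
    seconds = duration.total_seconds()
    return seconds * 1000.0 if in_milliseconds else seconds


queued = timedelta(seconds=2, milliseconds=500)
old_value = emitted_value(queued, in_milliseconds=False)  # unit before the change
new_value = emitted_value(queued, in_milliseconds=True)   # unit after the change
```

Any threshold that was written against the old unit would need to be multiplied by 1000 to keep its meaning.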
Support for OpenTelemetry Metrics is no longer "Experimental" (#40286)
Experimental support for OpenTelemetry was added in 2.7.0. Since then, fixes and improvements have been added, and we now announce the feature as stable.
Bug Fixes
- Fix calendar view scroll (#40458)
- Validating provider description for urls in provider list view (#40475)
- Fix compatibility with old MySQL 8.0 (#40314)
- Fix dag (un)pausing won't work on environment where dag files are missing (#40345)
- Extra being passed to SQLalchemy (#40391)
- Handle unsupported operand int + str when value of tag is int (job_id) (#40407)
- Fix TriggeredDagRunOperator triggered link (#40336)
- Add [webserver]update_fab_perms to deprecated configs (#40317)
- Swap dag run link from legacy graph to grid with graph tab (#40241)
- Change httpx to requests in file_task_handler (#39799)
- Fix import future annotations in venv jinja template (#40208)
- Ensures DAG params order regardless of backend (#40156)
- Use a join for TI notes in TI batch API endpoint (#40028)
- Improve trigger UI for string array format validation (#39993)
- Disable jinja2 rendering for doc_md (#40522)
- Skip checking sub dags list if taskinstance state is skipped (#40578)
- Recognize quotes when parsing urls in logs (#40508)
Doc Only Changes
- Add notes about passing secrets via environment variables (#40519)
- Revamp some confusing log messages (#40334)
- Add more precise description of masking sensitive field names (#40512)
- Add slightly more detailed guidance about upgrading to the docs (#40227)
- Metrics allow_list complete example (#40120)
- Add warning to deprecated api docs that access control isn't applied (#40129)
- Simpler command to check local scheduler is alive (#40074)
- Add a note and an example clarifying the usage of DAG-level params (#40541)
- Fix highlight of example code in dags.rst (#40114)
- Add warning about the PostgresOperator being deprecated (#40662)
- Updating airflow download links to CDN based links (#40618)
- Fix import statement for DatasetOrTimetable example (#40601)
- Further clarify triage process (#40536)
- Fix param order in PythonOperator docstring (#40122)
- Update serializers.rst to mention that bytes are not supported (#40597)
Miscellaneous
- Upgrade build installers and dependencies (#40177)
- Bump braces from 3.0.2 to 3.0.3 in /airflow/www (#40180)
- Upgrade to another version of trove-classifier (new CUDA classifiers) (#40564)
- Rename "try_number" increments that are unrelated to the airflow concept (#39317)
- Update trove classifiers to the latest version as build dependency (#40542)
- Upgrade to latest version of hatchling as build dependency (#40387)
- Fix bug in SchedulerJobRunner._process_executor_events (#40563)
- Remove logging for "blocked" events (#40446)
Apache Airflow Helm Chart 1.14.0
Significant Changes
ClusterRole and ClusterRoleBinding names have been updated to be unique (#37197)
ClusterRoles and ClusterRoleBindings created when multiNamespaceMode is enabled have been renamed to ensure unique names:
- {{ include "airflow.fullname" . }}-pod-launcher-role has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-pod-launcher-role
- {{ include "airflow.fullname" . }}-pod-launcher-rolebinding has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-pod-launcher-rolebinding
- {{ include "airflow.fullname" . }}-pod-log-reader-role has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-pod-log-reader-role
- {{ include "airflow.fullname" . }}-pod-log-reader-rolebinding has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-pod-log-reader-rolebinding
- {{ include "airflow.fullname" . }}-scc-rolebinding has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-scc-rolebinding
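When checking a cluster for resources left over under the old names, it can help to compute the old and new names side by side. The following is a minimal sketch of the renaming rule listed above; renamed_cluster_resources is a hypothetical helper, and the fullname argument stands for whatever the airflow.fullname template renders to for a given release.

```python
# Name suffixes taken from the renamed resources listed in the release note.
SUFFIXES = [
    "pod-launcher-role",
    "pod-launcher-rolebinding",
    "pod-log-reader-role",
    "pod-log-reader-rolebinding",
    "scc-rolebinding",
]


def renamed_cluster_resources(namespace: str, fullname: str) -> dict:
    """Map each old resource name to its new, namespace-prefixed name.

    Hypothetical helper; `fullname` is the rendered airflow.fullname value.
    """
    return {
        f"{fullname}-{suffix}": f"{namespace}-{fullname}-{suffix}"
        for suffix in SUFFIXES
    }


# Example values are made up for illustration.
mapping = renamed_cluster_resources("data-eng", "my-airflow")
```

Prefixing with the release namespace is what makes the cluster-scoped names unique when the same release name is installed in several namespaces.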
workers.safeToEvict default changed to False (#40229)
The default for workers.safeToEvict is now False. This is a safer default
as it prevents the nodes workers are running on from being scaled down by the
K8s Cluster Autoscaler <https://kubernetes.io/docs/concepts/cluster-administration/cluster-autoscaling/#cluster-autoscaler>.
If you would like to retain the previous behavior, you can set this config to True.
Default Airflow image is updated to 2.9.2 (#40160)
The default Airflow image that is used with the Chart is now 2.9.2, previously it was 2.8.3.
Default StatsD image is updated to v0.26.1 (#38416)
The default StatsD image that is used with the Chart is now v0.26.1, previously it was v0.26.0.
New Features
Improvements
- Allow valueFrom in env config of components (#40135)
- Enable templating in extraContainers and extraInitContainers (#38507)
- Add safe-to-evict annotation to pod-template-file (#37352)
- Support workers.command for KubernetesExecutor (#39132)
- Add priorityClassName to Jobs (#39133)
- Add Kerberos sidecar to pod-template-file (#38815)
- Add templated field support for extra containers (#38510)
Bug Fixes
- Set workers.safeToEvict default to False (#40229)
Doc only changes
- Document extraContainers and extraInitContainers that are templated (#40033)
- Fix typo in HorizontalPodAutoscaling documentation (#39307)
- Fix supported k8s versions in docs (#39172)
- Fix typo in YAML path for brokerUrlSecretName (#39115)
Misc
Apache Airflow 2.9.2
Significant Changes
No significant changes.
Bug Fixes
- Fix bug that makes AirflowSecurityManagerV2 leave transactions in the idle in transaction state (#39935)
- Fix alembic auto-generation and rename mismatching constraints (#39032)
- Add the existing_nullable to the downgrade side of the migration (#39374)
- Fix Mark Instance state buttons stay disabled if user lacks permission (#37451). (#38732)
- Use SKIP LOCKED instead of NOWAIT in mini scheduler (#39745)
- Remove DAG Run Add option from FAB view (#39881)
- Add max_consecutive_failed_dag_runs in API spec (#39830)
- Fix example_branch_operator failing in python 3.12 (#39783)
- Fetch served logs also when task attempt is up for retry and no remote logs available (#39496)
- Change dataset URI validation to raise warning instead of error in Airflow 2.9 (#39670)
- Visible DAG RUN doesn't point to the same dag run id (#38365)
- Refactor SafeDogStatsdLogger to use get_validator to enable pattern matching (#39370)
- Fix custom actions in security manager has_access (#39421)
- Fix HTTP 500 Internal Server Error if DAG is triggered with bad params (#39409)
- Fix static file caching is disabled in Airflow Webserver. (#39345)
- Fix TaskHandlerWithCustomFormatter now adds prefix only once (#38502)
- Do not provide deprecated execution_date in @apply_lineage (#39327)
- Add missing conn_id to string representation of ObjectStoragePath (#39313)
- Fix sql_alchemy_engine_args config example (#38971)
- Add Cache-Control "no-store" to all dynamically generated content (#39550)
Miscellaneous
- Limit yandex provider to avoid mypy errors (#39990)
- Warn on mini scheduler failures instead of debug (#39760)
- Change type definition for provider_info_cache decorator (#39750)
- Better typing for BaseOperator defer (#39742)
- More typing in TimeSensor and TimeSensorAsync (#39696)
- Re-raise exception from strict dataset URI checks (#39719)
- Fix stacklevel for _log_state helper (#39596)
- Resolve SA warnings in migrations scripts (#39418)
- Remove unused index idx_last_scheduling_decision on dag_run table (#39275)
Doc Only Changes
- Provide extra tip on labeling DynamicTaskMapping (#39977)
- Improve visibility of links / variables / other configs in Configuration Reference (#39916)
- Remove 'legacy' definition for CronDataIntervalTimetable (#39780)
- Update plugins.rst examples to use pyproject.toml over setup.py (#39665)
- Fix nit in pg set-up doc (#39628)
- Add Matomo to Tracking User Activity docs (#39611)
- Fix Connection.get -> Connection.get_connection_from_secrets (#39560)
- Adding note for provider dependencies (#39512)
- Update docker-compose command (#39504)
- Update note about restarting triggerer process (#39436)
- Updating S3LogLink with an invalid bucket link (#39424)
- Update testing_packages.rst (#38996)
- Add multi-team diagrams (#38861)
Apache Airflow 2.9.1
Significant Changes
Stackdriver logging bugfix requires Google provider 10.17.0 or later (#38071)
If you use Stackdriver logging, you must use Google provider version 10.17.0 or later. Airflow 2.9.1 now passes gcp_log_name to the StackdriverTaskHandler instead of name, and this will fail on earlier provider versions.
This fixes a bug where the log name configured in [logging] remote_base_log_folder was overridden when Airflow configured logging, resulting in task logs going to the wrong destination.
Bug Fixes
- Make task log messages include run_id (#39280)
- Copy menu_item href for nav bar (#39282)
- Fix trigger kwarg encryption migration (#39246, #39361, #39374)
- Add workaround for datetime-local input in firefox (#39261)
- Add Grid button to Task Instance view (#39223)
- Get served logs when remote or executor logs not available for non-running task try (#39177)
- Fixed side effect of menu filtering causing disappearing menus (#39229)
- Use grid view for Task Instance's log_url (#39183)
- Improve task filtering UX (#39119)
- Improve rendered_template ux in react dag page (#39122)
- Graph view improvements (#38940)
- Check that the dataset<>task exists before trying to render graph (#39069)
- Hostname was "redacted", not "redact"; remove it when there is no context (#39037)
- Check whether AUTH_ROLE_PUBLIC is set in check_authentication (#39012)
- Move rendering of map_index_template so it renders for failed tasks as long as it was defined before the point of failure (#38902)
- Undeprecate BaseXCom.get_one method for now (#38991)
- Add inherit_cache attribute for CreateTableAs custom SA Clause (#38985)
- Don't wait for DagRun lock in mini scheduler (#38914)
- Fix calendar view with no DAG Run (#38964)
- Changed the background color of external task in graph (#38969)
- Fix dag run selection (#38941)
- Fix SAWarning 'Coercing Subquery object into a select() for use in IN()' (#38926)
- Fix implicit cartesian product in AirflowSecurityManagerV2 (#38913)
- Fix problem that links in legacy log view can not be clicked (#38882)
- Fix dag run link params (#38873)
- Use async db calls in WorkflowTrigger (#38689)
- Fix audit log events filter (#38719)
- Use methodtools.lru_cache instead of functools.lru_cache in class methods (#37757)
- Raise deprecated warning in airflow dags backfill only if -I / --ignore-first-depends-on-past provided (#38676)
Miscellaneous
- TriggerDagRunOperator deprecate execution_date in favor of logical_date (#39285)
- Force to use Airflow Deprecation warnings categories on @deprecated decorator (#39205)
- Add warning about run/import Airflow under the Windows (#39196)
- Update is_authorized_custom_view from auth manager to handle custom actions (#39167)
- Add in Trove classifiers Python 3.12 support (#39004)
- Use debug level for minischeduler skip (#38976)
- Bump undici from 5.28.3 to 5.28.4 in /airflow/www (#38751)
Doc Only Changes
- Fix supported k8s version in docs (#39172)
- Dynamic task mapping PythonOperator op_kwargs (#39242)
- Add link to user and role commands (#39224)
- Add k8s 1.29 to supported version in docs (#39168)
- Data aware scheduling docs edits (#38687)
- Update DagBag class docstring to include all params (#38814)
- Correcting an example taskflow example (#39015)
- Remove decorator from rendering fields example (#38827)
Apache Airflow 2.9.0
Significant Changes
The following Listener API methods are considered stable and can be used in production systems (they were an experimental feature in older Airflow versions) (#36376):
Lifecycle events:
- on_starting
- before_stopping
DagRun State Change Events:
- on_dag_run_running
- on_dag_run_success
- on_dag_run_failed
TaskInstance State Change Events:
- on_task_instance_running
- on_task_instance_success
- on_task_instance_failed
Support for Microsoft SQL-Server for Airflow Meta Database has been removed (#36514)
After discussion <https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4>
and a voting process <https://lists.apache.org/thread/pgcgmhf6560k8jbsmz8nlyoxosvltph2>,
the Airflow's PMC and Committers have reached a resolution to no longer maintain MsSQL as a supported Database Backend.
As of Airflow 2.9.0 support of MsSQL has been removed for Airflow Database Backend.
A migration script which can help migrating the database before upgrading to Airflow 2.9.0 is available in
airflow-mssql-migration repo on Github <https://github.com/apache/airflow-mssql-migration>.
Note that the migration script is provided without support and warranty.
This does not affect the existing provider packages (operators and hooks), DAGs can still access and process data from MsSQL.
Dataset URIs are now validated on input (#37005)
Datasets must use a URI that conforms to rules laid down in AIP-60, and the value
will be automatically normalized when the DAG file is parsed. See
documentation on Datasets <https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/datasets.html> for
a more detailed description on the rules.
You may need to change your Dataset identifiers if they look like a URI, but are
used in a less mainstream way, such as relying on the URI's auth section, or
have a case-sensitive protocol name.
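To find identifiers that may need attention, a rough pre-check can be written with the standard library. This is only a sketch of the two cases named above (an auth section, or a protocol name that is not all lowercase); it is not Airflow's validation logic and does not implement the AIP-60 rules. The function name uri_review_reasons is hypothetical.

```python
from urllib.parse import urlsplit


def uri_review_reasons(uri: str) -> list:
    """Return reasons why a dataset identifier may deserve a manual review.

    Hypothetical helper covering only the two cases named in the note.
    """
    reasons = []
    parts = urlsplit(uri)
    if not parts.scheme:
        # No scheme: the identifier does not look like a URI at all.
        return reasons
    # urlsplit lowercases the scheme, so compare against the raw text.
    raw_scheme = uri.split(":", 1)[0]
    if raw_scheme != raw_scheme.lower():
        reasons.append("scheme is not lowercase")
    if parts.username is not None or parts.password is not None:
        reasons.append("URI has an auth section")
    return reasons


# Example identifiers are made up for illustration.
clean = uri_review_reasons("s3://bucket/key")
flagged = uri_review_reasons("S3://user:pw@bucket/key")
plain = uri_review_reasons("plain_name")
```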
The method get_permitted_menu_items in BaseAuthManager has been renamed filter_permitted_menu_items (#37627)
Add REST API actions to Audit Log events (#37734)
The Audit Log event name for REST API events will be prepended with api. or ui., depending on whether it came from the Airflow UI or externally.
Official support for Python 3.12 (#38025)
There are a few caveats though:
- Pendulum2 does not support Python 3.12. For Python 3.12 you need to use Pendulum 3 <https://pendulum.eustace.io/blog/announcing-pendulum-3-0-0.html>
- Minimum SQLAlchemy version supported when Pandas is installed for Python 3.12 is 1.4.36 released in April 2022. Airflow 2.9.0 increases the minimum supported version of SQLAlchemy to 1.4.36 for all Python versions.
Not all Providers support Python 3.12. At the initial release of Airflow 2.9.0 the following providers
are released without support for Python 3.12:
- apache.beam - pending on Apache Beam support for 3.12 <https://github.com/apache/beam/issues/29149>
- papermill - pending on Releasing Python 3.12 compatible papermill client version including this merged issue <https://github.com/nteract/papermill/pull/771>
Prevent large string objects from being stored in the Rendered Template Fields (#38094)
There's now a limit to the length of data that can be stored in the Rendered Template Fields.
The limit is set to 4096 characters. If the data exceeds this limit, it will be truncated. You can change this limit
by setting the [core] max_template_field_length configuration option in your airflow config.
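Besides editing the config file, Airflow options can generally be overridden through environment variables named AIRFLOW__{SECTION}__{KEY}. The sketch below only builds that variable name for the option above; the helper name airflow_env_var and the value 8192 are illustrative assumptions, not recommendations.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the AIRFLOW__{SECTION}__{KEY} environment variable name.

    Hypothetical helper, not part of Airflow.
    """
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


name = airflow_env_var("core", "max_template_field_length")
# 8192 is an arbitrary example value.
print(f"export {name}=8192")
```

The variable has to be set in the environment of the Airflow processes before they start for the override to take effect.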
Change xcom table column value type to longblob for MySQL backend (#38401)
The Xcom table column value type has changed from blob to longblob. This allows you to store relatively big data in Xcom, but the process can take a significant amount of time if you have a lot of large data stored in Xcom.
To downgrade from revision b4078ac230a1, ensure that you don't have Xcom values larger than 65,535 bytes. Otherwise, you'll need to clean those rows or run airflow db clean xcom to clean the Xcom table.
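The 65,535-byte threshold for a downgrade can be expressed as a simple size check. This sketch works on an in-memory mapping of serialized values; the function name oversized_xcom_keys and the sample data are hypothetical, and a real check would have to read the values from the xcom table of your metadata database.

```python
# Threshold quoted in the downgrade note above.
BLOB_MAX_BYTES = 65_535


def oversized_xcom_keys(values: dict[str, bytes], limit: int = BLOB_MAX_BYTES) -> list[str]:
    """Return the keys whose serialized value is larger than `limit` bytes."""
    return sorted(key for key, value in values.items() if len(value) > limit)


# Made-up sample: one value exactly at the limit, one a single byte over it.
sample = {
    "fits": b"x" * 65_535,
    "too_big": b"x" * 65_536,
}
print(oversized_xcom_keys(sample))
```

Any key reported here corresponds to a row that would need to be cleaned before downgrading.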
New Features
- Allow users to write dag_id and task_id in their national characters, added display name for dag / task (v2) (#38446)
- Prevent large objects from being stored in the RTIF (#38094)
- Use current time to calculate duration when end date is not present. (#38375)
- Add average duration mark line in task and dagrun duration charts. (#38214, #38434)
- Add button to manually create dataset events (#38305)
- Add Matomo as an option for analytics_tool. (#38221)
- Experimental: Support custom weight_rule implementation to calculate the TI priority_weight (#38222)
- Adding ability to automatically set DAG to off after X times it failed sequentially (#36935)
- Add dataset conditions to next run datasets modal (#38123)
- Add task log grouping to UI (#38021)
- Add dataset_expression to grid dag details (#38121)
- Introduce mechanism to support multiple executor configuration (#37635)
- Add color formatting for ANSI chars in logs from task executions (#37985)
- Add the dataset_expression as part of DagModel and DAGDetailSchema (#37826)
- Add TaskFail entries to Gantt chart (#37918)
- Allow longer rendered_map_index (#37798)
- Inherit the run_ordering from DatasetTriggeredTimetable for DatasetOrTimeSchedule (#37775)
- Implement AIP-60 Dataset URI formats (#37005)
- Introducing Logical Operators for dataset conditional logic (#37101)
- Add post endpoint for dataset events (#37570)
- Show custom instance names for a mapped task in UI (#36797)
- Add excluded/included events to get_event_logs api (#37641)
- Add datasets to dag graph (#37604)
- Show dataset events above task/run details in grid view (#37603)
- Introduce new config variable to control whether DAG processor outputs to stdout (#37439)
- Make Datasets hashable (#37465)
- Add conditional logic for dataset triggering (#37016)
- Implement task duration page in react. (#35863)
- Add queuedEvent endpoint to get/delete DatasetDagRunQueue (#37176)
- Support multiple XCom output in the BaseOperator (#37297)
- AIP-58: Add object storage backend for xcom (#37058)
- Introduce DatasetOrTimeSchedule (#36710)
- Add on_skipped_callback to BaseOperator (#36374)
- Allow override of hovered navbar colors (#36631)
- Create new Metrics with Tagging (#36528)
- Add support for openlineage to AFS and common.io (#36410)
- Introduce @task.bash TaskFlow decorator (#30176, #37875)
- Added functionality to automatically ingest custom airflow.cfg file upon startup (#36289)
Improvements
- More human friendly "show tables" output for db cleanup (#38654)
- Improve trigger assign_unassigned by merging alive_triggerer_ids and get_sorted_triggers queries (#38664)
- Add exclude/include events filters to audit log (#38506)
- Clean up unused triggers in a single query for all dialects except MySQL (#38663)
- Update Confirmation Logic for Config Changes on Sensitive Environments Like Production (#38299)
- Improve datasets graph UX (#38476)
- Only show latest dataset event timestamp after last run (#38340)
- Add button to clear only failed tasks in a dagrun. (#38217)
- Delete all old dag pages and redirect to grid view (#37988)
- Check task attribute before use in sentry.add_tagging() (#37143)
- Mysql change xcom value col type for MySQL backend (#38401)
- ExternalPythonOperator use version from sys.version_info (#38377)
- Replace too broad exceptions into the Core (#38344)
- Add CLI support for bulk pause and resume of DAGs (#38265)
- Implement methods on TaskInstancePydantic and DagRunPydantic (#38295, #38302, #38303, #38297)
- Made filters bar collapsible and add a full screen toggle (#38296)
- Encrypt all trigger attributes (#38233, #38358, #38743)
- Upgrade react-table package. Use with Audit Log table (#38092)
- Show if dag page filters are active (#38080)
- Add try number to mapped instance (#38097)
- Add retries to job heartbeat (#37541)
- Add REST API events to Audit Log (#37734)
- Make current working directory as templated field in BashOperator (#37968)
- Add calendar view to react (#37909)
- Add run_id column to log table (#37731)
- Add tryNumber to grid task instance tooltip (#37911)
- Session is not used in _do_render_template_fields (#37856)
- Improve MappedOperator property types (#37870)
- Remove provide_session decorator from TaskInstancePydantic methods (#37853)
- Ensure the "airflow.task" logger used for TaskInstancePydantic and TaskInstance (#37857)
- Better error message for internal api call error (#37852)
- Increase tooltip size of dag grid view (#37782) (#37805)
- Use named loggers instead of root logger (#37801)
- Add Run Duration in React (#37735)
- Avoid non-recommended usage of logging (#37792)
- Improve DateTimeTrigger typing (#37694)
- Make sure all unique run_ids render a task duration bar (#37717)
- Add Dag Audit Log to React (#37682)
- Add log event for auto pause (#38243)
- Better message for exception for templated base operator fields (#37668)
- Clean up webserver endpoints adding to audit log (#37580)
- Filter datasets graph by dag_id (#37464)
- Use new exception type inheriting BaseException for SIGTERMs (#37613)
- Refactor dataset class inheritance (#37590)
- Simplify checks for package versions (#37585)
- Filter Datasets by associated dag_ids (GET /datasets) (#37512)
- Enable "airflow tasks test" to run deferrable operator (#37542)
- Make datasets list/graph width adjustable (#37425)
- Speedup determine installed airflow version in ExternalPythonOperator (#37409)
- Add more task details from rest api (#37394)
- Add confirmation dialog box for DAG run actions (#35393)
- Added shutdown color to the STATE_COLORS (#37295)
- Remove legacy dag details page and redirect to grid (#37232)
- Order XCom entries by map index in API (#37086...