@@ -131,7 +131,7 @@ Airflow defines the specification as `hookspec <https://github.com/apache/airflo

To include the listener in your Airflow installation, include it as a part of an :doc:`Airflow Plugin </administration-and-deployment/plugins>`.

-Listener API is meant to be called across all dags and all operators. You can't listen to events generated by specific dags. For that behavior, try methods like ``on_success_callback`` and ``pre_execute``. These provide callbacks for particular DAG authors or operator creators. The logs and ``print()`` calls will be handled as part of the listeners.
+Listener API is meant to be called across all dags and all operators. You can't listen to events generated by specific dags. For that behavior, try methods like ``on_success_callback`` and ``pre_execute``. These provide callbacks for particular Dag authors or operator creators. The logs and ``print()`` calls will be handled as part of the listeners.
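
For example, a minimal sketch of the per-dag alternative mentioned above, using ``on_success_callback`` attached to a single dag (the dag id and callback body are illustrative assumptions):

.. code-block:: python

    from airflow.sdk import DAG


    def notify_success(context):
        # ``context`` carries details of the finished run; the lookup is kept
        # defensive because available keys differ between callback types.
        print("dag run finished:", context.get("run_id"))


    with DAG(
        dag_id="example_success_callback",  # illustrative dag id
        on_success_callback=notify_success,
    ):
        ...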


Compatibility note
2 changes: 1 addition & 1 deletion airflow-core/docs/authoring-and-scheduling/deferring.rst
@@ -31,7 +31,7 @@ An overview of how this process works:
* The trigger runs until it fires, at which point its source task is re-scheduled by the scheduler.
* The scheduler queues the task to resume on a worker node.

-You can either use pre-written deferrable operators as a DAG author or write your own. Writing them, however, requires that they meet certain design criteria.
+You can either use pre-written deferrable operators as a Dag author or write your own. Writing them, however, requires that they meet certain design criteria.
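
A minimal sketch of the first path, using a pre-written deferrable sensor; the import path assumes the standard provider layout and may differ between versions:

.. code-block:: python

    from datetime import timedelta

    from airflow.providers.standard.sensors.time_delta import TimeDeltaSensorAsync
    from airflow.sdk import dag


    @dag(schedule=None)
    def wait_then_continue():
        # While the sensor waits, the worker slot is released and a trigger
        # polls for the deadline in the triggerer process instead.
        TimeDeltaSensorAsync(task_id="wait_one_hour", delta=timedelta(hours=1))


    wait_then_continue()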

Using Deferrable Operators
--------------------------
@@ -21,7 +21,7 @@
Dynamic Task Mapping
====================

-Dynamic Task Mapping allows a way for a workflow to create a number of tasks at runtime based upon current data, rather than the DAG author having to know in advance how many tasks would be needed.
+Dynamic Task Mapping allows a way for a workflow to create a number of tasks at runtime based upon current data, rather than the Dag author having to know in advance how many tasks would be needed.

This is similar to defining your tasks in a for loop, but instead of having the DAG file fetch the data and do that itself, the scheduler can do this based on the output of a previous task. Right before a mapped task is executed the scheduler will create *n* copies of the task, one for each input.
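
A minimal sketch of the pattern described above, using ``expand()`` so the scheduler fans out one mapped task per element of the upstream result:

.. code-block:: python

    from airflow.sdk import dag, task


    @dag(schedule=None)
    def mapped_example():
        @task
        def make_list():
            # The length of this list is only known at runtime.
            return [1, 2, 3]

        @task
        def add_one(x):
            return x + 1

        @task
        def total(values):
            print(sum(values))

        # The scheduler creates one copy of ``add_one`` per returned element.
        total(add_one.expand(x=make_list()))


    mapped_example()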

12 changes: 6 additions & 6 deletions airflow-core/docs/best-practices.rst
@@ -978,12 +978,12 @@ The benefits of the operator are:
Airflow dependencies) to make use of multiple virtual environments
* You can run tasks with different sets of dependencies on the same workers - thus Memory resources are
reused (though see below about the CPU overhead involved in creating the venvs).
-* In bigger installations, DAG Authors do not need to ask anyone to create the venvs for you.
-As a DAG Author, you only have to have virtualenv dependency installed and you can specify and modify the
+* In bigger installations, Dag authors do not need to ask anyone to create the venvs for you.
+As a Dag author, you only have to have virtualenv dependency installed and you can specify and modify the
environments as you see fit.
* No changes in deployment requirements - whether you use Local virtualenv, or Docker, or Kubernetes,
the tasks will work without adding anything to your deployment.
-* No need to learn more about containers, Kubernetes as a DAG Author. Only knowledge of Python requirements
+* No need to learn more about containers, Kubernetes as a Dag author. Only knowledge of Python requirements
is required to author dags this way.

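As a hedged sketch of the workflow these points describe, a task can declare its own requirements with ``@task.virtualenv`` (the pinned package below is illustrative):

.. code-block:: python

    from airflow.sdk import task


    @task.virtualenv(requirements=["pandas==2.2.*"])
    def summarize():
        # Runs in a freshly created venv with its own dependencies,
        # independent of what is installed on the worker.
        import pandas as pd

        return int(pd.Series([1, 2, 3]).sum())
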
There are certain limitations and overhead introduced by this operator:
@@ -1029,7 +1029,7 @@ and available in all the workers in case your Airflow runs in a distributed environment

This way you avoid the overhead and problems of re-creating the virtual environment but they have to be
prepared and deployed together with Airflow installation. Usually people who manage Airflow installation
-need to be involved, and in bigger installations those are usually different people than DAG Authors
+need to be involved, and in bigger installations those are usually different people than Dag authors
(DevOps/System Admins).

Those virtual environments can be prepared in various ways - if you use LocalExecutor they just need to be installed
@@ -1048,7 +1048,7 @@ The benefits of the operator are:
be added dynamically. This is good for both, security and stability.
* Limited impact on your deployment - you do not need to switch to Docker containers or Kubernetes to
make a good use of the operator.
-* No need to learn more about containers, Kubernetes as a DAG Author. Only knowledge of Python, requirements
+* No need to learn more about containers, Kubernetes as a Dag author. Only knowledge of Python, requirements
is required to author dags this way.

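A hedged counterpart sketch using a pre-built environment; the interpreter path is a placeholder chosen by whoever manages the deployment:

.. code-block:: python

    from airflow.sdk import task


    @task.external_python(python="/opt/venvs/reporting/bin/python")  # placeholder path
    def summarize():
        # Reuses an immutable venv prepared at deployment time, so there is
        # no per-task cost of creating the environment.
        import pandas as pd

        return int(pd.Series([1, 2, 3]).sum())
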
The drawbacks:
@@ -1069,7 +1069,7 @@ The drawbacks:
same worker might be affected by previous tasks creating/modifying files etc.

You can think about the ``PythonVirtualenvOperator`` and ``ExternalPythonOperator`` as counterparts -
-that make it smoother to move from development phase to production phase. As a DAG author you'd normally
+that make it smoother to move from development phase to production phase. As a Dag author you'd normally
iterate with dependencies and develop your DAG using ``PythonVirtualenvOperator`` (thus decorating
your tasks with ``@task.virtualenv`` decorators) while after the iteration and changes you would likely
want to change it for production to switch to the ``ExternalPythonOperator`` (and ``@task.external_python``)
10 changes: 5 additions & 5 deletions airflow-core/docs/core-concepts/overview.rst
@@ -98,15 +98,15 @@ and can be scaled by running multiple instances of the components above.
The separation of components also allow for increased security, by isolating the components from each other
and by allowing to perform different tasks. For example separating *dag processor* from *scheduler*
allows to make sure that the *scheduler* does not have access to the *DAG files* and cannot execute
-code provided by *DAG author*.
+code provided by *Dag author*.

Also while single person can run and manage Airflow installation, Airflow Deployment in more complex
setup can involve various roles of users that can interact with different parts of the system, which is
an important aspect of secure Airflow deployment. The roles are described in detail in the
:doc:`/security/security_model` and generally speaking include:

* Deployment Manager - a person that installs and configures Airflow and manages the deployment
-* DAG author - a person that writes dags and submits them to Airflow
+* Dag author - a person that writes dags and submits them to Airflow
* Operations User - a person that triggers dags and tasks and monitors their execution

Architecture Diagrams
@@ -153,13 +153,13 @@ Distributed Airflow architecture
................................

This is the architecture of Airflow where components of Airflow are distributed among multiple machines
-and where various roles of users are introduced - *Deployment Manager*, **DAG author**,
+and where various roles of users are introduced - *Deployment Manager*, **Dag author**,
**Operations User**. You can read more about those various roles in the :doc:`/security/security_model`.

In the case of a distributed deployment, it is important to consider the security aspects of the components.
The *webserver* does not have access to the *DAG files* directly. The code in the ``Code`` tab of the
UI is read from the *metadata database*. The *webserver* cannot execute any code submitted by the
-**DAG author**. It can only execute code that is installed as an *installed package* or *plugin* by
+**Dag author**. It can only execute code that is installed as an *installed package* or *plugin* by
the **Deployment Manager**. The **Operations User** only has access to the UI and can only trigger
dags and tasks, but cannot author dags.

@@ -178,7 +178,7 @@ Separate DAG processing architecture
In a more complex installation where security and isolation are important, you'll also see the
standalone *dag processor* component that allows to separate *scheduler* from accessing *DAG files*.
This is suitable if the deployment focus is on isolation between parsed tasks. While Airflow does not yet
-support full multi-tenant features, it can be used to make sure that **DAG author** provided code is never
+support full multi-tenant features, it can be used to make sure that **Dag author** provided code is never
executed in the context of the scheduler.

.. image:: ../img/diagram_dag_processor_airflow_architecture.png
2 changes: 1 addition & 1 deletion airflow-core/docs/core-concepts/params.rst
@@ -191,7 +191,7 @@ JSON Schema Validation
.. note::
If ``schedule`` is defined for a DAG, params with defaults must be valid. This is validated during DAG parsing.
If ``schedule=None`` then params are not validated during DAG parsing but before triggering a DAG.
-This is useful in cases where the DAG author does not want to provide defaults but wants to force users provide valid parameters
+This is useful in cases where the Dag author does not want to provide defaults but wants to force users provide valid parameters
at time of trigger.

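A minimal sketch of that pattern, assuming a manually triggered dag with a required, default-less param (names are illustrative):

.. code-block:: python

    from airflow.sdk import DAG, Param

    with DAG(
        dag_id="manual_report",  # illustrative dag id
        schedule=None,
        # No default: the value is validated only when a user triggers the dag.
        params={"target_date": Param(type="string", format="date")},
    ):
        ...
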
.. note::
2 changes: 1 addition & 1 deletion airflow-core/docs/installation/index.rst
@@ -347,7 +347,7 @@ The requirements that Airflow might need depend on many factors, including (but
the technology/cloud/integration of monitoring etc.
* Technical details of database, hardware, network, etc. that your deployment is running on
* The complexity of the code you add to your DAGS, configuration, plugins, settings etc. (note, that
-Airflow runs the code that DAG author and Deployment Manager provide)
+Airflow runs the code that Dag author and Deployment Manager provide)
* The number and choice of providers you install and use (Airflow has more than 80 providers) that can
be installed by choice of the Deployment Manager and using them might require more resources.
* The choice of parameters that you use when tuning Airflow. Airflow has many configuration parameters
4 changes: 2 additions & 2 deletions airflow-core/docs/installation/upgrading_to_airflow3.rst
@@ -83,7 +83,7 @@ Step 2: Clean and back up your existing Airflow Instance
ensure you deploy your changes to your old instance prior to upgrade, and wait until your dags have all been reprocessed
(and all errors gone) before you proceed with upgrade.

-Step 3: DAG Authors - Check your Airflow DAGs for compatibility
+Step 3: Dag Authors - Check your Airflow Dags for compatibility
----------------------------------------------------------------

To minimize friction for users upgrading from prior versions of Airflow, we have created a dag upgrade check utility using `Ruff <https://docs.astral.sh/ruff/>`_.
@@ -115,7 +115,7 @@ Step 4: Install the Standard Providers

- Some of the commonly used Operators which were bundled as part of the ``airflow-core`` package (for example ``BashOperator`` and ``PythonOperator``)
have now been split out into a separate package: ``apache-airflow-providers-standard``.
-- For convenience, this package can also be installed on Airflow 2.x versions, so that DAGs can be modified to reference these Operators from the standard provider
+- For convenience, this package can also be installed on Airflow 2.x versions, so that Dags can be modified to reference these Operators from the standard provider
package instead of Airflow Core.
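
A hedged sketch of the import change; the old import is commented out for contrast and the dag id is illustrative:

.. code-block:: python

    # Airflow 2 location, bundled with airflow-core:
    # from airflow.operators.bash import BashOperator

    # New location, provided by apache-airflow-providers-standard:
    from airflow.providers.standard.operators.bash import BashOperator
    from airflow.sdk import DAG

    with DAG(dag_id="standard_provider_example", schedule=None):
        BashOperator(task_id="hello", bash_command="echo hello")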

Step 5: Review custom operators for direct db access
23 changes: 7 additions & 16 deletions airflow-core/docs/public-airflow-interface.rst
@@ -66,7 +66,7 @@ The following are some examples of the public interface of Airflow:

* When you are writing your own operators or hooks. This is commonly done when no hook or operator exists for your use case, or when perhaps when one exists but you need to customize the behavior.
* When writing new :doc:`Plugins <administration-and-deployment/plugins>` that extend Airflow's functionality beyond
-DAG building blocks. Secrets, Timetables, Triggers, Listeners are all examples of such functionality. This
+Dag building blocks. Secrets, Timetables, Triggers, Listeners are all examples of such functionality. This
is usually done by users who manage Airflow instances.
* Bundling custom Operators, Hooks, Plugins and releasing them together via
:doc:`providers <apache-airflow-providers:index>` - this is usually done by those who intend to
@@ -87,7 +87,7 @@ in details (such as output format and available flags) so if you want to rely on
way, the Stable REST API is recommended.


-Using the Public Interface for DAG Authors
+Using the Public Interface for Dag authors
==========================================

The primary interface for DAG Authors is the :doc:`airflow.sdk namespace <core-concepts/taskflow>`.
@@ -143,8 +143,8 @@ removed in a future Airflow version.
Dags
====

-The DAG is Airflow's core entity that represents a recurring workflow. You can create a DAG by
-instantiating the :class:`~airflow.sdk.DAG` class in your DAG file. Dags can also have parameters
+The Dag is Airflow's core entity that represents a recurring workflow. You can create a Dag by
+instantiating the :class:`~airflow.sdk.DAG` class in your Dag file. Dags can also have parameters
specified via :class:`~airflow.sdk.Param` class.

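A minimal sketch of both styles, assuming the ``airflow.sdk`` imports referenced above (dag ids are illustrative):

.. code-block:: python

    from airflow.sdk import DAG, dag, task

    # Instantiating the DAG class directly:
    with DAG(dag_id="classic_style", schedule=None):
        ...


    # Or using the decorator described below:
    @dag(schedule=None)
    def decorated_style():
        @task
        def hello():
            print("hello")

        hello()


    decorated_style()
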
The recommended way to create DAGs is using the :func:`~airflow.sdk.dag` decorator
@@ -364,7 +364,7 @@ Timetables
==========

Custom timetable implementations provide Airflow's scheduler additional logic to
-schedule DAG runs in ways not possible with built-in schedule expressions.
+schedule Dag runs in ways not possible with built-in schedule expressions.
All Timetables derive from :class:`~airflow.timetables.base.Timetable`.

Airflow has a set of Timetables that are considered public. You are free to extend their functionality
@@ -381,10 +381,10 @@ You can read more about Timetables in :doc:`howto/timetable`.
Listeners
=========

-Listeners enable you to respond to DAG/Task lifecycle events.
+Listeners enable you to respond to Dag/Task lifecycle events.

This is implemented via :class:`~airflow.listeners.listener.ListenerManager` class that provides hooks that
-can be implemented to respond to DAG/Task lifecycle events.
+can be implemented to respond to Dag/Task lifecycle events.

.. versionadded:: 2.5

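A minimal sketch of a listener hook, assuming the documented ``hookimpl`` marker; the hook body is illustrative:

.. code-block:: python

    from airflow.listeners import hookimpl


    @hookimpl
    def on_task_instance_success(previous_state, task_instance):
        # Called for every task instance that finishes successfully,
        # regardless of which dag it belongs to.
        print(f"{task_instance.task_id} succeeded")
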
@@ -484,15 +484,6 @@ The :doc:`apache-airflow-providers:core-extensions/logging` that also shows avai
implemented in the community providers.

Decorators
==========
-DAG Authors can use decorators to author dags using the :doc:`TaskFlow <core-concepts/taskflow>` concept.
-All Decorators derive from :class:`~airflow.sdk.bases.decorator.TaskDecorator`.
-
-The primary decorators for DAG Authors are now in the airflow.sdk namespace:
-:func:`~airflow.sdk.dag`, :func:`~airflow.sdk.task`, :func:`~airflow.sdk.asset`,
-:func:`~airflow.sdk.setup`, :func:`~airflow.sdk.task_group`, :func:`~airflow.sdk.teardown`,
-:func:`~airflow.sdk.chain`, :func:`~airflow.sdk.chain_linear`, :func:`~airflow.sdk.cross_downstream`,
-:func:`~airflow.sdk.get_current_context` and :func:`~airflow.sdk.get_parsing_context`.

Airflow has a set of Decorators that are considered public. You are free to extend their functionality
by extending them: