@potiuk (Member) commented Mar 25, 2025

A lazy consensus decision was made on the devlist to switch
entirely to uv as the development tool:

link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256

This PR implements that decision and removes a lot of baggage connected
to using pip in addition to uv to install and sync the environment.
It also makes the use of distribution packages in the airflow sources
more consistent - basically switching all internal distributions to
the pyproject.toml approach and linking them all together via uv's
workspace feature.

This enables much more streamlined development workflows, where any
part of airflow development is manageable using uv sync in the right
distribution - opening the way to moving more of the "sub-workflows"
from the CI image to the local virtualenv environment.

Unfortunately, such a change cannot really be done incrementally, because
any change in the project layout drags along a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.

This PR is "safe" in terms of the airflow and provider code - apart
from occasional import and type hint changes resulting from better
isolation of packages, it does not really change Airflow code and
it should not affect any airflow or provider code, because it does
not move any of the folders where airflow or provider code lives.

It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the docs generation code to devel-common.

What is still NOT done after that move and will be covered in the
follow-up changes:

  • isolating docs-building to have separate configuration for docs
    building per distribution - allowing doc builds to run locally

  • moving some of the tests and checks out from breeze container
    image up to the local environment (for example mypy checks) and
    likely isolating them per-provider

  • Constraints are still generated using pip freeze and automatically
    managed by our custom scripts in canary builds - this will be
    replaced later by switching to uv.lock mechanism.

  • potentially, we could merge devel-common and dev - to be
    considered as a follow-up.

  • PROD image is still built with pip by default when using
    PyPI or distribution packages - but we do not support building
    the source image with pip - when building from sources, uv
    is forced internally to install packages. Currently we have
    no plans to change default PROD building to use uv.

This is the detailed list of changes implemented in this PR:

  • uv is now mandatory to install as pre-requisite in order to
    develop airflow. We do not support installing airflow for
    development with pip - there will be a lot of cases where
    it will not work for development - including development
    dependencies and installing several distributions together.

  • removed meta-package hatch_build.py and replaced it with
    pre-commit automatically modifying declarative pyproject.toml

  • stripped down hatch_build_airflow_core.py to only cover custom
    git and asset build hooks (renaming the file to hatch_build.py)
    and moved all airflow dependencies to pyproject.toml

  • converted "loose" packages in airflow repo into distributions:

    • docker-tests
    • kubernetes-tests
    • helm-tests
    • dev (here we do not have src subfolder - sources are directly
      in the distribution, which is for-now inconsistent with other
      distributions).

    The _tests distribution folders have been renamed to follow
    the -tests convention to make sure the imports always
    refer to the base of each distribution and are not used from the
    content root.

  • Each of the distributions (on top of the already existing airflow-core,
    task-sdk, devel-common and 90+ providers) has its own set of
    dependencies, and the top-level meta-package workspace root brings
    those distributions together, allowing to install them all together
    with a simple uv sync --all-packages command and come up with
    a consistent set of dependencies that are good for all those
    packages (yay!). This is used to build the CI image with a single
    common environment to run the tests (with some quirks due to
    constraints use where we have to manually list all distributions
    until we switch to the uv.lock mechanism)

  • doc code is moved to devel-common distribution. The docker-stack
    and airflow-providers docs are still kept in docs (follow up
    changes are coming).

  • versions are no longer dynamically retrieved from __init__.py - all
    of them are synchronized directly to pyproject.toml files - this
    way - except the custom build hook - we have no dynamic components
    in our pyproject.toml properties.

  • references to extras were removed from INSTALL and other places,
    the only references to extras remain in the user documentation - we
    stop using extras for local development and switch to using
    dependency groups.

  • backtracking command was removed from breeze - we did not need it
    since we started using uv

  • internal commands (except constraint generation) have been moved to
    uv from pip

  • breeze requires uv to be installed and expects to be installed with
    uv tool install -e ./dev/breeze

  • pyproject.tomls are dynamically modified when we add a version
    suffix dynamically (--version-suffix-for-pypi) - only for the
    time of building the versions with updated suffix

  • mypy checks are now used consistently across all the different
    distributions. For consistency (and to fix some of the issues
    with namespace packages), rather than using the "folder" approach
    when running mypy checks, we run the check on individual files
    even when we run mypy for a whole distribution. That adds
    consistency in the execution of mypy heuristics.
    Rather than using an in-container mypy script, all the logic of
    selection and the parameters passed to mypy are in pre-commit code.
    For now we are still using the CI image to run mypy because mypy is
    very sensitive to the versions of dependencies installed; we should
    be able to switch to running mypy locally once we have the
    uv.lock mechanism incorporated in our workflows.

  • lower bounds for dependencies have been set consistently across
    all the distributions. With uv sync and dependabot, those
    should generally be kept consistent in the future

  • the devel-common dependencies have been grouped together in
    devel-common extras - including basic, doc, doc-gen, and
    all - which will make it easier to install them on some OSes
    (basic is the default set, covering the most common
    development dependencies)

  • generated/provider_dependencies.json is no longer committed to the
    repository. It is .gitignored and generated
    on the fly as needed (breeze will generate it automatically
    when empty and pre-commit will always regenerate it to be
    consistent with the providers' pyproject.toml files).

  • chart-utils have been moved to helm-tests from devel-common
    as they were only used there.

  • for k8s tests we are using the uv main .venv environment
    rather than creating our own .build environment and we use
    uv sync to keep it in sync

  • Updated uv version to 0.6.10

  • We are using uv sync to perform "upgrade to newer dependencies"
    in canary builds and locally

  • leveldb has been turned into a "dependency group" and removed from
    apache-airflow and apache-airflow-core extras; it is now only
    available via the google provider's leveldb optional extra to install
    with pip
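
The temporary version-suffix handling mentioned in the list above (the `--version-suffix-for-pypi` item) could be sketched roughly as follows. This is not the breeze implementation; the function names and the `.dev0` suffix are hypothetical, and the regex assumes a single static `version = "..."` line at the start of a line in pyproject.toml.

```python
import re
from contextlib import contextmanager
from pathlib import Path

# Matches the first `version = "..."` line; assumes a static version entry.
VERSION_RE = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)


def with_version_suffix(pyproject_text: str, suffix: str) -> str:
    """Return the text with `suffix` appended to the first static version."""
    patched, count = VERSION_RE.subn(
        lambda m: f'version = "{m.group(1)}{suffix}"', pyproject_text, count=1
    )
    if count != 1:
        raise ValueError("no static version entry found")
    return patched


@contextmanager
def temporary_version_suffix(pyproject: Path, suffix: str):
    """Patch the version only for the duration of the block, then restore it."""
    original = pyproject.read_text()
    pyproject.write_text(with_version_suffix(original, suffix))
    try:
        yield
    finally:
        # Always put the committed file back, even if the build step fails.
        pyproject.write_text(original)
```

A build step would then run inside `with temporary_version_suffix(path, ".dev0"):` so that the committed pyproject.toml is left unchanged afterwards.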



@potiuk potiuk changed the title Simplify tooling by only selecting uv Simplify tooling by only allowing uv Mar 25, 2025
@potiuk potiuk force-pushed the simplify-dev-tooling-with-uv-selection branch 19 times, most recently from 69d1a25 to df7df66 Compare March 30, 2025 14:21
@potiuk potiuk changed the title Simplify tooling by only allowing uv Simplify tooling by switching completely to uv Mar 30, 2025
@potiuk potiuk marked this pull request as ready for review March 30, 2025 14:21
@potiuk potiuk force-pushed the simplify-dev-tooling-with-uv-selection branch 3 times, most recently from 4332761 to 63a770c Compare April 1, 2025 18:34
@bugraoz93 (Contributor) left a comment

Looks great overall! I went through the entire change set. It is hard to come up with specific things to mention. There are already a good number of reviews, but I wanted to go through and follow up. Even though the LoC count looks large, a lot of logic is deleted - great simplification :)

@potiuk potiuk force-pushed the simplify-dev-tooling-with-uv-selection branch from 63a770c to 77694e5 Compare April 1, 2025 19:59
@potiuk (Member, Author) commented Apr 1, 2025

> Looks great overall! I went through the entire change set. It is hard to come up with specific things to mention. There are already a good number of reviews, but I wanted to go through and follow up. Even though the LoC count looks large, a lot of logic is deleted - great simplification :)

Wait for what's next ... It unblocks ...SO MANY THINGS...

@potiuk potiuk force-pushed the simplify-dev-tooling-with-uv-selection branch 6 times, most recently from 3ddd66a to 441f148 Compare April 2, 2025 09:24
A lazy consensus decision was made on the devlist to switch
entirely to `uv` as the development tool:

link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256

This PR implements that decision and removes a lot of baggage connected
to using `pip` in addition to `uv` to install and sync the environment.
It also makes the use of distribution packages in the airflow sources
more consistent - basically switching all internal distributions to
the `pyproject.toml` approach and linking them all together via `uv`'s
workspace feature.

This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-workflows"
from the CI image to the local virtualenv environment.

Unfortunately, such a change cannot really be done incrementally, because
any change in the project layout drags along a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.

This PR is "safe" in terms of the airflow and provider code - apart
from occasional import and type hint changes resulting from better
isolation of packages, it does not **really** change Airflow code and
it should not affect any airflow or provider code, because it does
not move any of the folders where airflow or provider code lives.

It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.

What is still NOT done after that move and will be covered in the
follow-up changes:

* isolating docs-building to have separate configuration for docs
  building per distribution - allowing doc builds to run locally,
  each with its own conf.py file

* moving some of the tests and checks out from breeze container
  image up to the local environment (for example mypy checks) and
  likely isolating them per-provider

* Constraints are still generated using `pip freeze` and automatically
  managed by our custom scripts in `canary` builds - this will be
  replaced later by switching to `uv.lock` mechanism.

* potentially, we could merge `devel-common` and `dev` - to be
  considered as a follow-up.

* PROD image is still built with `pip` by default when using
  `PyPI` or distribution packages - but we do not support building
  the source image with `pip` - when building from sources, `uv`
  is forced internally to install packages. Currently we have
  no plans to change default PROD building to use `uv`.

This is the detailed list of changes implemented in this PR:

* uv is now mandatory to install as pre-requisite in order to
  develop airflow. We do not support installing airflow for
  development with `pip` - there will be a lot of cases where
  it will not work for development - including development
  dependencies and installing several distributions together.

* removed meta-package `hatch_build.py` and replaced it with
  pre-commit automatically modifying declarative `pyproject.toml`

* stripped down `hatch_build_airflow_core.py` to only cover custom
  git and asset build hooks (renaming the file to `hatch_build.py`)
  and moved all airflow dependencies to `pyproject.toml`

* converted "loose" packages in airflow repo into distributions:
  * docker-tests
  * kubernetes-tests
  * helm-tests
  * dev (here we do not have `src` subfolder - sources are directly
    in the distribution, which is for-now inconsistent with other
    distributions).

  The `_tests` distribution folders have been renamed to follow
  the `-tests` convention to make sure the imports always
  refer to the base of each distribution and are not used from the
  content root.

* Each of the distributions (on top of the already existing airflow-core,
  task-sdk, devel-common and 90+ providers) has its own set of
  dependencies, and the top-level meta-package workspace root brings
  those distributions together, allowing to install them all together
  with a simple `uv sync --all-packages` command and come up with
  a consistent set of dependencies that are good for all those
  packages (yay!). This is used to build the CI image with a single
  common environment to run the tests (with some quirks due to
  constraints use where we have to manually list all distributions
  until we switch to the `uv.lock` mechanism)

* `doc` code is moved to `devel-common` distribution. The `doc` folder
  only keeps README informing where the other doc code is, the
  spelling_wordlist.txt and start_docs_server.sh. The documentation is
  generated in `generated/generated-docs/` folder which is entirely
  .gitignored.

* the documentation is now fully moved to:
  * `airflow-core/docs` - documentation for Airflow Core
  * `providers/**/docs` - documentation for Providers
  * `chart/docs` - documentation for Helm Chart
  * `task-sdk/docs` - documentation for Task SDK (new format not yet published)
  * `docker-stack-docs` - documentation for Docker Stack
  * `providers-summary-docs` - documentation for provider summary page

* `versions` are no longer dynamically retrieved from `__init__.py` - all
  of them are synchronized directly to `pyproject.toml` files - this
  way - except the custom build hook - we have no dynamic components
  in our `pyproject.toml` properties.

* references to extras were removed from INSTALL and other places,
  the only references to extras remain in the user documentation - we
  stop using extras for local development and switch to using
  dependency groups.

* backtracking command was removed from breeze - we did not need it
  since we started using `uv`

* internal commands (except constraint generation) have been moved to
  `uv` from `pip`

* breeze requires `uv` to be installed and expects to be installed with
  `uv tool install -e ./dev/breeze`

* pyproject.tomls are dynamically modified when we add a version
  suffix dynamically (`--version-suffix-for-pypi`) - only for the
  time of building the versions with updated suffix

* `mypy` checks are now used consistently across all the different
  distributions. For consistency (and to fix some of the issues
  with namespace packages), rather than using the "folder" approach
  when running mypy checks, we run the check on individual files
  even when we run mypy for a whole distribution. That adds
  consistency in the execution of mypy heuristics.
  Rather than using an in-container mypy script, all the logic of
  selection and the parameters passed to mypy are in pre-commit code.
  For now we are still using the CI image to run mypy because mypy is
  very sensitive to the versions of dependencies installed; we should
  be able to switch to running mypy locally once we have the
  `uv.lock` mechanism incorporated in our workflows.

* lower bounds for dependencies have been set consistently across
  all the distributions. With `uv sync` and dependabot, those
  should generally be kept consistent in the future

* the `devel-common` dependencies have been grouped together in
  `devel-common` extras - including `basic`, `doc`, `doc-gen`, and
  `all` - which will make it easier to install them on some OSes
  (`basic` is the default set, covering the most common
  development dependencies)

* `generated/provider_dependencies.json` is no longer committed to the
  repository. It is .gitignored and generated
  on the fly as needed (breeze will generate it automatically
  when empty and pre-commit will always regenerate it to be
  consistent with the providers' `pyproject.toml` files).

* `chart-utils` have been moved to `helm-tests` from `devel-common`
  as they were only used there.

* for k8s tests we are using the `uv` main `.venv` environment
  rather than creating our own `.build` environment and we use
  `uv sync` to keep it in sync

* Updated `uv` version to 0.6.10

* We are using `uv sync` to perform "upgrade to newer dependencies"
  in `canary` builds and locally

* leveldb has been turned into a "dependency group" and removed from
  apache-airflow and apache-airflow-core extras; it is now only
  available via the google provider's leveldb optional extra to install
  with `pip`
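
The file-by-file `mypy` invocation described in the list above could look roughly like the sketch below. It is only an illustration of the idea of naming individual files instead of a folder; the helper name is hypothetical and the real selection logic lives in the pre-commit code, which may choose files and parameters differently.

```python
from pathlib import Path


def mypy_command(distribution_root: Path, extra_args: tuple[str, ...] = ()) -> list[str]:
    """Build a mypy command that names every .py file instead of the folder."""
    files = sorted(
        str(path.relative_to(distribution_root))
        for path in distribution_root.rglob("*.py")
    )
    if not files:
        raise ValueError(f"no Python files under {distribution_root}")
    # Paths are relative, so the command is meant to run from the distribution root.
    return ["mypy", *extra_args, *files]
```

The resulting list can be passed to `subprocess.run(..., cwd=distribution_root)` in whatever environment has the right dependency versions installed.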
@potiuk potiuk force-pushed the simplify-dev-tooling-with-uv-selection branch from 441f148 to f909ebc Compare April 2, 2025 11:10
@potiuk (Member, Author) commented Apr 2, 2025

Just mypy (solved in main). Merging.

@potiuk potiuk merged commit d447355 into apache:main Apr 2, 2025
6 checks passed
@potiuk potiuk deleted the simplify-dev-tooling-with-uv-selection branch April 2, 2025 11:11
pankajkoti added a commit to astronomer/airflow that referenced this pull request Apr 3, 2025
potiuk added a commit to potiuk/airflow that referenced this pull request Apr 4, 2025
The apache#48223 left a few remnants of backtracking. This PR removes them.
potiuk added a commit to potiuk/airflow that referenced this pull request Apr 4, 2025
The apache#48223 left a few remnants of backtracking. This PR removes them.
nailo2c pushed a commit to nailo2c/airflow that referenced this pull request Apr 4, 2025
potiuk added a commit that referenced this pull request Apr 5, 2025
The #48223 left a few remnants of backtracking. This PR removes them.
diogotrodrigues pushed a commit to diogotrodrigues/airflow that referenced this pull request Apr 6, 2025
The lazy consensus decision has been made at the devlist to switch
entirely to `uv` as development tool:

link: https://lists.apache.org/thread/6xxdon9lmjx3xh8zw09xc5k9jxb2n256

This PR implements that decision and removes a lot of baggage connected
to using `pip` additionally to uv to install and sync the environment.
It also introduces more consistency in the way how distribution
packages are used in airflow sources - basicaly switching all internal
distributions to use `pyproject.toml` approach and linking them all
together via `uv`'s workspace feature.

This enables much more streamlined development workflows, where any
part of airflow development is manageable using `uv sync` in the right
distribution - opening the way to moving more of the "sub-worfklows"
from the CI image to local virtualenv environment.

Unfortunately, such change cannot be done incrementally, really, because
any change in the project layout drags with itself a lot of changes
in the test/CI/management scripts, so we have to implement one big
PR covering the move.

This PR is "safe" in terms of the airflow and provider's code - it
does not **really** (except occasional imports and type hint changes
resulting from better isolation of packages) change Airflow code nor
it should not affect any airflow or provider code, because it does
not move any of the folder where airflow or provider's code is modified.

It does move the test code - in a number of "auxiliary" distributions
we have. It also moves the `docs` generation code to `devel-common`
and introduces separate conf.py files for every doc package.

What is still NOT done after that move and will be covered in the
follow-up changes:

* isolating docs-building to have separate configuraiton for docs
  building per distribution - allowing to run doc build locally
  with it's own conf.py file

* moving some of the tests and checks out from breeze container
  image up to the local environment (for example mypy checks) and
  likely isolating them per-provider

* Constraints are still generated using `pip freeze` and automatically
  managed by our custom scripts in `canary` builds - this will be
  replaced later by switching to `uv.lock` mechanism.

* potentially, we could merge `devel-common` and `dev` - to be
  considered as a follow-up.

* PROD image is stil build with `pip` by default when using
  `PyPI` or distribution packages  - but we do not support building
  the source image with `pip` - when building from sources, uv
  is forced internally to install packages. Currently we have
  no plans to change default PROD building to use `uv`.

This is the detailed list of changes implemented in this PR:

* uv is now mandatory to install as pre-requisite in order to
  develop airflow. We do not support installing airflow for
  development with `pip` - there will be a lot of cases where
  it will not work for development - including development
  dependencies and installing several distributions together.

* removed meta-package `hatch_build.py' and replacing it with
  pre-commit automatically modifying declarative pyproject.toml

* stripped down `hatch_build_airflow_core.py` to only cover custom
  git and asset build hooks (and renaming the file to `hatch_build.py`
  and moving all airflow dependencies to `pyproject.toml`

* converted "loose" packages in airflow repo into distributions:
  * docker-tests
  * kubernetes-tests
  * helm-tests
  * dev (here we do not have `src` subfolder - sources are directly
    in the distribution, which is for-now inconsistent with other
    distributions).

  The names of the `_tests` distribution folders have been renamed to
  the `-tests` convention to make sure the imports are always
  referring to base of each distribution and are not used from the
  content root.

* Each of the distributions (on top of the already existing airflow-core,
  task-sdk, devel-common and 90+ providers) has its own set of
  dependencies, and the top-level meta-package workspace root brings
  those distributions together, allowing to install them all together
  with a simple `uv sync --all-packages` command and come up with
  a consistent set of dependencies that are good for all those
  packages (yay!). This is used to build the CI image with a single
  common environment to run the tests (with some quirks due to
  constraints use, where we have to manually list all distributions
  until we switch to the `uv.lock` mechanism).

* `doc` code is moved to the `devel-common` distribution. The `doc` folder
  only keeps a README informing where the other doc code is, the
  spelling_wordlist.txt and start_docs_server.sh. The documentation is
  generated in the `generated/generated-docs/` folder which is entirely
  .gitignored.

* the documentation is now fully moved to:
  * `airflow-core/docs` - documentation for Airflow Core
  * `providers/**/docs` - documentation for Providers
  * `chart/docs` - documentation for Helm Chart
  * `task-sdk/docs` - documentation for Task SDK (new format not yet published)
  * `docker-stack-docs` - documentation for Docker Stack
  * `providers-summary-docs` - documentation for provider summary page

* `versions` are not dynamically retrieved from `__init__.py` - all
  of them are synchronized directly to pyproject.toml files. This
  way - except for the custom build hook - we have no dynamic components
  in our `pyproject.toml` properties.

* references to extras were removed from INSTALL and other places.
  The only references to extras remain in the user documentation - we
  stop using extras for local development and switch to using
  dependency groups.

* the backtracking command was removed from breeze - we have not
  needed it since we started using `uv`

* internal commands (except constraint generation) have been moved to
  `uv` from `pip`

* breeze requires `uv` to be installed and expects to be installed with
  `uv tool install -e ./dev/breeze`

* pyproject.toml files are modified on the fly when we add a version
  suffix (`--version-suffix-for-pypi`) - only for the
  time of building the versions with the updated suffix

* `mypy` checks are now run consistently across all the different
  distributions. For consistency (and to fix some of the issues
  with namespace packages), even when we run mypy for a whole
  distribution, we run the check on individual files rather than on
  a folder. That makes the mypy heuristics behave consistently.
  Rather than using an in-container mypy script, all the logic of
  file selection and the parameters passed to mypy are in pre-commit code.
  For now we are still using the CI image to run mypy because mypy is
  very sensitive to the versions of installed dependencies. We should
  be able to switch to running mypy locally once we have the
  `uv.lock` mechanism incorporated in our workflows.

* lower bounds for dependencies have been set consistently across
  all the distributions. With `uv sync` and dependabot, those
  should generally be kept consistent in the future.

* the `devel-common` dependencies have been grouped together in
  `devel-common` extras - including `basic`, `doc`, `doc-gen`, and
  `all` - which will make it easier to install them on some OS-es
  (`basic` is the default set, covering the most common
  development dependencies)

* generated/provider_dependencies.json is no longer committed to the
  repository. It is .gitignored and generated
  on the fly as needed (breeze generates it automatically
  when empty and pre-commit always regenerates it to be
  consistent with the providers' pyproject.toml files).

* `chart-utils` have been moved to `helm-tests` from `devel-common`
  as they were only used there.

* for k8s tests we are using the `uv` main `.venv` environment
  rather than creating our own `.build` environment and we use
  `uv sync` to keep it in sync

* Updated `uv` version to 0.6.10

* We are using `uv sync` to perform "upgrade to newer dependencies"
  in `canary` builds and locally

* leveldb has been turned into a "dependency group" and removed from
  apache-airflow and apache-airflow-core extras. It is now only
  available via the google provider's `leveldb` optional extra when
  installing with `pip`
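
To illustrate the version-suffix item above: a minimal sketch of how a static `version` in a `pyproject.toml` could be patched only for the duration of a build and then restored. This is not the breeze implementation - the helper name `temporary_version_suffix` and the regex-based rewrite are assumptions made for illustration only.

```python
import re
from contextlib import contextmanager
from pathlib import Path

# Matches a static `version = "..."` line, as found in a [project] table.
_VERSION_LINE = re.compile(r'^(version\s*=\s*")([^"]+)(")', re.MULTILINE)


@contextmanager
def temporary_version_suffix(pyproject: Path, suffix: str):
    """Hypothetical helper: append `suffix` to the version while the block runs."""
    original = pyproject.read_text()
    patched, count = _VERSION_LINE.subn(
        lambda m: f"{m.group(1)}{m.group(2)}{suffix}{m.group(3)}",
        original,
        count=1,
    )
    if count != 1:
        raise ValueError(f"no static version found in {pyproject}")
    pyproject.write_text(patched)
    try:
        # The package build would run here, seeing the suffixed version.
        yield
    finally:
        # Always restore the committed content, even if the build fails.
        pyproject.write_text(original)
```

Wrapping the build step in `with temporary_version_suffix(path, "rc1"):` keeps the suffixed version out of the working tree once the block exits, which matches the "only for the time of building" behaviour described in the list.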
diogotrodrigues pushed a commit to diogotrodrigues/airflow that referenced this pull request Apr 6, 2025
The apache#48223 left a few remnants of backtracking. This PR removes them.
simonprydden pushed a commit to simonprydden/airflow that referenced this pull request Apr 8, 2025
simonprydden pushed a commit to simonprydden/airflow that referenced this pull request Apr 8, 2025
The apache#48223 left a few remnants of backtracking. This PR removes them.