
@potiuk
Member

@potiuk potiuk commented May 29, 2025

Bootstrapping pytest in Breeze could take a long time, especially on macOS. It turned out the cause was two rglob calls: one checking whether any of the pyproject.toml/provider.yaml files had changed, and one looking for "deprecation ignores". Even just rglobbing the providers folder takes a significant amount of time with Docker on macOS because of the slow filesystem.

This has now been replaced with:

  • for pyproject.toml/provider.yaml: we use the main Airflow pyproject.toml to know exactly which pyproject.toml/provider.yaml files to look for (they are listed in the workspace definition)

  • for deprecations_ignores: we hardcode the short list of the ignores we have.



@potiuk
Member Author

potiuk commented May 29, 2025

The difference locally or on Linux is small (1 s -> 0.3 s), but on macOS it's huge (10 s (!) -> less than 1 s), purely from running or not running .rglob().
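
One way to reproduce this kind of comparison is sketched below. This is an illustrative measurement helper, not the code used to obtain the numbers above; the name `time_lookup` and the `known` list of paths are assumptions. It times a recursive `rglob` over a directory against direct existence checks on an explicit list of paths.

```python
import time
from pathlib import Path


def time_lookup(root: Path, known: list[Path]) -> tuple[float, float]:
    """Return (seconds spent in rglob, seconds spent checking known paths)."""
    start = time.perf_counter()
    # Recursive walk: touches every directory below root.
    list(root.rglob("provider.yaml"))
    walk_seconds = time.perf_counter() - start

    start = time.perf_counter()
    # Direct checks: one stat() per path that is already known.
    [path for path in known if path.exists()]
    direct_seconds = time.perf_counter() - start

    return walk_seconds, direct_seconds
```

Absolute timings depend heavily on the filesystem and caching, so the two values are best compared on the same machine and in the same container setup.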

@potiuk potiuk requested a review from jscheffl May 29, 2025 20:31
@potiuk
Member Author

potiuk commented May 29, 2025

cc: @ambika-garg -> I think that's something you noticed.

@potiuk potiuk added this to the Airflow 3.0.2 milestone May 29, 2025
@potiuk potiuk requested review from dstandish and kaxil May 29, 2025 20:31
@potiuk potiuk added the backport-to-v3-1-test label May 29, 2025
@potiuk potiuk requested a review from vincbeck May 29, 2025 21:15
Contributor

@shahar1 shahar1 left a comment


Nice!

@potiuk potiuk merged commit 86b0c82 into apache:main May 30, 2025
99 checks passed
@potiuk potiuk deleted the significantly-speed-up-pytest-bootstrap-for-macos branch May 30, 2025 08:29
github-actions bot pushed a commit that referenced this pull request May 30, 2025
…reeze (#51223)

(cherry picked from commit 86b0c82)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
@github-actions

Backport successfully created: v3-0-test

Status Branch Result
v3-0-test PR Link

potiuk added a commit that referenced this pull request May 30, 2025
…reeze (#51223) (#51234)

(cherry picked from commit 86b0c82)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
@gopidesupavan
Member

Oh, that's a nice one :)

kaxil pushed a commit that referenced this pull request Jun 3, 2025
…reeze (#51223) (#51234)

(cherry picked from commit 86b0c82)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
sanederchik pushed a commit to sanederchik/airflow that referenced this pull request Jun 7, 2025
…e#51223)

jose-lehmkuhl pushed a commit to jose-lehmkuhl/airflow that referenced this pull request Jul 11, 2025
…e#51223)

Labels

area:dev-tools, backport-to-v3-1-test

4 participants