Add uv sync --no-dev before provider YAML checks#60728
Add uv sync --no-dev before provider YAML checks#60728Fury0508 wants to merge 6 commits intoapache:mainfrom
Conversation
jscheffl
left a comment
There was a problem hiding this comment.
Looks OK for me.
Just a bit unsure - so would request another pair of eyes - I saw it also today (before this PR) that my venv is "wiped" after running soe prek commands... alwas need to re-sync my UV venv... is it intended by pre-commit checks to change / alter venvs? Or shall a temporary secondary be created for this check then?
Otherwise LGTM
fbbef26 to
292796f
Compare
|
@bugraoz93 Thanks for pointing that out! I've updated the PR description with the Gen-AI disclosure section. |
Good point! I didn't think about how this would affect local development - only focused on catching the dependency issues in CI. So yeah, wiping the venv every time someone runs pre-commit hooks is definitely annoying. Should this maybe only run in CI and not locally? Or would creating a temp venv for just this check work better? Not sure what's the best fix here - happy to adjust based on what you think makes sense! |
Thanks @Fury0508! |
|
Hi @jscheffl, thanks for the approval! This PR has been approved and all checks pass. Just wondering if there's anything else needed before merge, or if it's waiting in the merge queue. Thanks for your time! |
|
Yes as written ^^^- I am okay merging and thus approved, would liek another pair of eyes as reviewer before merge as being not 100% sure. |
jason810496
left a comment
There was a problem hiding this comment.
alwas need to re-sync my UV venv... is it intended by pre-commit checks to change / alter venvs? Or shall a temporary secondary be created for this check then?
Yeah, I also feel a bit strange at the first glance, which might also introduces some dependencies environment race condition issue from my perspective (e.g. other in-container pre-commit checks depend on the dev dependencies while current script is running).
I think the better way might be
- Creating a new
venvunder a tmp dir and install the dependencies without dev - Extract the current
__main__logic as another function (excluding thesync_dependencies_without_devcall) into a separate function namedprovider_yaml_files_check. - Run
sync_dependencies_without_devfor thevenvunder tmp dir. - Run
provider_yaml_files_checkas a subprocess using the virtual environment we just created under the temporary path, ensuring that it runs without development dependencies (for example, by overriding PYTHONPATH, etc.).
to avoid the race condition and environment side effect.
Thanks for the detailed feedback @jason810496! You're absolutely right about the race condition and environment side effects. I'll refactor to use an isolated temporary venv as you suggested:
Does this sound like the right direction? Happy to implement if this approach works for the team. |
Yes, thanks for the follow-up. |
Thanks for raising the point of race conditions of multiple checks using the same venv. Did not consider this but will be important! |
|
@jason810496 I'm working on implementing the isolated temp venv but running into an issue I can't figure out. Here's what I've done so far:
But when the subprocess tries to import the validation function, it fails with: Even though I can verify the packages are installed in the temp venv. The issue seems to be that the script has module-level imports like I tried explicitly installing the script's direct dependencies ( Am I approaching the temp venv creation wrong? Should I be using Any guidance would be helpful - I feel like I'm close but missing something about how to properly set up the isolated environment. |
jason810496
left a comment
There was a problem hiding this comment.
Hi @Fury0508, thanks for your update!
I couldn't see your change on GitHub, so feel free to push your commits.
From my perspective, the problem might be the final "Run validation function as subprocess using temp venv's python" step, because we might set the PYTHONPATH wrong. I will log the sys.path in the validation function for debugging to check whether the import paths are correct or not.
Additionally, there is an early comment (not directly related to the problem) that the first "Create temp venv" step, we could use uv instead of python -m venv to speed the process up.
|
Actually - ideally the whole check should be split into runnnig a check "per-provider" - and before every check you should do: We can use the feature of Alternativelly the same can be done by specifying You can find out more about it from my recent FOSDEM presentation: https://fosdem.org/2026/schedule/event/WE7NHM-modern-python-monorepo-apache-airflow/ |
|
Somehow I missed this PR and created another to solve the same issue (#61713), which fails the pre-commit even before running the tests. |
|
@potiuk - You're right that running uv sync at the distribution folder level would be a cleaner approach. I can see how using the workspace feature would automatically sync all provider dependencies. I'll look into the FOSDEM presentation you linked to better understand the implementation. Your PR catches issues at pre-commit stage (faster feedback) Current status: This PR is approved and all checks are passing |
- Add sync_dependencies_without_dev() to strip dev dependencies - Move jsonpath_ng import inside function to avoid ImportError - Helps detect unhandled optional cross-provider dependencies Related: apache#60662
- Exit immediately if uv sync --no-dev fails instead of continuing with warning - Add stdout display for better visibility of sync operation - Ensures validation never runs with dev dependencies present
b8ecb37 to
018fb69
Compare
|
Let me rebase it with all the tests enabled and let's see if it passes. |
What change does this PR introduce?
Strips dev dependencies before running provider YAML validation checks by running
uv sync --no-dev --all-packages.Why is this change needed?
Currently, the provider YAML check runs with all dependencies installed, including dev dependencies. This can mask issues where provider code has optional cross-provider dependencies that aren't handled properly (missing try/except blocks for optional imports). By stripping dev dependencies first, we create an environment closer to production and can detect these issues during CI.
Related issue(s)
closes: #60662
Changes
sync_dependencies_without_dev()function that runsuv sync --no-dev --all-packagesbefore provider checksTesting
uv sync --no-dev --all-packagesin breeze container to strip dev dependenciesjsonpath-ngfrom amazon provider) remain installedpython scripts/in_container/run_provider_yaml_files_check.pyGen-AI Assisted Contribution
Claude AI was consulted for guidance on Airflow's codebase structure and assistance with resolving CI failures. Core implementation and testing were done independently.