Use `uv pip compile` to recompile requirements files #8276
Conversation
Force-pushed from 351aa4c to 6b471fa.
Force-pushed from 6b471fa to 7e79aeb.
Force-pushed from a43b9bd to 2f05290.

Force-pushed from 2f05290 to 7e91b3b.
Looking great, I left a couple of nitpicks. Maybe it'd be good to leave out the autogenerated changes from the PR, so we avoid conflicts and I can just update them after merging?

Can't wait to start using uv after trying it out recently. Probably will take a quick look at installing tox-uv using tox's self-bootstrapping setting as well. The testing part of the recompile script will probably be much faster with that enabled too!
return "==".join([normalize_pkg(pkg), version]) | ||
|
||
|
||
CHANGELOG_URLS = {normalize_pkg(name): url for name, url in CHANGELOG_URLS.items()} |
WDYT about normalizing them in the JSON file instead, and adding something to `check_changelog_urls` in `scripts/dev/misc_checks.py` to ensure they stay that way?

Shouldn't block this though, happy to take care of it after merging if you prefer.
In general I would prefer removing points of friction for contributing and maintaining. In this case I think it would be nice if people could put package names in the JSON files in whatever case they like. The robots doing comparisons should be the ones that have to care about normalization. For example, `PyQt5` and `PyJWT` are in there at the moment; nothing wrong with writing them like that.
```diff
@@ -346,31 +396,71 @@ def get_outfile(name):
     return os.path.join(REQ_DIR, 'requirements-{}.txt'.format(name))


-def build_requirements(name):
-    """Build a requirements file."""
+def get_updated_requirements_freeze(filename, venv_dir, comments):
```
Do we need to keep this or should we just throw it out entirely?
If you are happy to get rid of it, then I'm happy to get rid of it! I just left it in there to allow for easy comparisons and easy switching back if we aren't happy with the pip compile based implementation. It's one of the options I listed in the PR description.
```python
                 pre=comments['pre'],
                 pip_args=comments['pip_args'])
    with utils.gha_group('Freezing requirements'):
        args = ['--all'] if '-tox.txt-raw' in filename else []
```
I find the `if '-tox.txt-raw' in filename` a bit confusing - could that be something like (untested) `os.path.basename(filename) == "requirements-tox.txt-raw"` instead, which we can then simplify to `path.name == "..."` once we switch this over to pathlib?
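Spelled out, that suggestion looks like this (untested, as noted above; `filename` here is just an example raw requirements path):

```python
import os.path
from pathlib import Path

filename = "misc/requirements/requirements-tox.txt-raw"

is_tox = '-tox.txt-raw' in filename                                 # current substring check
is_tox = os.path.basename(filename) == 'requirements-tox.txt-raw'   # suggested exact match
is_tox = Path(filename).name == 'requirements-tox.txt-raw'          # pathlib form, after switching over
```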
Sure, I can change that. But I would like to hear your opinion on removing the special casing of setuptools and pip entirely, which would remove this block altogether. `pip freeze` excludes its own dependencies from its output to avoid polluting people's requirements lists, because its starting point is looking at all installed packages, and then we have to add those excluded requirements back when we actually want them. But compile-based approaches like pip-compile, whose starting point is a requirements file, don't have that problem.

See upstream discussions around including setuptools that I linked in this commit: a4ca2f9
Force-pushed from ae5aba0 to 1bcf279.
Thanks for the review! I've addressed most of your comments.
Force-pushed from 1bcf279 to 99a57b8.
Force-pushed from 58b1833 to 627361f.
In #8269 we saw some packages leaking into the pip freeze output that we don't want in the requirements files (if setuptools isn't supposed to be in there, why should its dependencies be?). I've also greatly missed the comments that pip-compile puts in requirements.txt files explaining where indirect dependencies come from. So I took the opportunity to switch our tooling for updating and parsing new dependencies and their versions to use pip-compile instead of `pip install -U && pip freeze`.

It turned out to not be a big change, because the pip freeze output is largely compatible with requirements files (we are writing directly to one, after all). So we just need to switch what commands we are running and triage any compatibility issues.

I chose `uv pip compile` instead of `pip-compile` because I like what uv is doing (fast, aiming for compatibility, consolidating a confusing ecosystem into a single tool). But pip-compile/pip-tools should do the same job if we want to go that route.

The biggest differences are:

* outputs normalized names: this generally results in a larger diff than otherwise (I propose we go through and regenerate all the requirements files in one go, and maybe add that commit to a blame ignore file) and requires our comparison logic to deal with normalized package names everywhere
* setuptools and pip are not included in the tox requirements file - not sure what to do about that yet; should they be in the .txt-raw file?

TODO:

* remove support for pip_args?
* change markers in raw files to lower case? Ideally don't require it; if a human can write them in any case and a robot can normalize, we should do that. If there are patterns with `._` in them as part of names, how do we handle that?
* pull out similar bits of the `build_requirements*` methods
* maybe make it so you can pass `requirements=None` to `init_venv` to make it not install stuff, install uv, do the uv invocation, gate all that behind a `--method="freeze|compile"` arg?
* add pip and setuptools to the tox requirements file?
* basename requirements file names so they don't have `script_path/../../` in them in the annotated version
* add tests for the markers (with inputs of differing cases) to make sure they all still work
* update changelog check used in CI to normalize names too
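For illustration, a minimal sketch of the compile-based path; the helper name is hypothetical, and the real script additionally wires in virtualenvs, markers and comments:

```python
import subprocess


def get_updated_requirements_compile(raw_filename):
    """Recompile a .txt-raw file into pinned requirements via uv.

    `uv pip compile` reads the unpinned requirements file and writes the
    pinned, annotated result to stdout.
    """
    proc = subprocess.run(
        ['uv', 'pip', 'compile', '--upgrade', raw_filename],
        check=True, capture_output=True, text=True)
    return proc.stdout.splitlines()
```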
The CHANGELOG_URLS variable is imported by the `./scripts/dev/misc_checks.py changelog-urls` check. Since there is a bit of churn in package name case in this change (and a while back, when a bunch of packages switched between underscores and hyphens), update this variable at import time so that the checker will be looking at normalized names too.
There was a fair bit of duplicate code, so I've pulled the "take a list of requirements, give me a new one" logic out into separate methods. The stuff around parsing the output can stay common, thankfully! Even if we drop one of the methods, this level of abstraction is probably fine to keep.
Since I was looking at how hard it would be to support using pip-compile to recompute requirements, I was worried that I would break the markers we support in the raw requirements files. This adds two tests:

* disabled_test_markers_real_pip_and_venv

  A test that sets up a local python index and runs the real pip/uv binaries. It works (and it was fun to set up a package index factory) but has a couple of downsides:

  1. it hits the real pypi index, which is not great in a test. This can be prevented by removing the EXTRA bit from the INDEX_URL env vars and pre-downloading pip, wheel, setuptools and uv to the test repo (and adding index.htmls for them). But because of the next item I'm not sure it's worth the effort of keeping this test around
  2. it's slow, because of having to download packages from the internet (even if we pre-cached them it would still have to download them; I guess we could include a zip of fixed/vendored versions, but that will probably require maintenance over time) and because it calls venv to make new virtual environments, which isn't the quickest operation (maybe uv venv is quicker?)

* test_markers_in_comments

  Tests just the comment reading and line manipulation logic. Could be made a bit more pure by just calling read_comments() and convert_line(), but that wouldn't test the "add" marker.
`pip freeze` writes out package names as specified by the packages, `pip compile` writes out normalized package names. Sticking to normalized names in robot-written files lets us more easily compare the output of the two different requirements compiling methods, and also means we can stop worrying about packages changing between `_` and `-` in dependency updates.

Script used to do this change was:

```python
import re
import glob


def normalize_pkg(name):
    """Normalize a package name for comparisons.

    From https://packaging.python.org/en/latest/specifications/name-normalization/#name-normalization

    `pip freeze` passes names through in whatever case they are in in the
    package, pip-compile will normalize them.
    """
    if "/" in name:
        # don't change file paths
        return name
    return re.sub(r"[-_.]+", "-", name).lower()


def normalize_line(line):
    if not line or not line.strip():
        return line
    if "==" not in line.split()[0]:
        return line
    pkg, version = line.split("==", maxsplit=1)
    return "==".join([normalize_pkg(pkg), version])


for name in ["requirements.txt"] + glob.glob("misc/requirements/requirements*.txt"):
    with open(name) as f:
        before_lines = f.readlines()
    after_lines = [normalize_line(line) for line in before_lines]
    with open(name, mode="w") as f:
        f.writelines(after_lines)
```
This lets us more easily compare the output of runs of the different requirements compiling methods.
This is to match the `pip freeze` requirements compilation method. It's not clear to me if we actually want this behaviour or not. It seems `pip freeze` will exclude dependencies of itself (https://pip.pypa.io/en/stable/cli/pip_freeze/#cmdoption-all) even if there are other packages installed that depend on those dependencies. `uv pip compile`, and now the original `pip-compile`, have both decided to include setuptools in generated requirements files:

astral-sh/uv#1353
jazzband/pip-tools#989 (comment)

So I'm not sure if we have a reason for going against this here, or if they were just being excluded because that's what pip freeze does. Hopefully we can drop this commit and use the default behaviour in the future. For now, while I'm trying to provide the new backend, it's here to make the diff of generated files more precise.

The message prefix used to identify a pip compile comment was taken from these examples:

```
# The following packages were excluded from the output:
# setuptools

# The following packages are considered to be unsafe in a requirements file:
# setuptools==41.4.0          # via protobuf
```
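A hedged sketch of detecting those trailer comments when parsing compiled output; the prefixes are taken from the examples above, and the helper name is made up:

```python
EXCLUSION_PREFIXES = (
    "# The following packages were excluded from the output:",
    "# The following packages are considered to be unsafe in a requirements file:",
)


def is_exclusion_comment(line):
    """Return True for pip-compile/uv trailer comments about excluded packages."""
    # str.startswith() accepts a tuple of prefixes
    return line.strip().startswith(EXCLUSION_PREFIXES)
```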
For the pip freeze backend, pip is being passed `--all` when the tox requirements file is being processed, so that pip and setuptools are included in the requirements file. This was added in 922dca0 for reasons I haven't fully grokked. This commit adds similar behaviour for the pip compile backend via:

1. don't add the `--no-emit-package` args for the tox requirements file
2. add pip and setuptools to the tox requirements file

It seems that pip and setuptools aren't even requirements of tox, but they are being included in the compiled requirements file anyway. Why aren't they included in the raw requirements file? I don't know, but from what I can figure it's not going to harm anything to have them in there.
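Roughly, the special-casing described here could look like the following sketch; `--no-emit-package` is the uv flag named above, while the helper and its placement are illustrative:

```python
import os.path


def extra_compile_args(filename):
    """Keep pip/setuptools out of every compiled file except the tox one."""
    if os.path.basename(filename) == 'requirements-tox.txt-raw':
        return []  # let uv emit pip and setuptools for tox
    return ['--no-emit-package', 'pip', '--no-emit-package', 'setuptools']
```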
It would have been convenient to have an end2end test to make sure that the two requirements file compilation methods produce the same results. But I think there is a bit too much stuff going on in `recompile_requirements` for that atm. Making the local repo in the temp path and installing fake packages there works fine, although setting up the virtualenv isn't quick. But the script currently installs pip, setuptools and wheel, which means we have to either 1) hit pypi.org in the tests to get them, or 2) download them at the start of the test suite and put them in the local repo. Not sure it's worth the effort to go down this rabbit hole when we already have a dozen real requirements files to verify the change with. I'm leaving this in the commit history because it was fun to get the local repo working!
I find these comments useful to show why packages are included in the final compiled requirements files. This required a small change to `recompile_requirements` to ignore these new comment lines (it was saying there was a new requirement with an empty name that needed a changelog entry).

The addition of the `os.path.realpath()` is to clean up the paths of the requirements files in the annotations, eg:

```diff
 check-manifest==0.49
-    # via -r scripts/dev/../../misc/requirements/requirements-check-manifest.txt-raw
+    # via -r misc/requirements/requirements-check-manifest.txt-raw
```
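In code terms, the cleanup amounts to something like this sketch (illustrative; the helper name and the exact way `REQ_DIR` is built are assumptions):

```python
import os.path

# REQ_DIR built relative to the script's location is where the
# scripts/dev/../../ noise in the annotations came from
REQ_DIR = os.path.join(os.path.dirname(__file__), '..', '..', 'misc', 'requirements')


def get_raw_file(name):
    # realpath() collapses the ../.. segments, so the "# via -r ..."
    # annotations show a clean path instead
    return os.path.realpath(os.path.join(REQ_DIR, f'requirements-{name}.txt-raw'))
```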
See 433074c, adf39e9, db83a82 & 78a74a2. Now that we are using pip-compile to build requirements lock files, instead of pip freeze, these things shouldn't be showing up in the results. Looks like `typeguard` was dropped in a192a3d "initial compile based bump"; I guess that was being pulled in by pip freeze too.
This is for parsing diffs like:

```diff
     # via importlib-metadata
+
+# The following packages were excluded from the output:
+# setuptools
```

Otherwise further parsing breaks because it is looking for a package name on the line.
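A hedged sketch of tolerating those lines while walking the diff (the helper name is made up):

```python
def changed_package(diff_line):
    """Return the package a +/- diff line refers to, or None."""
    text = diff_line[1:].strip()  # drop the leading '+' or '-'
    if not text or text.startswith('#'):
        return None  # blank spacer or pip-compile comment, no package here
    return text.split('==')[0]
```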
Substring matches when you are actually only trying to match one specific item are a smell: it's unclear what's intended, and it causes the reader to stop and think.
This aligns with the direction of pip-compile and uv, and removes some code from this script. See:

astral-sh/uv#1353
jazzband/pip-tools#989 (comment)

The code being removed worked fine, so if it turns out that installing a newer setuptools or pip in virtualenvs breaks stuff, it's fine to add it back.
Changes in this commit should be limited to:

* lines changing order (they are sorted now)
* jaraco.text, jaraco.collections, autocommand, inflect and typeguard are gone now (setuptools-vendored packages that were being included from the venv)
* comments added showing why dependencies are included
Force-pushed from 627361f to d701f32.
This PR switches `scripts/dev/recompile_requirements.py` to use `uv pip compile` to compile the pinned requirements files, instead of the current method of using `pip freeze`.

The motivation for this is:

* we don't have to worry about packages installed to the virtualenv that aren't our dependencies being included in the output (see Update dependencies #8269 (comment) for an example)
* it speeds up the job a little bit, because one of the `pip install`s is replaced by a `uv` call (and `uv` prides itself on being fast)
* we get lineage annotations in the pinned requirements files saying why requirements are included, eg:
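For example (using the annotation shown in one of the commit messages above):

```
check-manifest==0.49
    # via -r misc/requirements/requirements-check-manifest.txt-raw
```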
There are a few differences between the two methods of calculating the pinned dependencies that I've accounted for:

* `pip-compile` outputs normalized package names, whereas `pip freeze` outputs names as specified in the packages
* `pip-compile` outputs setuptools and pip into the pinned requirements file, if they are dependencies of one of the packages we require, whereas `pip freeze` excludes them by default

Options:
* remove the `pip freeze` based method of requirements calculation. I left it in as an escape hatch, but since we can get the output of the `pip-compile` based one to be exactly the same, it seems safe to remove it. If we do this, we should remove support for the `pip_args` marker, as that won't work with `uv` (and it's not currently being used)
* switch more of the tooling to the `uv` version (eg `venv`) to make it faster

Here is a guide to the commits:
You'll probably want to review this PR commit-at-a-time or in stages so you don't get all the line noise from normalizing and updating the lock files in the diff.