multiple pip-compile runs with multiple python versions -> single requirements.txt #1326

Closed

RuRo opened this issue Feb 16, 2021 · 39 comments

Labels: docs (Documentation related), feature (Request for a new feature), help wanted (Request help from the community)

Comments

@RuRo

RuRo commented Feb 16, 2021

What's the problem this feature will solve?

I have a single requirements.in file and would like to generate a single requirements.txt file that works with multiple python versions.
Afaik, currently, the only way to solve this issue is to run pip-compile N times - once per python version.
This, however, leaves you with N different requirements.txt files with mutually unsynchronized versions.

Disclaimer:

I am aware of the limitations of pip/PyPI dependency resolution (#635, #639).
I am not asking pip-tools to guess the dependencies for different python versions.

I am willing to run pip-compile for each python version that I want to support.
What I want is a way to aggregate the results of these multiple runs of pip-compile.

Example problem

Let's say that my requirements.in contains just required-package (a hypothetical package that doesn't actually exist).

required-package resolves differently under different versions of python:

  • python3.6 has required-package 1.0.0, which requires packages a and b
  • python3.7 has required-package 1.0.0, which requires packages a and c
  • python3.8 has required-package 1.0.0, which requires just the package c;
    python3.8 also has a newer version, required-package 2.0.0, which is only available on python >= 3.8 and requires just the package d

For simplicity, let's assume that all other packages don't have any further dependencies and have only one version, 1.0.0.

If I understand it correctly, currently the only thing I can do is create 3 requirements.txt files, one for each python version:

# python3.6 -m piptools compile requirements.in --no-header --no-annotate --output-file=requirements36.txt
required-package==1.0.0
a==1.0.0
b==1.0.0
# python3.7 -m piptools compile requirements.in --no-header --no-annotate --output-file=requirements37.txt
required-package==1.0.0
a==1.0.0
c==1.0.0
# python3.8 -m piptools compile requirements.in --no-header --no-annotate --output-file=requirements38.txt
required-package==2.0.0
d==1.0.0

There are 2 big problems with the above result:

  1. 3.6 and 3.7 require different packages b and c
  2. 3.8 has a different version of required-package

Desired result

Ideally, I would like to have a common.requirements.txt along the lines of

required-package==1.0.0
a==1.0.0
b==1.0.0; python_version < "3.7"
c==1.0.0; python_version >= "3.7"

More realistically, I would be OK with something like

required-package==1.0.0
a==1.0.0
b==1.0.0; python_version == "3.6"
c==1.0.0; python_version == "3.7" or python_version == "3.8"

or

required-package==1.0.0
a==1.0.0
b==1.0.0; python_version not in "3.7,3.8"
c==1.0.0; python_version in "3.7,3.8"

or in the worst case scenario

required-package==1.0.0; python_version in "3.6,3.7,3.8"
a==1.0.0; python_version in "3.6,3.7,3.8"
b==1.0.0; python_version in "3.6"
c==1.0.0; python_version in "3.7,3.8"

Solution

Maybe I am missing something and there is an obvious solution to this problem, but I wasn't able to get this kind of result with the current pip-tools functionality. Please correct me if I'm wrong.

I propose adding a new --add-environment-markers option.
The behaviour of --add-environment-markers is as follows:

  1. Remember the old contents of output-file
  2. Compile the requirements like you normally would
  3. After the compilation is finished, compare the old and the new requirements
    • instead of deleting a requirement from the old requirements.txt,
      add ; python_version not in "${current_python_version}" to it
    • when adding a new requirement,
      add ; python_version in "${current_python_version}" to it

This option can then be used something like this:

python3.6 -m piptools compile requirements.in --output-file=requirements.txt
python3.7 -m piptools compile requirements.in --output-file=requirements.txt --add-environment-markers
python3.8 -m piptools compile requirements.in --output-file=requirements.txt --add-environment-markers

The first line runs pip-compile in python3.6 and generates (default behaviour)

required-package==1.0.0
a==1.0.0
b==1.0.0

The second line runs pip-compile in python3.7. b==1.0.0 would have been deleted and c==1.0.0 would have been added. Due to the new --add-environment-markers flag, it adds environment markers instead:

required-package==1.0.0
a==1.0.0
b==1.0.0; python_version not in "3.7"
c==1.0.0; python_version in "3.7"

The third line runs pip-compile in python3.8 and, going through the same motions as the previous command, generates (assuming the in/not in operators are merged)

required-package==1.0.0
a==1.0.0
b==1.0.0; python_version not in "3.7,3.8"
c==1.0.0; python_version in "3.7,3.8"

Note that required-package is not updated to 2.0.0 (this is the current behaviour anyway).

If you run all 3 commands with --add-environment-markers instead of just the last 2, you will get the following result:

required-package==1.0.0; python_version in "3.6,3.7,3.8"
a==1.0.0; python_version in "3.6,3.7,3.8"
b==1.0.0; python_version in "3.6" and python_version not in "3.7,3.8"
c==1.0.0; python_version in "3.7,3.8"

(output-file is empty before the first command, so every requirement is "added")

You could also potentially add parameters to --add-environment-markers to specify which environment markers should be added in step 3, e.g. --add-environment-markers=python_version or --add-environment-markers=python_version,os_name.
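
To make the proposed diff-and-annotate step (point 3 above) a bit more concrete, here is a minimal Python sketch of it. This is only an illustration of the idea, not how pip-tools would have to implement it: it assumes plain name==version lines and ignores pre-existing markers, hashes, and the later merging of the in/not in operators across runs.

import platform

def merge_with_markers(old_lines, new_lines, py=None):
    """Annotate the diff between the previous and the freshly compiled pins
    with python_version markers for the python version *py*."""
    py = py or ".".join(platform.python_version_tuple()[:2])  # e.g. "3.7"
    old, new = set(old_lines), set(new_lines)
    merged = []
    for line in old_lines:
        if line in new:
            merged.append(line)  # pin unchanged, keep it as-is
        else:
            # pin would have been deleted -> exclude it on this python instead
            merged.append(f'{line}; python_version not in "{py}"')
    for line in new_lines:
        if line not in old:
            # pin is new on this python -> limit it to this python
            merged.append(f'{line}; python_version in "{py}"')
    return merged

# Re-running the example under python3.7 on top of the python3.6 result:
print("\n".join(merge_with_markers(
    ["required-package==1.0.0", "a==1.0.0", "b==1.0.0"],  # old (3.6) output
    ["required-package==1.0.0", "a==1.0.0", "c==1.0.0"],  # fresh 3.7 compile
    py="3.7",
)))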

@AndydeCleyre
Contributor

You may be interested to look at this script I submitted for another issue, and the rest of that discussion.

The idea of the script, platform-compile.sh, is:

  • 1 requirements.in
  • run the script on each platform (or python version)
  • each run generates a <platform-name>-requirements.txt
  • 1 requirements.txt manually created; something like:
-r py2.7-linux2-x86_64-requirements.txt
-r py3.8-linux-x86_64-requirements.txt

Then, either pip install -r requirements.txt or pip-sync requirements.txt should work on any of those platforms.

If --base <reqs.txt> is passed, then that existing txt overwrites the platform-targeted txt before pip-compile is run, so that if its contents already satisfy the platform's requirements they should not be changed unnecessarily.


platform-compile.sh:

#!/bin/dash -e
# platform-compile.sh [--base <basetxt>] [<reqsin>] [<pip-compile-arg>...]

python_version="$(python -c 'from __future__ import print_function; import platform; print(*platform.python_version_tuple()[:2], sep=".")')"
sys_platform="$(python -c 'from __future__ import print_function; import sys; print(sys.platform)')"
machine="$(python -c 'from __future__ import print_function; import platform; print(platform.machine())')"

if [ "$1" = '--base' ]; then
    base="$2"
    shift 2
else
    unset base
fi

if [ -r "$1" ]; then
    reqsin="$1"
    shift
else
    reqsin="requirements.in"
fi

txt="py${python_version}-${sys_platform}-${machine}-$(printf '%s' "${reqsin}" | sed 's/\.in$//').txt"
markers="; python_version ~= '${python_version}' and sys_platform == '${sys_platform}' and platform_machine == '${machine}'"

if [ "$base" ] && [ "$base" != "$txt" ]; then
    cp "$base" "$txt"
fi

pip-compile --no-header "$reqsin" -o "$txt" "$@"
# append the python/platform markers to every pinned line that does not already carry a marker
sed -i -E "s/(^[^;]+==[^;#]+)(#|$)/\1${markers}  \2/g" "$txt"

@RuRo
Author

RuRo commented Feb 19, 2021

I've seen that issue. Maybe I should have left a comment there instead of opening a new issue, but I think that attempting to merge pre-generated requirements.txt files from different environments is not a viable approach, since you can't dynamically resolve conflicts if the requirements.txt files are already generated at that point. So I decided to open a separate issue, describing my proposal.

Your script is really close, but doesn't quite solve the problem (I think?).

Without the --base option, your script just produces N separate requirements.txt files with no version synchronization. I understand that since every line has an exclusive marker you can just concatenate them, but that's kind of pointless, since the requirements will be different for each environment. At that point you are back to square one and might as well ship N different requirements.txt files.

If you meant to use the --base option along the lines of

PYTHON=3.6 ./platform-compile.sh
PYTHON=3.7 ./platform-compile.sh --base py3.6-linux-x86_64-requirements.txt
PYTHON=3.8 ./platform-compile.sh --base py3.7-linux-x86_64-requirements.txt

Then this almost does what I want. Originally, I wanted to keep all "unanimous" requirements (that are the same in all environments) without any markers, so that pip install -r merged.requirements.txt installed a reasonable set of dependencies even in an "unsupported" environment. Though now that I think about it, this behaviour is unlikely to actually be useful. I'll have to think about this later.

P.S. In any case, I think that either your script or my proposal could/should ideally be implemented in pip-tools directly. Parsing requirement files with string replacements feels really hacky (for example, your script breaks with --generate-hashes and conflicts with other pre-existing environment markers).

@ssbarnea
Member

ssbarnea commented Apr 7, 2021

I am quite interested in the subject, as I have observed a growing number of pip installation issues on older platforms, caused by the fact that more and more libraries are dropping support for older pythons. Even in cases where pip knows about python_requires it may end up downloading a huge number of releases until it finds one that works.

Maybe we can sort this with another approach: a script that updates requirements.in and adds python_version conditions for each library that dropped support for older pythons, practically helping pip pick the right versions.

Example:

# requirements.in
foo>=2.0

# after running the magic script
foo>=2.0,<3.0; python_version=="2.7"
foo>=2.0; python_version>"2.7"

That may not be very easy, since the minimum supported version can change with each python release. Many packages have already dropped py35 too.

I would not bother about platforms or other conditions; I would let the user sort those out manually.
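
For illustration only (this is not an existing pip-tools feature): one place such a script could read the "dropped support for older pythons" information from is the requires-python metadata that PyPI exposes through its JSON API. A rough sketch, with error handling omitted and the helper name made up:

import json
import urllib.request

def requires_python_of_latest(package):
    """Return the requires-python declaration of the newest release of
    *package*, as reported by the PyPI JSON API (e.g. ">=3.6"), or None."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["info"].get("requires_python")

# The "magic script" could compare this against the pythons it wants to
# support and emit a marked pair like the foo>=2.0 example above.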

ssbarnea added the feature (Request for a new feature) label on Jun 23, 2021
@ssbarnea
Member

How about adding a pip-combine command that receives a list of requirement files and produces a single one? With 5.2.0 we also include the python version used to generate them. That feature would really be awesome.

ssbarnea added the help wanted (Request help from the community) label on Jun 23, 2021
@RuRo
Author

RuRo commented Jun 23, 2021

How about adding a pip-combine command that receives a list of requirement files and produces a single one? With 5.2.0 we also include the python version used to generate them. That feature would really be awesome.

See my previous comment for why I don't like the "combine requirements after" solution.

How would your proposed approach combine the following requirements:

  • Generated in python 3.7:
    a==1.0.0
    b==1.0.0
    
  • Generated in python 3.8:
    a==3.1.4
    c==1.0.0
    

Notice the different a version and that b was replaced with c.

@ssbarnea
Member

That is because you assumed the intermediate files do not have an implicit/invisible marker, but they would. If you assume that each input entry has an implicit marker limiting it to a specific python version, you do not end up having this problem.

All that remains after this is to compress the joined result:

# uncompressed combined output:
foo>=2.0; python_version=="3.6"
foo>=2.0; python_version=="3.7"
foo>=2.0; python_version=="3.8"

# compressed
foo>=2.0; python_version>="3.6" # or similar

There is a trick here, as the logic for compression would need to do something like:

  • determine known_pythons (basically one for each input file)
  • if requirement is identical for all known_python versions, add it without marker
  • if current python is the latest known, assume >= instead of == (we do not want to prevent use of py310 even if we never tested it)

It may not be very easy to achieve, but on the other hand the compression logic does not have to be perfect either; if we fail to produce minimal/optimal compiled versions, it is not really a big deal. I am more worried about not wrongly adding python_version constraints when they were not effectively specified.
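
A rough sketch of that compression logic, assuming the per-version pins have already been parsed into dicts. It only applies the three rules above and does not coalesce neighbouring == markers into ranges; the function name and data shape are made up for the example:

def compress(per_python_pins):
    """Combine per-interpreter pin sets into a single marked-up list.

    per_python_pins maps a python version string ("3.6", "3.7", ...) to a
    dict of {package: pinned_version} taken from that interpreter's txt.
    """
    known_pythons = sorted(per_python_pins, key=lambda v: tuple(map(int, v.split("."))))
    latest = known_pythons[-1]
    all_packages = sorted({p for pins in per_python_pins.values() for p in pins})
    lines = []
    for pkg in all_packages:
        versions = {py: per_python_pins[py].get(pkg) for py in known_pythons}
        if len(set(versions.values())) == 1 and versions[latest] is not None:
            # identical for all known pythons: no marker needed
            lines.append(f"{pkg}=={versions[latest]}")
            continue
        for py in known_pythons:
            if versions[py] is None:
                continue  # not required on this python at all
            # keep the newest known python open-ended, pin the others exactly
            op = ">=" if py == latest else "=="
            lines.append(f'{pkg}=={versions[py]}; python_version {op} "{py}"')
    return lines

print("\n".join(compress({"3.6": {"foo": "2.1"}, "3.7": {"foo": "2.1"}, "3.8": {"foo": "2.1"}})))
# -> foo==2.1   (identical everywhere, so no marker)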

@RuRo
Author

RuRo commented Jun 23, 2021

That is because you assumed the intermediate files do not have an implicit/invisible marker, but they would.

No, that was not my point. Please feel free to add the python_version markers to the example I provided.

The problem is that the individual requirements.txt would contain the already pinned package versions and there is no valid way to produce a coherent pinned merged requirements list after the fact.

The only "solution" to this problem is to keep different pinned versions in that case, but this kind of defeats the point of version pinning and is barely any different from just having N different requirements files for each python version.

If you think that your approach solves this issue, then please walk me through how the requirements in my example would be combined.

@henryiii

Having one file is still drastically better than having multiple files: it's easier to pass around and to use, you no longer need to inject the Python version into the filename, etc.

Your example at the top is very specific. You have:

# python3.8 -m piptools compile requirements.in --no-header --no-annotate --output-file=requirements38.txt
required-package==2.0.0
d==1.0.0

But you want 1.0.0 instead. Why would I want required-package 1.0.0 if there's a 2.0.0 and I have not requested <2, just because I support an old version of Python that it does not? If you don't want the latest version of that, you need to limit it manually; if you do, you get the same result with a normal combine. This is a very small optimization with a very large cost. And it's not really related to the output file, but rather the solve - I think those should be kept separate if possible. It's also not at all what I'd expect going in. If I ask for the latest versions of everything, I should get the latest versions of everything. I think you want a special solving mode.

I also can't think of what this optimizes; I guess you save a package download in the simple case that required-package is pure Python (and thus is likely too small to matter anyway)?

The rules from @ssbarnea seem to cover things quite well.

@RuRo
Author

RuRo commented Jun 23, 2021

just because I support an old version of Python that it does not?

Yes. Exactly because you support a version of Python that doesn't have the newer package version. That's what "supporting a version" means to me. Just like you can't use the newest python stdlib/language features if you support the older python versions, you can't use unsupported packages. That's the whole point.

If I ask for the latest versions of everything, I should get the latest versions of everything.

You should get the latest version of everything that satisfies the given constraints. These constraints include the list of requirements.in, but also the current python version, OS, architecture and the previous contents of the compiled requirements.txt.

I think you want a special solving mode.

I want a way to produce a single pinned requirements.txt that works with multiple python versions with minimal environment differences. I am convinced that this requires a special "solving"/pip-compile mode, which is exactly what I propose in the original post.

If you don't think that the special solve is required, convince me that it is not needed. Either that or open your own issue/proposal. This might sound a bit rude, but consider that I am proposing this feature to solve my specific problem, and so far your solution doesn't seem to solve my problem.

I also can't think of what this optimizes;

It's not an optimization, it's a functional requirement. For me, one of the most important details of pip-compile is version pinning. Pinning makes it easier to get reproducible results across environments (CI/CD vs local developer venv vs production). For me, having a single requirements.txt file for different Python versions means extending the existing pip-compile guarantees across different Python versions, and not just smushing unrelated requirements sets together + minifying them.

@henryiii

I don't want PyTest 4 on Python 3.10 if I happen to support Python 2.7. PyTest <6.2 (IIRC) doesn't support Python 3.10, they just started supporting Python 3.10 about a month ago. If I ask for "pytest", I want the latest possible version of PyTest on each version of Python. It is much more likely that newer versions of libraries support new versions of Python. You don't know what future versions of Python you support when they haven't been released yet. A general assumption is that you should never try to use a version of a library older than the Python release date if there's a newer one available. This isn't 100% accurate, but you generally should try to avoid it. This will force old versions of packages onto newer versions of Python. This approach does not scale.

and the previous contents of the compiled requirements.txt

Why? Adding hysteresis to the solve is likely to cause a lot of confusion.

special solve is required, convince me

I'm not trying to convince you that this is not a special solve (though I'd like to convince you it's a really bad idea). My point is that you are asking for a very special solve that tries to minimize version differences (which, as I point out above, is very likely to fail in many real world situations), and if it was added, it should be added as a solve mode, not an output mode. You could use the solve mode to produce a set of requirements.txt's, then you could still merge them with combine. It's best to keep components separate.

Though this mode sounds really hard to implement, and it sounds like it will produce bad situations with old packages on new Pythons and not alert you if you don't actually support the latest versions of things. I don't see any benefit other than maybe saving a pure Python package download once in a while.

reproducible results across environments

If packages drop support for Python versions that you still support, I think it's better to either support both sets of versions or manually pin to say you don't support the newer version. If you implicitly don't support the new version by never solving for it, then why not be explicit? And if you do support it, why not let the solver select it on the environments where it can?

@henryiii

This technically has nothing to do with a single requirements.txt, in other words. Your special solve would work just as well if it still produced multiple requirements-*.txt files that all happened to be very similar.

It should never be the default mode for producing a single requirements.txt!

Another example is for Universal2 support, which is just now rolling out to libraries. If you did an "old" solve, then this work is useless - most of these libraries have dropped <3.6 support, and many of them have dropped <3.7 support. But you need the latest version on 3.8 and 3.9 to support Apple Silicon. Intentionally trying to solve for old versions of libraries basically nullifies all the work that's going into rollouts like this.

@RuRo
Author

RuRo commented Jun 23, 2021

I don't want PyTest 4 on Python 3.10 if I happen to support Python 2.7. PyTest <6.2 (IIRC) doesn't support Python 3.10, they just started supporting Python 3.10 about a month ago.

If no single version of pytest is available for both python 2.7 and 3.10, and yet you have requested a single pinned requirement set that supports these python versions, then the correct result is

ERROR: Could not find a version that satisfies the requirements ...
ERROR: No matching distribution found for pytest ...

(IMO)

you generally should try to avoid it. This will force old versions of packages onto newer versions of Python. This approach does not scale.

It seems to me like you might be confusing environment requirement pinning and regular package (install_requires style) requirements. There is no scaling for requirements.txt pinning. Either there exists a set of pinned requirements that satisfies the constraints, or there doesn't.

and the previous contents of the compiled requirements.txt

Why? Adding hysteresis to the solve is likely to cause a lot of confusion.

Are you asking me why pip-compile takes into account the current contents of requirements.txt? Because it currently does. This is not a part of my proposal, but rather a statement of fact about the current implementation.

I'm not trying to convince you that this is not a special solve

And I never suggested that you did. My point was that, as it currently stands, your proposal does not solve my issue. If your proposal is popular/simple enough and it gets implemented, I would still not consider my issue closed/solved. Therefore (unless you can convince me that your proposal does solve my problem), you should open a separate issue with your proposal to avoid confusion (or continue your discussion in #826).

You could use the solve mode to produce a set of requirements.txt's, then you could still merge them with combine. It's best to keep components separate.

My proposed pip-compile mode relies on having an incrementally compiled common requirements file. If/when your pip-combine proposal gets implemented, I will of course try to reformulate my proposal to use this mechanism for the "combining" part. But until then, I prefer to keep it as it is, since that would further complicate the proposal.

I don't see any benefit other than maybe saving a pure Python package download once in a while.

I have explained why I need this, and it has nothing to do with package downloads.

This technically has nothing to do with a single requirements.txt, in other words.

I don't agree. Here is my point of view:

  1. Currently, pip-compile can generate separate requirements.txt for separate python versions.
  2. Currently, pip-compile provides some guarantees for these separate requirements.txt.
  3. I want to have a single multi-python requirements.txt, that provides the same guarantees as (2) currently does.
  4. This can't be achieved by simply merging the separate requirements.txt after the fact.

Conclusion: to generate a valid pinned multi-python requirements.txt file with the same guarantees as (2), you need a modified pip-compile stage.


I would also like to point out that, as far as I can tell, pip-combine doesn't actually need to know anything about the pip-tools internals and can be implemented as a relatively simple standalone script (along the lines of what AndydeCleyre proposed). Also, see #826 for an issue that more closely matches your pip-combine proposal.

@henryiii

henryiii commented Jun 23, 2021

It seems to me like you might be confusing environment requirement pinning and regular package (install_requires style) requirements. There is no scaling for requirements.txt pinning. Either there exists a set of pinned requirements that satisfies the constraints, or there doesn't.

"pip compile" should update your requirements.txt. It's the point when you say "I want the latest versions of everything". Once you've compiled your file, then it's frozen until the next time, sure, but we are talking about the solve for the update. If you have more requirements in your requirements.in, this scales up and issues caused by using old packages on new Pythons (like adding wheels for newer pythons, supporting newer architectures, etc) are more likely to cause issues.

Are you asking me why pip-compile takes into account the current contents of requirements.txt

The printed message, sure, but I believe the solve is unaffected by requirements.txt if requirements.in is supplied.

If no single version of pytest is available for both python 2.7 and 3.10

How would you know that? Your proposal relies on the authors of PyTest, in 2018/2019, knowing that the next couple of Python releases would be fine, but that Python 3.10 would be the version that breaks PyTest 4. You should not depend on upper limits being known - they usually can't be when the package is released. And adding upper limits as a guess is a very bad practice - and always limiting to the current version of Python only will simply reduce you back to the same solve as before - as soon as there's a dropped Python version, the solve will not complete.

Version solving relies on the lower bound being correct, but not the upper bound. Your proposal requires accurate knowledge of the upper bound at the time the package was released, which is unavailable in the Python packaging system. Packages can't be modified after the fact, and even if they could, no volunteer is going to go back and change all old releases as soon as they find a package update that requires a change.

single pinned requirement set that supports these python versions, then the correct result is

Then why not just use the minimum version of Python supported with Pip compile, and use that requirements.txt for everything? If it doesn't support the latest version of Python, then there wasn't a way to solve it anyway. Compile will include all wheel hashes, so you can even use this in hashed mode on newer Pythons if they have wheels (which is unlikely on older versions, so you'll likely have to compile the SDist, which probably will break, but that's not something fixed by the above solve). Sure, there will be cases where this will break (such as needing a new requirement on a newer Python - not common, but it happens), but due to the issues above, this is already bad enough.

If you are an application developer and want to only support required-package==1.*, then pin it in your requirements.in. I don't think there should be a solve allowing you to ignore 2.0 just because the latest version of Python you support doesn't support 2.0 - the same is very likely true of 1.x, it's likely going to break at some point on newer Pythons, and no one is maintaining it to add an upper limit, and even if they did, it would simply pick an earlier version of 1.x that didn't have the limit but did have the problem. And if every version had the limit (bad), then the solve wouldn't work.

Compile is not about API. It's about updating all packages to the latest versions supported. You are responsible to limit packages if you need a specific API.

@RuRo
Author

RuRo commented Jun 23, 2021

The printed message, sure, but I believe the solve is unaffected by requirements.txt if requirements.in is supplied.

> cat requirements.in
pytest
> cat requirements.txt
#
# This file is autogenerated by pip-compile with python 3.9
# To update, run:
#
#    pip-compile --output-file=requirements.txt requirements.in
#
attrs==21.2.0
    # via pytest
more-itertools==8.8.0
    # via pytest
packaging==20.9
    # via pytest
pluggy==0.13.1
    # via pytest
py==1.10.0
    # via pytest
pyparsing==2.4.7
    # via packaging
pytest==5.4.3
    # via -r requirements.in
wcwidth==0.2.5
    # via pytest

Notice that pytest is pinned to 5.4.3 in requirements.txt. Then running

> pip-compile --output-file=requirements.txt requirements.in
> cat requirements.txt
#
# This file is autogenerated by pip-compile with python 3.9
# To update, run:
#
#    pip-compile --output-file=requirements.txt requirements.in
#
attrs==21.2.0
    # via pytest
more-itertools==8.8.0
    # via pytest
packaging==20.9
    # via pytest
pluggy==0.13.1
    # via pytest
py==1.10.0
    # via pytest
pyparsing==2.4.7
    # via packaging
pytest==5.4.3
    # via -r requirements.in
wcwidth==0.2.5
    # via pytest

Doesn't update pytest. But if I delete requirements.txt, or run the command with -U/--upgrade or -P pytest/--upgrade-package pytest, only then will pytest be pinned to pytest==6.2.4.


If no single version of pytest is available for both python 2.7 and 3.10

How would you know that?

Running pip-compile with python 2.7 will resolve pytest to the latest version that supports python 2.7 (say pytest==X.Y.Z). Then running pip-compile --add-environment-markers in python 3.10 will see that pytest is required, but the pinned X.Y.Z version is not available in python 3.10. This will fail the compilation like it currently would with non-satisfiable versions in requirements.in.

You should not depend on upper limits being known - they usually can't be when the package is released.

Yes, upper limits are a pain. But if you've run into a situation where an upper limit actually leads to issues, then you have already lost at that point, since you are attempting to support a python version range which isn't supported by any single package version.

At that point, you either have to drop support for some python version, or use different package versions for different python versions and lose reproducibility across python versions.

Your proposal requires accurate knowledge of the upper bound at the time the package was released, which is unavailable in the Python packaging system.

I don't think it does. My proposal (just like the current pip-compile implementation) deals only with known version constraints (both lower and upper). If there are any constraints which are not explicitly declared by the package authors, then pip-compile (and pip, for that matter) just tries its best.

I really think that this issue already exists in pip and pip-compile even without my proposal, and a solution for it is well outside the scope of this proposal.


Then why not just use the minimum version of Python supported with Pip compile, and use that requirements.txt for everything?

Because there are backport packages which are

  1. not needed in newer python versions
  2. not available in newer python versions

For example, some python packages require dataclasses only for python_version<3.7, but the dataclasses package itself is only available for python>=3.6, <3.7, because python 3.7 already has native dataclasses. So attempting to pip install -r requirements.txt in python>=3.7, for a requirements.txt file that was generated in python<3.7, will fail with No matching distribution found for dataclasses.
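
In a combined file, the backport would instead sit behind a marker, so pip on 3.7+ simply skips that line; something along these lines (the exact pin is only illustrative):

dataclasses==0.8; python_version < "3.7"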


Compile is not about API. It's about updating all packages to the latest versions supported.

True, compile is not about API, but it's also not about updating all packages to the latest versions supported. If it was about updating, then pip-compile --output-file=requirements.txt requirements.in would bump the pytest version in the above example.

Compile is about managing stable, reproducible environments and version pinning.

@henryiii

henryiii commented Jun 23, 2021

pip-compile --output-file=requirements.txt requirements.in

This didn't do anything at all - the requirements.txt already satisfied the requirements.in. I do see where you are thinking about a "change only what is required" mode, though.

but the pinned X.Y.Z version is not available in python 3.10.

But it is available. It just doesn't work. There are no upper limits like there are lower limits, and if there are, they can't be trusted / used in the normal way.

But if you've run into a situation where an upper limit actually leads to issues, then you have already lost at that point

This will eventually hit you if you are trying to use old versions of packages on newer versions of Python. The last version of a package to support 2.7 probably does not support 3.10, etc. Trying to tie your solve to the oldest version supported is likely not going to work for any number of realistic packages. For example, NumPy 1.16 was the last 2.7 release, and it doesn't come with wheels for anything newer than 3.7. So using 1.16 + 3.8 will try to compile from the SDist, which will break if you don't have a compiler installed, and will create buggy binaries on macOS due to the built-in accelerate module, etc. At some point (3.10 or maybe 3.9 or even 3.8) it just won't work, period.

This also goes for Universal2 binaries, Arm binaries, PyPy binaries, and any other roll out. Those can't be expected to happen on all older versions of Python, usually they target newer versions of Python only. Apple Silicon doesn't even support Python 3-3.7.

I really think that this issue already exists in pip and pip-compile even without my proposal, and a solution for it is well outside the scope of this proposal.

Pip and pip-compile never try to solve for old versions. If you ask for an update, you get an update. If you have an existing file, it was created with the latest versions according to the current version of Python. You want to force newer versions of Python to solve with requirements generated for older versions of Python, which is what I'm against.

@henryiii

requirements.in:

numpy

Currently, there is no way to pip-compile this file and get something broken. 2.7 will give you numpy 1.16, 3.8 will give you numpy 1.21. If you have an Apple Silicon machine or an Arm Linux machine, you will get working wheels.

With your method, you would get 1.16 even on 3.8. There would be no wheels, you would do a compile from SDist, and you would either get buggy results or a failed compile.

The downside is that you now have to write your program assuming NumPy 1.16-1.21, depending on Python version. But you are choosing to support old versions, which is causing the range to be required. It very likely doesn't affect you, new versions mostly just add stuff you will have to avoid - but if you really want to pin, you can set your requirements.in to:

requirements.in:

numpy==1.16.*

Now you can be assured that you will always have numpy 1.16, and program for that. But you also will not be able to support newer Python and architectures properly, because they don't have support / wheels, but you explicitly asked for that.

@RuRo
Author

RuRo commented Jun 23, 2021

then you have already lost at that point

This will eventually hit you if you are trying to use old versions of packages on newer versions of Python.

What I'm trying to say is that there is obviously no way to resolve this kind of problem automatically. With or without my proposed option. If you ever run into such a problem, you just have to either pin the correct version by hand (in the requirements.in file), or ask the maintainer to release a fix/backport for an unsupported python version. Yeah, it's a pain, but there's really not much you can do in such cases anyway.

Also, I personally operate within the official python version EOL support timeframes. So my intended use case is mainly for python versions >=3.6,<=3.10. And python>3 seems to have fairly good back- and forward-compatibility (in terms of both package versions and python versions).


requirements.in:

numpy

I would argue that

  1. You "asked" for a single numpy version across a range of python versions, some of which are no longer supported by numpy, and you got exactly what you asked for.

  2. IMO, if you wanted different numpy versions for different python versions, then you could have asked for something like

    requirements.in:

    numpy==1.16.*; python_version<3.8
    numpy>=1.21; python_version>=3.8
    

    The behaviour of pre-existing environment markers in .in files is not specified in my original proposal, because I personally don't currently need this functionality, so the exact syntax/behaviour for it is up for discussion.


you want to force newer versions of Python to solve with requirements generated for older versions of Python, which is what I'm against

Technically, my proposal doesn't enforce the order. You can run it in reverse order (newer python versions first, then older versions) and it will fail on the first required package which no longer supports the older python versions. But that's beside the point.

Okay, then how about this:

  • Have 2 versions of this flag --add-environment-markers=soft and --add-environment-markers=hard (WIP names)
  • Both of them behave like I described, but hard will fail if it can't satisfy the pre-existing requirement, while soft will try to resolve to a different version and add </>= environment markers to each.

With this modification,

  • I would be able to run pip-compile --add-environment-markers=hard in order from the oldest python version to the newest to get my preferred behaviour and
  • you would be able to run pip-compile --add-environment-markers=soft in order from the newest python version to the oldest one

So for example, in your numpy case you would run

  1. python3.8 -m piptools compile --add-environment-markers=soft --output-file=requirements.txt requirements.in
    

    And get numpy==the.latest.one

  2. python3.7 -m piptools compile --add-environment-markers=soft --output-file=requirements.txt requirements.in
    

    And still get the same numpy (assuming the.latest.one supports 3.7).

  3. ...

  4. python2.7 -m piptools compile --add-environment-markers=soft --output-file=requirements.txt requirements.in
    

    At this point, compile will see that the the.latest.one version is not available for the current python version, but a different the.old.one version is available and will split the requirements.txt to be something like

    numpy==the.latest.one; python_version>2.7
    numpy==the.old.one; python_version<=2.7
    

@henryiii

henryiii commented Jun 23, 2021

If you ever run into such a problem

Currently, it's not a problem, because setting a minimum Python version doesn't break things - a tremendous amount of work has gone into making it really easy to say "just increase your requires-python setting, and old users get old versions". This is completely opposite this work!

Ask the maintainer to release a fix/backport for an unsupported python version

That's the whole point of all the work for requires-python! Maintainers should be able to drop Python versions and everything works. And it does, except for this one proposal. Nowhere else does pip or pip-compile solve newer versions of Python using older versions of Python.

Also, I personally operate within the official python version EOL support timeframes

This is a proposal for a personal version of pip-compile, then? Why not fork? See NEP 29 for the scientific packages' support policy. Manylinux is often 6 months or so after the official deadline. MyPy is just now starting to talk about dropping 3.5 for running. And Python 3.6 was wildly popular; it's the default in RHEL/CentOS 7 AND 8. RHEL/CentOS 7 is supported until 2024. You can't just assume that users will always operate in a place where all versions are supported. That's sort of what this feature was originally supposed to be for - detecting the start/end of versions so that it can specify dependencies that only apply for certain Python versions.

add-environment-markers

I don't like this being tied to environment markers at all; that's the job for a pip-combine, aka #826. What's wrong with this:

$ python3.7 -m piptools compile --output-file=requirements37.txt requirements.in
$ cp requirements37.txt requirements38.txt
$ python3.8 -m piptools compile --output-file=requirements38.txt requirements.in
$ python3.8 -m piptools combine --output-file=requirements.txt requirements37.txt requirements38.txt

Assuming there's a combine script in the future (and you are not worried about putting them in one file anyway - that's a different issue; you are worried about keeping as consistent a solve as possible), wouldn't this work? The versions are not updated if they already solve, as you have already pointed out.

@RuRo
Author

RuRo commented Jun 23, 2021

This is a proposal for a personal version of pip-compile, then?

It's clearly not. If you don't like my proposal and you can't have a polite discussion, put a 👎 emoji on my original post and go submit your own. You ignore 90% of what I am saying and just keep pushing your own idea. It's not a bad idea, but as you yourself keep pointing out, its goal is different from my goal. I explained what problem I am trying to solve and how I propose solving it. I also explained why your proposal doesn't satisfy my needs.

You can't just assume that users will always operate in a place where all versions are supported.

I want to point out that I just mentioned my personal use case and that I've never run into the problems you are describing because of the narrower python version window. I didn't say or imply anything about people or projects who still work to support older python versions. Also, pip-tools itself currently only supports python>=3.6, so it's not like I am unreasonably pushing things to the bleeding edge.

This is completely opposite this work!

I am somewhat doubtful that this proposal will single-handedly undermine the effectiveness of the requires-python constraint. I already explained that I don't think this is a problem with my proposal. Also, there are quite a few solutions to this problem, and "go ask the maintainer" was obviously just the last resort if nothing else I recommended worked for you.

Nowhere else does pip or pip-compile solve newer versions of Python using older versions of Python.

You would run into the same problem even if you tried to pin the versions by hand. IMO, the root of evil here is not that my proposal allows older python versions to influence the pins for the newer python versions, but that you are attempting to produce version pins for a multi-python dependency set that is impossible to satisfy.


that's the job for a pip-combine

pip-combine doesn't exist yet. I've already mentioned that if you make a separate proposal for pip-combine and it gets implemented and merged, then I will certainly reformulate my proposal in terms of pip-combine. But as things are right now, I don't want to further complicate my proposal and make it depend on an unrelated proposal that might or might not get implemented.

What's wrong with this

  1. If python3.8 doesn't support some pinned version, it will be silently upgraded instead of producing an error

  2. Without additional modifications of how compile handles environment markers, if python3.8 requires some package a, which is not required by 3.7, its version will be updated to the latest available on each compile rerun instead of keeping the previous pin.

  3. It's 4 commands (+3 per each additional python version), and it's (IMO) hard to reason about what is happening here.
    Compare with "just run the same command for each python version":

    pythonX.Y -m piptools compile --output-file=requirements.txt --add-environment-markers requirements.in

    With a slightly more expressive name, like maybe --add-python-version, this would be even more self-explanatory and could even be included in the header:

    # This file is autogenerated by pip-compile with python 3.6, 3.7, 3.8, 3.9
    # To update, run:
    #
    #    python3.6 -m piptools compile --output-file=requirements.txt --add-python-version requirements.in
    #    python3.7 -m piptools compile --output-file=requirements.txt --add-python-version requirements.in
    #    python3.8 -m piptools compile --output-file=requirements.txt --add-python-version requirements.in
    #    python3.9 -m piptools compile --output-file=requirements.txt --add-python-version requirements.in

@henryiii

I'm sorry, I didn't mean to come off as argumentative. I was just responding to the fact that you were addressing my concern with "I don't need that in my use case". Anything put here should be designed for general use. You should assume large numbers of people see it and use it. If it doesn't scale, it's probably not a good idea.

I'm really worried about the broad impact that this could have if it's implemented and people start using it. As a maintainer of a few (and hopefully a growing number of) Python 3.7+ libraries, I don't want to start getting the exact sort of complaints you describe. As someone involved in rolling out binaries for new platforms, I don't want to see easy ways to ignore the latest version of libraries when a user is actively trying to upgrade to the latest versions. Universal2 will always be 3.8+, etc.

If it's "hidden" by requiring a few simple commands (which can be added to a script), then it's much less worrisome. Maybe adding a "remove unused requirements" flag would address your point 2. Point 1 is harder - but your original proposal didn't have a failure for this, I think, but instead would just have pinned as little of a difference as possible - which is exactly what this would do.

Also, for most users, I think the correct thing to do is to use requirements.in to pin versions they depend on the API explicitly for, and to otherwise not worry about version differences between Python versions. If there is a 'last supported version' on older Pythons, just using that will probably work for 90% of the use cases. You still can't skip testing on multiple versions, etc. A backport often has a different name than the builtin library, etc.

You have two distinct wishes. You want an as-similar-as-possible pinning mode, which you can do today following my procedure above (maybe with a new flag for removing unused requirements?). You'll still get multiple files, but they will be as similar as reasonable, which is what you want. The second wish is that you want the files all merged into one - that's not supported by piptools, but has been requested. It's not the same as asking for the similar-as-possible pins. (At least, it does not have to be.)

@henryiii

PS. I have no say in the final decision, design, etc. of this, which is why I'm trying to make my worries known. I think the initial description doesn't make it very clear that you actually want something very specific - initially it looks like another way to do "combine" for multiple versions; I was a bit confused at first, at least. But what you want is minimal version changes across versions of Python, which often is a bad idea, IMO. Doing it in reverse (newest first) helps quite a bit - possibly solving many of my worries. But that also makes it pretty similar to a normal merge.

@ssbarnea
Member

I observed some heated remarks around here, so I will remind you that we should be nice to everyone and accept other opinions, even if we do not agree with them. pip-tools is a multi-purpose tool. Maybe we should put a warning about the danger of hurting yourself if you go outside the documented use-cases ;)

As that bit around multiple python versions does introduce serious complexity, I think that the only reliable way to achieve it is to add a new command, pip-merge, one that takes as input a series of requirements files and compiles/compresses them into a single unified file. By keeping this outside the rest of the implementation we avoid bloating the code and making it impossible to manage. We can decide later whether we can guess the python version for each input file from its comment or whether it can be given as a command line argument.

I know that OpenStack does something similar for managing project-wide constraints. They do not use pip-tools, as the project is older than pip-tools, but there are things to be learnt both ways. Ignore the sorting (not random, btw) used by that file; what matters is that it covers multiple versions of python (look for the markers).

If we do this, it would be up to the tool user to decide if they want to track intermediate files or only the consolidated (merged) file in git. They could even run tox in parallel to update the files before doing the merge.

How we decide to perform the compiling would be an implementation detail, and if we find good reasons for supporting multiple behaviors we can add CLI options to address that.

@slimshreydy

Just wanted to check if there has been some recent activity here! Would love a pip-merge functionality, or some kind of way of making pip-compile cognizant of environment markers.

@ssbarnea
Member

The situation is even more complex: sys_platform conditions are quite common, and you end up building lock files that vary depending on which platform you are running on.

@merwok

merwok commented Nov 17, 2023

Why would a library pin its requirements?

@Jamim

Jamim commented Nov 18, 2023

Why would a library pin its requirements?

If you ask me, it makes no sense to pin library requirements to certain versions. However, even for libraries it makes sense to manage the requirements somehow, and some libraries pin their dev/test requirements. For instance, look at aiosonic.

@mireq

mireq commented Nov 30, 2023

I have written an ugly hack to solve this problem - pip-compile-universal

It's just a stupid script which runs multiple pip-tools compile commands with different python interpreters and then merges the requirements into a single file with version markers.

@webknjaz
Member

@merwok I think there's a disconnect / false impression of what's actually pinned and where.

There's the distribution's runtime dependency declaration (install_requires + extras), which normally represents "what the project depends on to function" (as in a minimum version of something implementing a used feature). These are loose and uncapped. They mustn't be pinned at the dist metadata level, because this is what's used by the dependency resolver when such a project is pulled into some other environment, and more restrictions cause more conflicts. The dist metadata describes abstract dependencies.

And then, there are the concrete versions of dependencies installed in an env where a project is installed. They are derived from the abstract ones. But since a (virtual)env can only have one importable thing of the same name, those concrete dependencies each have one version that we often call a pin.

Pins are concrete dependencies that correspond to a partial subset of the abstract ones.

Back to the environments. When we say pins/lockfiles, we usually mean a way of representing a state of virtualenvs where the project and its deps are installed simultaneously, side-by-side, that aids reproducibility. Said virtualenvs exist in many contexts, for example:

  • a production app deployment environment (also applies to shipping apps along with the entire isolated env and pinned libs)
  • a staging app deployment environment that might differ from production
  • a contributor/dev environment
  • a CI environment
  • an environment for building the docs
  • an environment for invoking test automation tooling
  • build environment (like ephemeral PEP 517 envs that pip and pypa/build make, or the ones that the downstream distributor use) / it's now possible to pin it with pip-tools as well, by the way, via Add options for including build dependencies in compiled output #1681 (pending release)

Some of the above can also have different variations — for running tests against the oldest test deps separately from the newest ones. Or for running the tests under different OS+arch+Python combos.

Each of those envs needs a way to pin the deps separately. For example, Sphinx limitations for building the docs shouldn't cause dependency conflicts (or even be present) in the env for running the tests. Or a linter version that has incompatible transitive deps shouldn't be present in a docs env. And so on.

Hence, when using pip constraint files, we really describe the virtualenvs and not the project/dist/package itself. It is true that some of those pins are derived from the dist metadata for the runtime deps. But there may be additional sources of dependencies for different use cases that aren't.

In my opinion, the confusion manifests itself when people adopt a misleading mental model, incorrectly assuming that lockfiles describe the project, while really they describe the envs that typically have that project installed (except for some dev/maintainer envs that may exist in a project).

To make matters worse, a portion of people think that it's acceptable to use extras as the source for provisioning those envs. This is a problem because of the semantics of extras — they are runtime deps for optional features (these are also loose and uncapped). The extras are connected to the project's runtime. And they end up in the dist metadata, accessible by the end-users. So the end-users are able to do things like pip install your-project[dev,docs,test], which perfectly demonstrates that it's your public API. Also, this reinforces the false link between the dist metadata spec and the virtualenvs.

So to answer the question,

Why would a library pin its requirements?

a library-type project wouldn't pin its runtime dependencies, but would pin the dependencies installed in the environments used by it during development/maintenance and other contexts (it shouldn't use extras for this, though). And an app-type project may want to ship an extra with pins for the loose runtime deps.
We need to stop associating the lockfiles with the project as if there's always a 1:1 relationship (there's not), and instead start thinking about the 1:1 relationship between the lockfiles and virtualenvs.

I'd like to acknowledge that in the examples above I focused on separating the lockfiles, while this issue is asking to combine them. I know that some people would find it acceptable to combine the lockfiles. And if it works for them — that's probably fine.
Though, I'd like to explicitly list some cases I know of where it may be problematic:

  • different contexts having conflicting deps (something I mentioned above — linters/docs/tests);
  • different build environment pins across all projects being installed into an env and their individual build envs (quite recently, while using PIP_CONSTRAINT for build reproducibility, I faced a conflict with a runtime dependency PyYAML that capped the Cython versions below the version that we pinned/required for our wheel build env: [confusion report] CIBW_ENVIRONMENT seems to leak into the test stage pypa/cibuildwheel#1666) — this is rather evident when some bits of the dependency tree are built from sdists;
  • different runtimes for the same environment category (think different versions of the same transitive dep required to test under Python 2.7 on Windows vs Python 3.12 on macOS);
  • likely something else that slipped my mind too.

The point is that the more of those envs/contexts are clashed together, the greater the chance that the "combined" lockfile will have conflicting entries, making it completely defunct.
Due to the reasons listed above, I ended up deciding that I'm better off with a matrix of separate per-env lockfiles: #826 (comment).

Now, I also wanted to address that “abuse of extras” problem. While it's technically possible to use them, they always depend on the project itself, which depends on its runtime deps that, in turn, bring more transitive deps into the env. This is another undesirable side effect, beside using a public API for a very tiny subset of the end-users.

Currently, I like having a .in file for each distinct environment category and a matrix of constraint files per env (like tests.in + tox-py311-cp311-linux-x86_64.txt, tox-py38-cp38-win32-amd64.txt, tox-pypy37-pp37-darwin-x86_64.txt). This requires some tooling, though, to support selecting and auto-injecting the correct constraint file names into pip install invocations.
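
For what it's worth, a tiny helper in that spirit could derive the constraint file name from the running interpreter and inject it into a pip install invocation via -c. This is only a sketch of one possible reading of the naming scheme above; the helper functions are hypothetical and error handling is omitted:

import platform
import subprocess
import sys

def constraint_file(category):
    """Build a per-environment constraint file name roughly matching the
    scheme above, e.g. "tox-py311-cp311-linux-x86_64.txt"."""
    major, minor = platform.python_version_tuple()[:2]
    impl = "cp" if platform.python_implementation() == "CPython" else "pp"
    return (f"{category}-py{major}{minor}-{impl}{major}{minor}"
            f"-{sys.platform}-{platform.machine().lower()}.txt")

def install(reqs_in, category):
    """Run pip install for *reqs_in* constrained by the matching lock file."""
    subprocess.run(
        [sys.executable, "-m", "pip", "install", "-r", reqs_in,
         "-c", constraint_file(category)],
        check=True,
    )

print(constraint_file("tox"))  # e.g. tox-py311-cp311-linux-x86_64.txt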

Over time, I think PEP 735 may take over the function of declaring the input files for different environments/contexts. One of the advantages over the extras is that they won't be a public API exposed to anybody getting a dist from PyPI — it'll describe the direct deps of different environments. Also, the dependency groups will allow not depending on the project runtime deps unconditionally.
Hopefully, pip-tools will learn to consume them and, in time, will be able to produce the constraint files from them, just like from other sources.

@webknjaz
Member

webknjaz commented Nov 30, 2023

Duplicate of #826

(It was suggested to leave one of the duplicate issues open some time ago @ #826 (comment), but I missed it somehow; Keeping the older one, come over there)

webknjaz marked this as a duplicate of #826 on Nov 30, 2023
webknjaz closed this as not planned on Nov 30, 2023
webknjaz added the docs (Documentation related) label on Dec 1, 2023