`lock create` with "large" set of dependencies spends 95+% of time in sequential `pip download` #2036

huonw · 2023-01-12T05:11:21Z

In our (pants-using) repo, we have 128 dependencies, which leads to ./pants generate-lockfiles taking a long time, due to the pex3 lock create invocation. It seems this invocation is slow because pip download ... checks https://pypi.org/simple/... individually, for each package, sequentially (pypa/pip#825).

I notice pex takes a --jobs argument to parallelise some things, but there's only one pip download invocation, so there's currently no parallelism. Can pex parallelise this? (Potentially only with certain options, like --intransitive?)

I've created an example a set of 88 requirements, reduced from the top 100 downloaded packages on PyPI (from https://hugovk.github.io/top-pypi-packages/, not that it really matters):

click for requirements.txt

aiobotocore==2.4.2
aiohttp==3.8.3
aioitertools==0.11.0
aiosignal==1.3.1
asn1crypto==1.5.1
async-timeout==4.0.2
attrs==22.2.0
azure-core==1.26.2
azure-storage-blob==12.14.1
beautifulsoup4==4.11.1
botocore==1.27.59
cachetools==5.2.1
certifi==2022.12.7
cffi==1.15.1
chardet==5.1.0
charset-normalizer==2.1.1
click==8.1.3
colorama==0.4.6
cryptography==39.0.0
decorator==5.1.1
distlib==0.3.6
docutils==0.19
et-xmlfile==1.1.0
exceptiongroup==1.1.0
filelock==3.9.0
Flask==2.2.2
frozenlist==1.3.3
fsspec==2022.11.0
greenlet==2.0.1
idna==3.4
importlib-metadata==6.0.0
importlib-resources==5.10.2
iniconfig==2.0.0
isodate==0.6.1
itsdangerous==2.1.2
Jinja2==3.1.2
jmespath==1.0.1
jsonschema==4.17.3
MarkupSafe==2.1.1
msrest==0.7.1
multidict==6.0.4
numpy==1.24.1
oauthlib==3.2.2
openpyxl==3.0.10
packaging==23.0
pandas==1.5.2
Pillow==9.4.0
platformdirs==2.6.2
pluggy==1.0.0
proto-plus==1.22.2
protobuf==4.21.12
psutil==5.9.4
psycopg2-binary==2.9.5
pyarrow==10.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.21
pydantic==1.10.4
Pygments==2.14.0
PyJWT==2.6.0
pyOpenSSL==23.0.0
pyparsing==3.0.9
pyrsistent==0.19.3
pytest==7.2.0
python-dateutil==2.8.2
pytz==2022.7
PyYAML==6.0
requests==2.28.1
requests-oauthlib==1.3.1
requests-toolbelt==0.10.1
rsa==4.9
s3fs==2022.11.0
s3transfer==0.6.0
scipy==1.10.0
six==1.16.0
soupsieve==2.3.2.post1
SQLAlchemy==1.4.46
tabulate==0.9.0
tomli==2.0.1
tqdm==4.64.1
typing_extensions==4.4.0
urllib3==1.26.14
virtualenv==20.17.1
websocket-client==1.4.2
Werkzeug==2.2.2
wrapt==1.14.1
yarl==1.8.2
zipp==3.11.0

I added a little bit of extra logging of pip invocations (#2035), and then ran:

PEX_VERBOSE=3 pex3 lock create -r /tmp/frozen-requirements.txt --out=/tmp/x.json --preserve-pip-download-log |& ts -s '%.s'

Output (with my spaces and commentary in # CAPS)

0.047973 pex: Installed VendorImporter(root='/Users/huon/projects/pantsbuild/pex/venv/lib/python3.9/site-packages', importables=(_Importable(module='attrs', is_pkg=True, path='pex/vendor/_vendored/attrs', prefix='pex.third_party'), _Importable(module='attr', is_pkg=True, path='pex/vendor/_vendored/attrs', prefix='pex.third_party'), _Importable(module='pyparsing', is_pkg=False, path='pex/vendor/_vendored/packaging_21_3', prefix='pex.third_party'), _Importable(module='packaging', is_pkg=True, path='pex/vendor/_vendored/packaging_21_3', prefix='pex.third_party'), _Importable(module='toml', is_pkg=True, path='pex/vendor/_vendored/toml', prefix='pex.third_party'), _Importable(module='pip', is_pkg=True, path='pex/vendor/_vendored/pip', prefix='pex.third_party'), _Importable(module='easy_install', is_pkg=False, path='pex/vendor/_vendored/setuptools', prefix='pex.third_party'), _Importable(module='setuptools', is_pkg=True, path='pex/vendor/_vendored/setuptools', prefix='pex.third_party'), _Importable(module='pkg_resources', is_pkg=True, path='pex/vendor/_vendored/setuptools', prefix='pex.third_party'), _Importable(module='wheel', is_pkg=True, path='pex/vendor/_vendored/wheel', prefix='pex.third_party')))
...
0.285217 pex: Isolating pex: 0.0ms
0.285820 pex: Preserving `pip download` log at /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/pex-pip-log.g3g7s6vy/pip.log

# PIP DOWNLOAD INVOCATION
0.286779 pex: Executing: ... /Users/huon/.pex/venvs/c223a0474ff758eec891c9e8e183ce978ffa6f14/547bc0ed92f40a50fbeafded4a247435b07c2533/bin/python -sE /Users/huon/.pex/venvs/c223a0474ff758eec891c9e8e183ce978ffa6f14/547bc0ed92f40a50fbeafded4a247435b07c2533/pex --disable-pip-version-check --no-python-version-warning --exists-action a --no-input --use-deprecated legacy-resolver --isolated -v --cache-dir /Users/huon/.pex/pip_cache --log /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/pex-pip-log.g3g7s6vy/pip.log download --dest /private/var/folders/sv/vd266m4d4lvctgs2wpnhjs9w0000gn/T/tmpexr56764/Users.huon.projects.pantsbuild.pex.venv.bin.python --requirement /tmp/frozen-requirements.txt --index-url https://pypi.org/simple --retries 5 --timeout 15

14.054938 pex: Resolving for:
14.054998   /Users/huon/projects/pantsbuild/pex/venv/bin/python: 13859.4ms
...

# (ALL ARTEFACTS ARE CACHED)
14.355525 pex: Using cached artifact at /Users/huon/.pex/downloads/90b77e79eaa3eba6de819a0c442c0b4ceefc341a7a2ab77d7562bf49f425c5c2 for FileArtifact(url='https://files.pythonhosted.org/packages/fc/34/3030de6f1370931b9dbb4dad48f6ab1015ab1d32447850b9fc94e60097be/idna-3.4-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='90b77e79eaa3eba6de819a0c442c0b4ceefc341a7a2ab77d7562bf49f425c5c2'), verified=False, filename='idna-3.4-py3-none-any.whl')
...
14.360755 pex: Using cached artifact at /Users/huon/.pex/downloads/961d03dc3453ebbc59dbdea9e4e11c5651520a876d0f4db161e8674aae935da9 for FileArtifact(url='https://files.pythonhosted.org/packages/36/7a/87837f39d0296e723bb9b62bbb257d0355c7f6128853c78955f57342a56d/python_dateutil-2.8.2-py2.py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='961d03dc3453ebbc59dbdea9e4e11c5651520a876d0f4db161e8674aae935da9'), verified=False, filename='python_dateutil-2.8.2-py2.py3-none-any.whl')
...
14.365894 pex: Using cached artifact at /Users/huon/.pex/downloads/83e9a75d1911279afd89352c68b45348559d1fc0506b054b346651b5e7fee29f for FileArtifact(url='https://files.pythonhosted.org/packages/db/51/a507c856293ab05cdc1db77ff4bc1268ddd39f29e7dc4919aa497f0adbec/charset_normalizer-2.1.1-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='83e9a75d1911279afd89352c68b45348559d1fc0506b054b346651b5e7fee29f'), verified=False, filename='charset_normalizer-2.1.1-py3-none-any.whl')
14.365952 pex: Indexing downloads: 11.7ms

Based on the ts-inserted timestamps, the whole invocation takes ~14.4s, and the pip download invocation takes 13.8.

The pip log seems to contain sequential downloads:

2023-01-12T16:00:09,637 Looking up "https://pypi.org/simple/aiobotocore/" in the cache
2023-01-12T16:00:09,638 Request header has "max_age" as 0, cache bypassed
2023-01-12T16:00:09,638 Starting new HTTPS connection (1): pypi.org:443
2023-01-12T16:00:09,719 https://pypi.org:443 "GET /simple/aiobotocore/ HTTP/1.1" 304 0
...
2023-01-12T16:00:09,776 Looking up "https://pypi.org/simple/aiohttp/" in the cache
2023-01-12T16:00:09,776 Request header has "max_age" as 0, cache bypassed
2023-01-12T16:00:09,817 https://pypi.org:443 "GET /simple/aiohttp/ HTTP/1.1" 304 0
...
2023-01-12T16:00:10,247 https://pypi.org:443 "GET /simple/aioitertools/ HTTP/1.1" 304 0
...
2023-01-12T16:00:10,273 https://pypi.org:443 "GET /simple/aiosignal/ HTTP/1.1" 304 0
...
2023-01-12T16:00:22,167 https://pypi.org:443 "GET /simple/zipp/ HTTP/1.1" 304 0

The text was updated successfully, but these errors were encountered:

jsirois · 2023-01-12T05:24:50Z

I see you added logging, but the parallelism structure and limitations are "well known" already ... to those in the know.

With these constraints:

Pex moved away from a bespoke resolver to Pip some 3+ years ago with the express intent of leveraging the industry standard resolver to avoid bugs / meet expectations.

Really, that's it.

With that in mind, PEX does:

A resolve, via pip download
Builds any resolved sdists to wheels via parallel pip wheel jobs.
Installs all wheels in parallel via pip install

The 1st step is not parallelizable and it's also the 1st and ~only step creating a lock file. To parallelize resolves means parallelizing the Pip resolve process itself upstream or else going back to bespoke.

jsirois · 2023-01-12T18:08:05Z

@huonw I'm not sure what you're gunning for here. If it's faster locks after the 1st, there is a known unimplemented solution for that, which is to use the current lock to create a venv and then run the pip download inside that venv. That is a good deal faster than the current fresh isolated pip download used to perform resolves/ locks. There is coding needed to support this. The 1st time resolve / lock though absolutely cannot be sped up except by speeding up Pip / using a new resolver. If the 2nd + locks speed up is of interest to you, that would be good to know. You're not alone and I could prioritize that work for tackling over the next few months.

stuhood · 2023-01-12T20:49:56Z

@huonw I'm not sure what you're gunning for here. If it's faster locks after the 1st, there is a known unimplemented solution for that, which is to use the current lock to create a venv and then run the pip download inside that venv. That is a good deal faster than the current fresh isolated pip download used to perform resolves/ locks. There is coding needed to support this. The 1st time resolve / lock though absolutely cannot be sped up except by speeding up Pip / using a new resolver. If the 2nd + locks speed up is of interest to you, that would be good to know. You're not alone and I could prioritize that work for tackling over the next few months.

Would a pex3 lock update be any faster, because of the constraints that it uses?

jsirois · 2023-01-12T23:37:17Z

Let's just find out. @huonw can you provide a real before / after dual set of input requirements?

jsirois · 2023-01-13T16:29:37Z

Ok, I contrived an example. In short, lock update is exactly the same speed as lock re-creation given the same PEX_ROOT cache baseline; so no cheap win there:

Given:
requirements.txt
requirements2.txt

That diff is:

$ diff -u requirements.txt requirements2.txt
--- requirements.txt    2023-01-13 08:16:34.195292265 -0800
+++ requirements2.txt   2023-01-13 08:18:43.755243397 -0800
@@ -14,7 +14,7 @@
 cffi==1.15.1
 chardet==5.1.0
 charset-normalizer==2.1.1
-click<9
+click<=8
 colorama==0.4.6
 cryptography==39.0.0
 decorator==5.1.1

N.B.: I have to use a downgrade for this example since pex3 lock update ... updates an existing lock and requires the update conform to the original locks requirements. Since those requirements are all latest already, the only way to get an update is pin lower, still compatible with the initial requirements.

Lock 2x:
No cache initial lock:

$ rm -rf ~/.pex && time pex3 lock create --style universal --resolver-version pip-2020-resolver --pip-version 22.3 --target-system linux --target-system mac --interpreter-constraint "==3.10.*" -r requirements.txt --indent 2 -o lock.json

real    0m56.997s
user    0m19.995s
sys     0m1.179s

Cache from initial lock still laying there when 2nd lock is run:

$ time pex3 lock create --style universal --resolver-version pip-2020-resolver --pip-version 22.3 --target-system linux --target-system mac --interpreter-constraint "==3.10.*" -r requirements2.txt --indent 2 -o lock2.json

real    0m22.893s
user    0m16.522s
sys     0m0.448s

Lock update:
Initial lock again:

$ rm -rf ~/.pex && time pex3 lock create --style universal --resolver-version pip-2020-resolver --pip-version 22.3 --target-system linux --target-system mac --interpreter-constraint "==3.10.*" -r requirements.txt --indent 2 -o lock.json

real    0m56.608s
user    0m19.933s
sys     0m1.211s

Lock update instead of re-lock with new requirements:

$ cp lock.json lock2-update.json
$ time pex3 lock update -p "click<=8" --indent 2 lock2-update.json
Updates for lock generated by universal
  Updated click from 8.1.3 to 8

real    0m22.982s
user    0m16.925s
sys     0m0.599s

This adds logging for `pip` invocations, to make it easier to understand how pex is calling `pip`, such as when a single invocation is taking a long time (#2036). This mirrors similar logging for Python invocations: https://github.com/pantsbuild/pex/blob/1f8e25a714e52c45a50a6d6ab2ee7cb9e21a92bb/pex/interpreter.py#L713-L716

jsirois · 2023-01-13T16:48:13Z

Alright, I've labelled this as a question and I'll be aiming to close as answered by the end of the day. That close pends reviving some old tests of incremental resolve (the venv trick mentioned above) to see how fast that is / what work is involved in implementing it for real.

huonw · 2023-01-16T01:45:45Z

Thanks for waiting over the weekend, and for providing all that context. (And, of course, thank you for maintaining PEX!)

In summary, feel free to close this as "won't fix" (and/or an upstream issue). I understand the design constraints better now!

I see you added logging, but the parallelism structure and limitations are "well known" already ... to those in the know.

I was not in the know, and I imagine many others also are not. Thanks for merging the logging, hopefully that'll save the next person in a similar situation to me the few hours I took to narrow down what was going on.

Pex moved away from a bespoke resolver to Pip some 3+ years ago with the express intent of leveraging the industry standard resolver to avoid bugs / meet expectations.

Yep, that makes sense. That's key to my understanding.

In this case, there's an 'extra' bonus of the input requirements being the full transitive set of deps, locked to exact versions, so, to a naive rube like me, there's a chance no resolution is required, just finding the artefacts, which seems like it could be done in parallel, hence filing an issue (although I maybe didn't emphasise that extra constraint as much as I could've). But, I totally understand not adding special code paths for niche use-cases, given the discussion here.

(We're in this circumstance in our repo: we have pants/pex create a lock file from a set of requirements exported from poetry, because poetry gives us better control around adding deps without upgrading the world, e.g. pantsbuild/pants#15704.)

If the 2nd + locks speed up is of interest to you, that would be good to know. You're not alone and I could prioritize that work for tackling over the next few months.

Yeah, we're almost always adding/removing/adjusting dependencies within an existing lockfile, rather than creating new lockfiles.

That said, for us, even more important than optimisations in PEX is being able to do away with Poetry and use only pants+PEX (using both Poetry and PEX means we pay for two resolves when changing any deps, so in addition to being more convenient, using just PEX will be 2+x faster for us, automatically). As mentioned above, one blocker for this is pantsbuild/pants#15704, so that we can upgrade existing dependencies and add new ones, without upgrading everything else as a side effect. In addition (just so that you're not over indexing on the value of this discussion), we're only likely to benefit from optimisations if they're hooked into Pant's generate-lockfiles goal.

Let's just find out. @huonw can you provide a real before / after dual set of input requirements?

The most recent security update we did was bumping certifi==2021.10.8 to the version in the issue description. That is, requirements-after.txt is the list in the issue, and requirements-before.txt is taking that file and applying:

--- requirements-after.txt	2023-01-16 12:19:08.000000000 +1100
+++ requirements-before.txt	2023-01-16 12:19:24.000000000 +1100
@@ -12,3 +12,3 @@
 cachetools==5.2.1
-certifi==2022.12.7
+certifi==2021.10.8
 cffi==1.15.1

Running the same commands as you, gives similar results, for doing a full lock create:

rm -rf ~/.pex && time pex3 lock create --style universal --resolver-version pip-2020-resolver --pip-version 22.3 --target-system linux --target-system mac --interpreter-constraint "==3.10.*" -r requirements-before.txt --indent 2 -o lock.jsoon
#> 21.77s user 5.99s system 50% cpu 55.085 total

time pex3 lock create --style universal --resolver-version pip-2020-resolver --pip-version 22.3 --target-system linux --target-system mac --interpreter-constraint "==3.10.*" -r requirements-after.txt --indent 2 -o lock2.json
#> 13.33s user 1.37s system 83% cpu 17.594 total

As I think you were implying in your note, doing this sort of upgrade via pex3 lock update seems to be rejected, because the old and new requirement on certifi are incompatible.

rm -rf ~/.pex && time pex3 lock create --style universal --resolver-version pip-2020-resolver --pip-version 22.3 --target-system linux --target-system mac --interpreter-constraint "==3.10.*" -r requirements-before.txt --indent 2 -o lock.json 
#> 21.87s user 6.15s system 53% cpu 52.577 total

time pex3 lock update -p "certifi==2022.12.7" --indent 2 lock.json
#> ERROR: Cannot install certifi==2021.10.8 because these package versions have conflicting dependencies

But, that's probably not related to the performance question here. That said, if I'm understanding this correctly, we'd preferably still want to be able to do an "minimal update" when we replace constraints. For example, a concrete instance of this that we might be doing in the near future would be upgrading sqlalchemy>=1.4,<2 to sqlalchemy>=2,<3 (https://www.sqlalchemy.org/blog/2023/01/09/sqlalchemy-2.0.0rc2-released/).

jsirois · 2023-01-16T02:03:40Z

Ok, it sounds like ~all your problems are with Pants at the moment and you'll benefit from incremental PEX locking with the venv trick when it comes.

As far as update not working with pins, that's a bit of an artificial pickle you're in using poetry + Pants / Pex. Presumably, if just using Poetry or just using Pants / Pex you would not use all input pins and let locking do its job (for example allowing patch or minor to float high in your input requirements in many cases for well-behaved libraries). In that case, pex3 lock update -p ... I think does what you want in general, but if not an issue could be made for making update less strict to handle cases like your sqlalchemy upgrade. The current semantics are invented by me, basically update an existing lock staying faithful to its input parameters.

jsirois · 2023-01-16T02:13:27Z

In this case, there's an 'extra' bonus of the input requirements being the full transitive set of deps, locked to exact versions, so, to a naive rube like me, there's a chance no resolution is required, just finding the artefacts, which seems like it could be done in parallel

Yeah, a bit naive. For one there would need to be a signal (flag) coming from Pants to Pex to say --intransitive. Pants does not do that today. Even with that (it passes through to pip download --no-deps), Pip is - surprisingly - roughly no faster; so this special case would need to be hand-optimized by Pex to run multiple downloads in parallel and then merge the pip.log scrapes it does to get lock data. I would not block that contribution, but its narrow enough a use case that I won't be getting around to it any time soon.

jsirois · 2023-01-16T02:16:42Z

I'll still leave this open until I finish off here with the incremental resolve venv trick numbers. I went as far as to realize the trick was not, in fact, create a venv from the existing lock, then run pip download in that venv. That's no faster! Its create a venv from the existing lock, run pip install in that venv, extract the diff. That is faster, but different enough from how things are done today to require some good chunk of work last I looked.

Timings coming ~tomorrow.

jsirois · 2023-01-17T18:57:14Z

Ok, and the incremental timings. The raw work:

$ rm -rf ~/.pex && time pex3 lock create --style universal --resolver-version pip-2020-resolver --pip-version 22.3 --target-system linux --target-system mac --interpreter-constraint "==3.10.*" -r requirements.txt --indent 2 -o lock.json

real    0m57.794s
user    0m19.949s
sys     0m1.187s
$ time pex --lock lock.json --include-tools -o lock.pex --no-compress

real    0m10.991s
user    0m56.785s
sys     0m5.688s
$ time pex --lock lock.json --include-tools -o lock.pex --no-compress

real    0m2.159s
user    0m1.682s
sys     0m0.481s
$ time PEX_TOOLS=1 ./lock.pex venv incremental.venv --pip

real    0m3.778s
user    0m3.362s
sys     0m0.390s
$ time ./incremental.venv/bin/pip install -r requirements2.txt --log=pip.log
Requirement already satisfied: aiobotocore==2.4.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 1)) (2.4.2)
Requirement already satisfied: aiohttp==3.8.3 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 2)) (3.8.3)
Requirement already satisfied: aioitertools==0.11.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 3)) (0.11.0)
Requirement already satisfied: aiosignal==1.3.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 4)) (1.3.1)
Requirement already satisfied: asn1crypto==1.5.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 5)) (1.5.1)
Requirement already satisfied: async-timeout==4.0.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 6)) (4.0.2)
Requirement already satisfied: attrs==22.2.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 7)) (22.2.0)
Requirement already satisfied: azure-core==1.26.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 8)) (1.26.2)
Requirement already satisfied: azure-storage-blob==12.14.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 9)) (12.14.1)
Requirement already satisfied: beautifulsoup4==4.11.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 10)) (4.11.1)
Requirement already satisfied: botocore==1.27.59 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 11)) (1.27.59)
Requirement already satisfied: cachetools==5.2.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 12)) (5.2.1)
Requirement already satisfied: certifi==2022.12.7 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 13)) (2022.12.7)
Requirement already satisfied: cffi==1.15.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 14)) (1.15.1)
Requirement already satisfied: chardet==5.1.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 15)) (5.1.0)
Requirement already satisfied: charset-normalizer==2.1.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 16)) (2.1.1)
Collecting click<=8
  Downloading click-8.0.0-py3-none-any.whl (96 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.9/96.9 KB 2.0 MB/s eta 0:00:00
Requirement already satisfied: colorama==0.4.6 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 18)) (0.4.6)
Requirement already satisfied: cryptography==39.0.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 19)) (39.0.0)
Requirement already satisfied: decorator==5.1.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 20)) (5.1.1)
Requirement already satisfied: distlib==0.3.6 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 21)) (0.3.6)
Requirement already satisfied: docutils==0.19 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 22)) (0.19)
Requirement already satisfied: et-xmlfile==1.1.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 23)) (1.1.0)
Requirement already satisfied: exceptiongroup==1.1.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 24)) (1.1.0)
Requirement already satisfied: filelock==3.9.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 25)) (3.9.0)
Requirement already satisfied: Flask==2.2.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 26)) (2.2.2)
Requirement already satisfied: frozenlist==1.3.3 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 27)) (1.3.3)
Requirement already satisfied: fsspec==2022.11.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 28)) (2022.11.0)
Requirement already satisfied: greenlet==2.0.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 29)) (2.0.1)
Requirement already satisfied: idna==3.4 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 30)) (3.4)
Requirement already satisfied: importlib-metadata==6.0.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 31)) (6.0.0)
Requirement already satisfied: importlib-resources==5.10.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 32)) (5.10.2)
Requirement already satisfied: iniconfig==2.0.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 33)) (2.0.0)
Requirement already satisfied: isodate==0.6.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 34)) (0.6.1)
Requirement already satisfied: itsdangerous==2.1.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 35)) (2.1.2)
Requirement already satisfied: Jinja2==3.1.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 36)) (3.1.2)
Requirement already satisfied: jmespath==1.0.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 37)) (1.0.1)
Requirement already satisfied: jsonschema==4.17.3 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 38)) (4.17.3)
Requirement already satisfied: MarkupSafe==2.1.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 39)) (2.1.1)
Requirement already satisfied: msrest==0.7.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 40)) (0.7.1)
Requirement already satisfied: multidict==6.0.4 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 41)) (6.0.4)
Requirement already satisfied: numpy==1.24.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 42)) (1.24.1)
Requirement already satisfied: oauthlib==3.2.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 43)) (3.2.2)
Requirement already satisfied: openpyxl==3.0.10 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 44)) (3.0.10)
Requirement already satisfied: packaging==23.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 45)) (23.0)
Requirement already satisfied: pandas==1.5.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 46)) (1.5.2)
Requirement already satisfied: Pillow==9.4.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 47)) (9.4.0)
Requirement already satisfied: platformdirs==2.6.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 48)) (2.6.2)
Requirement already satisfied: pluggy==1.0.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 49)) (1.0.0)
Requirement already satisfied: proto-plus==1.22.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 50)) (1.22.2)
Requirement already satisfied: protobuf==4.21.12 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 51)) (4.21.12)
Requirement already satisfied: psutil==5.9.4 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 52)) (5.9.4)
Requirement already satisfied: psycopg2-binary==2.9.5 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 53)) (2.9.5)
Requirement already satisfied: pyarrow==10.0.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 54)) (10.0.1)
Requirement already satisfied: pyasn1==0.4.8 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 55)) (0.4.8)
Requirement already satisfied: pyasn1-modules==0.2.8 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 56)) (0.2.8)
Requirement already satisfied: pycparser==2.21 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 57)) (2.21)
Requirement already satisfied: pydantic==1.10.4 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 58)) (1.10.4)
Requirement already satisfied: Pygments==2.14.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 59)) (2.14.0)
Requirement already satisfied: PyJWT==2.6.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 60)) (2.6.0)
Requirement already satisfied: pyOpenSSL==23.0.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 61)) (23.0.0)
Requirement already satisfied: pyparsing==3.0.9 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 62)) (3.0.9)
Requirement already satisfied: pyrsistent==0.19.3 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 63)) (0.19.3)
Requirement already satisfied: pytest==7.2.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 64)) (7.2.0)
Requirement already satisfied: python-dateutil==2.8.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 65)) (2.8.2)
Requirement already satisfied: pytz==2022.7 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 66)) (2022.7)
Requirement already satisfied: PyYAML==6.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 67)) (6.0)
Requirement already satisfied: requests==2.28.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 68)) (2.28.1)
Requirement already satisfied: requests-oauthlib==1.3.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 69)) (1.3.1)
Requirement already satisfied: requests-toolbelt==0.10.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 70)) (0.10.1)
Requirement already satisfied: rsa==4.9 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 71)) (4.9)
Requirement already satisfied: s3fs==2022.11.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 72)) (2022.11.0)
Requirement already satisfied: s3transfer==0.6.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 73)) (0.6.0)
Requirement already satisfied: scipy==1.10.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 74)) (1.10.0)
Requirement already satisfied: six==1.16.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 75)) (1.16.0)
Requirement already satisfied: soupsieve==2.3.2.post1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 76)) (2.3.2.post1)
Requirement already satisfied: SQLAlchemy==1.4.46 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 77)) (1.4.46)
Requirement already satisfied: tabulate==0.9.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 78)) (0.9.0)
Requirement already satisfied: tomli==2.0.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 79)) (2.0.1)
Requirement already satisfied: tqdm==4.64.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 80)) (4.64.1)
Requirement already satisfied: typing_extensions==4.4.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 81)) (4.4.0)
Requirement already satisfied: urllib3==1.26.14 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 82)) (1.26.14)
Requirement already satisfied: virtualenv==20.17.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 83)) (20.17.1)
Requirement already satisfied: websocket-client==1.4.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 84)) (1.4.2)
Requirement already satisfied: Werkzeug==2.2.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 85)) (2.2.2)
Requirement already satisfied: wrapt==1.14.1 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 86)) (1.14.1)
Requirement already satisfied: yarl==1.8.2 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 87)) (1.8.2)
Requirement already satisfied: zipp==3.11.0 in ./incremental.venv/lib/python3.10/site-packages (from -r requirements2.txt (line 88)) (3.11.0)
Installing collected packages: click
  Attempting uninstall: click
    Found existing installation: click 8.1.3
    Uninstalling click-8.1.3:
      Successfully uninstalled click-8.1.3
Successfully installed click-8.0.0

real    0m1.264s
user    0m0.863s
sys     0m0.049s

So that's roughly 8s for the real-world use case:

16.1s (11s + 3.8s + 1.3s): lock created and never used at all (no tests run, packages created etc) - unlikely
7.3s (2.2s + 3.8s + 1.3s): lock created, used some, then updated using incremental scheme.

That's fairly significant and worth pursuing all other things being equal. On quick inspection, the Pip log output from the pip install will work with the existing locker as long as that locker is adjusted to be differential, since the install log just mentions skips and does not iunclude all needed lock data for the locked items that were not updated.
pip.log

I'll close this as answered and link an issue tracking implementation of incremental resolves using the lock -> venv -> pip install trick.

jsirois · 2023-01-19T00:15:10Z

Ok, the parallelization --intransitive resolves is now tracked by #2043.
The implementation of incremental lock resolves is tracked by #2044.

If I end up devoting time to these it will definitely be the incremental lock resolves of #2044 1st. That will both help the --intransitive corner case (though still probably not squeeze all the juice there that #2043 would) and the mainline use case.

huonw · 2023-01-19T00:20:07Z

That sounds good. Thanks for working through this discussion!

huonw mentioned this issue Jan 12, 2023

Log all pip invocations #2035

Merged

jsirois added the question label Jan 13, 2023

jsirois closed this as completed Jan 17, 2023

jsirois mentioned this issue Jan 19, 2023

Implement support for incremental lock resolves. #2044

Open

jsirois added the answered label Jan 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`lock create` with "large" set of dependencies spends 95+% of time in sequential `pip download` #2036

`lock create` with "large" set of dependencies spends 95+% of time in sequential `pip download` #2036

huonw commented Jan 12, 2023 •

edited

Loading

jsirois commented Jan 12, 2023 •

edited

Loading

jsirois commented Jan 12, 2023 •

edited

Loading

stuhood commented Jan 12, 2023

jsirois commented Jan 12, 2023

jsirois commented Jan 13, 2023 •

edited

Loading

jsirois commented Jan 13, 2023

huonw commented Jan 16, 2023

jsirois commented Jan 16, 2023 •

edited

Loading

jsirois commented Jan 16, 2023 •

edited

Loading

jsirois commented Jan 16, 2023

jsirois commented Jan 17, 2023

jsirois commented Jan 19, 2023

huonw commented Jan 19, 2023

lock create with "large" set of dependencies spends 95+% of time in sequential pip download #2036

lock create with "large" set of dependencies spends 95+% of time in sequential pip download #2036

Comments

huonw commented Jan 12, 2023 • edited Loading

jsirois commented Jan 12, 2023 • edited Loading

jsirois commented Jan 12, 2023 • edited Loading

stuhood commented Jan 12, 2023

jsirois commented Jan 12, 2023

jsirois commented Jan 13, 2023 • edited Loading

jsirois commented Jan 13, 2023

huonw commented Jan 16, 2023

jsirois commented Jan 16, 2023 • edited Loading

jsirois commented Jan 16, 2023 • edited Loading

jsirois commented Jan 16, 2023

jsirois commented Jan 17, 2023

jsirois commented Jan 19, 2023

huonw commented Jan 19, 2023

`lock create` with "large" set of dependencies spends 95+% of time in sequential `pip download` #2036

`lock create` with "large" set of dependencies spends 95+% of time in sequential `pip download` #2036

huonw commented Jan 12, 2023 •

edited

Loading

jsirois commented Jan 12, 2023 •

edited

Loading

jsirois commented Jan 12, 2023 •

edited

Loading

jsirois commented Jan 13, 2023 •

edited

Loading

jsirois commented Jan 16, 2023 •

edited

Loading

jsirois commented Jan 16, 2023 •

edited

Loading