WiP: Incremental subsetting of PEX lock #14923
Conversation
Open question: Should we do this as well for the "other" cases (non-PEX lockfiles)?
```python
@dataclass(frozen=True)
class PexReqs:
```
This is all hairy, and I apologize. But I think that the result as it stands might be a bit off. AFAICT, the workflows look like:
- If we are using the new optimization:
- Don't actually build a requirements-only PEX, and instead hold onto the requirements which would have been used
- When actually creating a PEX containing user code, embed the thirdparty requirements
- If we are not using the optimization:
- Build a requirements-only PEX, and put it on the pex-path of another PEX containing user code.
The differentiation between "a requirements-only PEX" (which contains only thirdparty requirements) and the PEX for user code was intentional (although how important it is in practice is unclear), because thirdparty requirements and your code's transitive thirdparty deps change much less frequently than anything else.
So I think that you'll actually want to preserve that aspect, by adjusting the optimization to always create a `RequirementsPex`, but to do so directly from the `Lockfile`, rather than from a "repository PEX". That should (AFAICT?) avoid the need for the new type at all, because the existing `RequirementsPex` creation code can be adjusted directly.
What's the benefit of building any PEXs at all with the new optimization? AFAICT building PEXs here puts us in the same boat as before w.r.t. maybe wanting `run_against_entire_lockfile`, because for each unique subset of 3rdparty deps we'll be building a PEX, and that's non-ideal.
Unless you mean for goals like `package`? I'm pretty sure that code has no different behavior with this PR (although we should probably change that to do bullet point 1, subpoint ii. Today it might be PEXing the whole repo)
Sorry, really confused by this comment 😅
> Don't actually build a requirements-only PEX, and instead hold onto the requirements which would have been used
> ...
> Build a requirements-only PEX, and put it on the pex-path of another PEX containing user code.
That's what this PR is already doing
> What's the benefit of building any PEXs at all with the new optimization? AFAICT building PEXs here puts us in the same boat as before w.r.t. maybe wanting `run_against_entire_lockfile`, because for each unique subset of 3rdparty deps we'll be building a PEX and that's non-ideal.
It's not possible to run (a PEX) without building a PEX currently. So in the end, a PEX is built, and then a venv is built from it to actually execute. The primary benefit of building the PEX is that it can be cached: i.e., if you lint or test twice in a row with no code changes, we'll hit the cache for the construction of the PEX, and the only thing that is invalidated is running the built PEX.
Building a requirements/third-party PEX independently means that as long as thirdparty requirements have not changed, you can hit the cache for the network-accessing/thirdparty portion of your build. Without that behavior, changes to either first party or third party code will go through PEX's resolution logic.
To see the difference (and again: note in my comment that "how important it is in practice is unclear"), you'd want to compare `main` and this branch for a case like: "I ran a test once, made a non-import edit to my test code, and then re-ran". On `main`, that edit won't actually affect any requirements, and so won't need to hit the network / run resolution logic / consume the lockfile at all; on this branch, it will re-run the PEX resolve. Warm runs like this are an important case for local latency, and we need to account for them (even at the expense of CI / cold performance sometimes).
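The caching argument above can be illustrated with a toy model (plain Python, not Pants or PEX internals; all names here are hypothetical): the expensive, network-accessing resolve is keyed only on the third-party requirements, so first-party edits leave it cached.

```python
# Toy model of why a separate third-party requirements PEX helps warm runs:
# the "resolve" step is memoized on the requirements alone, so editing
# first-party code does not invalidate it.
from functools import lru_cache

RESOLVE_CALLS = 0

@lru_cache(maxsize=None)
def build_requirements_pex(requirements: frozenset) -> str:
    """Stand-in for the expensive, network-accessing resolve step."""
    global RESOLVE_CALLS
    RESOLVE_CALLS += 1
    return f"requirements.pex({sorted(requirements)})"

def run_tests(first_party_sources: str, requirements: frozenset) -> str:
    # The user-code PEX layers on top of the (cached) requirements PEX.
    reqs_pex = build_requirements_pex(requirements)
    return f"run {first_party_sources} against {reqs_pex}"

reqs = frozenset({"pytest==7.0", "mxnet==1.9"})
run_tests("tests_v1/", reqs)  # cold run: the resolve executes
run_tests("tests_v2/", reqs)  # warm run after a non-import edit: cache hit
print(RESOLVE_CALLS)          # → 1
```

Resolving directly from the full lockfile on every run, by contrast, would re-enter the resolve path whenever the user-code input changes.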
Sorry, really confused by this comment 😅
Yea, sorry: it's definitely confusing.
> Don't actually build a requirements-only PEX, and instead hold onto the requirements which would have been used
> ...
> Build a requirements-only PEX, and put it on the pex-path of another PEX containing user code.
>
> That's what this PR is already doing
Understood: I was trying to explain the behavior difference between `main` and this branch.
From my perspective, by far the most relevant optimization is skipping the creation of the "repository PEX" (which contains the entire lockfile)... it's not clear to me that adjusting the behavior where we currently create a thirdparty-requirements PEX is necessarily a win.
> On `main`, that edit won't actually affect any requirements, and so won't need to hit the network / run resolution logic / consume the lockfile at all: on this branch, it will re-run the PEX resolve.
I'm not seeing that behavior.
```
joshuacannon@CEPHANDRIUS:~/work/techlabs$ ./pants_from_sources test path/to/a/test_file.py
19:28:19.52 [INFO] Completed: Building pytest.pex from 3rdparty/python/lockfiles/pytest.txt
19:28:19.92 [INFO] Completed: Building local_dists.pex
19:28:21.69 [INFO] Completed: Building pytest_runner.pex from 3rdparty/python/lockfiles/repository.lock
19:28:22.14 [WARN] Failed to generate JUnit XML data for path/to/a/test_file.py.
19:28:22.14 [ERROR] Completed: Run Pytest - path/to/a/test_file.py failed (exit code 1).
```
Then I touch that file and sprinkle in newlines:
```
joshuacannon@CEPHANDRIUS:~/work/techlabs$ ./pants_from_sources test path/to/a/test_file.py
19:31:27.77 [WARN] Failed to generate JUnit XML data for path/to/a/test_file.py.
19:31:27.77 [ERROR] Completed: Run Pytest - path/to/a/test_file.py failed (exit code 1).
```
I can share the workunit JSON for proof too.
I think maybe the disconnect is we ARE still capturing the third-party reqs separately from first-party user code. `pytest_runner.pex` is a VenvPEX we're building which won't contain the user code. Same for `pylint` and `mypy`.
Let me profile the `reqs.pex` way to see if my fear about exploding cache or perf is valid. Then we can compare implementations / side effects.
Looks like both solutions have similar characteristics due to PEXs hardlinking/symlinking. So I think the difference between the two is now in the noise, and we might want to switch gears to what we imagine the future state is, and which option best puts us in that direction.
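The hardlinking point can be seen with a small standalone demonstration (plain Python, not PEX itself): a hardlink adds a second name for the same inode, so no wheel bytes are copied regardless of file size.

```python
# Demonstrate why hardlinking avoids copying wheel data: both paths
# reference the same inode, so the "copy" is free.
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    wheel = os.path.join(d, "big_wheel.whl")
    with open(wheel, "wb") as f:
        f.write(b"\x00" * 1024)   # stand-in for a large wheel

    link = os.path.join(d, "venv_copy.whl")
    os.link(wheel, link)          # hardlink: no data is duplicated

    a, b = os.stat(wheel), os.stat(link)
    print(a.st_ino == b.st_ino)   # → True: same underlying data
    print(a.st_nlink)             # → 2: two names, one set of bytes
```

(Symlinks behave similarly for this purpose, but resolve through an extra path indirection rather than sharing an inode directly.)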
Since the `requirements.pex` way doesn't involve touching client code, I'm actually leaning in that direction. Thanks for the push in the right direction @stuhood
```python
chosen_resolve = await Get(
    ChosenPythonResolve, ChosenPythonResolveRequest(request.addresses)
)
lock_path = chosen_resolve.lockfile_path
requirements_file_digest = await Get(
    Digest,
    PathGlobs(
        [lock_path],
        glob_match_error_behavior=GlobMatchErrorBehavior.error,
        description_of_origin=chosen_resolve.description_of_origin,
    ),
)
_digest_contents = await Get(DigestContents, Digest, requirements_file_digest)
lock_bytes = _digest_contents[0].content
```
This is currently duplicated in the giant rule in `pex.py` 😞 Should we make a new rule?
```python
@property
def description_of_origin(self) -> str:
    return f"the resolve `{self.name}` (from `[python].resolves`)"
```
There are probably other places this can be leveraged.
Thanks a lot for doing this! I think that we should probably preserve the "create a `RequirementsPex` independent of the user code" behavior, but it seems like that might actually simplify things a bit and avoid the need for a new type.
```python
# NB: A blank req_strings means install the entire lockfile
req_strings=FrozenOrderedSet([]),
```
This would be good to document on the `Lockfile` class, rather than (or in addition to) here. We haven't done a great job with docstrings on these internal APIs, but this in particular could be surprising.
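A minimal sketch of what that docstring might look like (hypothetical field names and shapes, not copied from Pants; `tuple` stands in for `FrozenOrderedSet` to keep the example self-contained):

```python
# Hypothetical sketch of documenting the empty-req_strings semantics on
# the class itself, rather than at each call site.
from dataclasses import dataclass

@dataclass(frozen=True)
class Lockfile:
    """A reference to a resolve's lockfile.

    NB: For PEX lockfiles, an empty `req_strings` means "install the
    entire lockfile", not "install nothing" -- callers that want a
    subset must pass the requirement strings explicitly.
    """
    url: str
    req_strings: tuple = ()

lock = Lockfile(url="3rdparty/python/lockfiles/repository.lock")
print(lock.req_strings == ())  # → True: whole-lockfile install
```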
Well, it's only true if the Lockfile is a PEX lockfile AFAICT. 😵
```python
@dataclass(frozen=True)
class PexReqs:
    requirements: Lockfile | PexRequirements
    pexes: Iterable[Pex]
```
Most iterables would not be frozen and hashable, so consider using `tuple` here instead.
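A quick standalone illustration of the point (hypothetical class names): frozen dataclasses derive `__hash__` from their field values, so a `tuple` field hashes fine while a mutable iterable like `list` raises at hash time.

```python
# Why `tuple` over a bare Iterable on a frozen dataclass: hashing falls
# through to the field values, and mutable containers are unhashable.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class WithTuple:
    pexes: Tuple[str, ...]

@dataclass(frozen=True)
class WithList:
    pexes: list

print(isinstance(hash(WithTuple(("a.pex", "b.pex"))), int))  # hashable
try:
    hash(WithList(["a.pex"]))
except TypeError:
    print("list field is unhashable")
```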
```python
if python_setup.run_against_entire_lockfile:
    # NB: PEX treats no requirements as "install entire lockfile"
    req_strings: FrozenOrderedSet[str] = FrozenOrderedSet()
    # @TODO: complain deprecated
```
I expect that we should wait at least one release to see whether users who are already using this setting are actually able to stop before we suggest that anyone does.
```python
    request: PexReqsRequest,
    python_setup: PythonSetup,
) -> PexReqs:
    if python_setup.enable_resolves:
```
Given the size of this if-block, consider inverting the condition and having the other branch be the early return.
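A generic sketch of that refactor (all names hypothetical, not the PR's actual rule body): the short branch becomes an early return, so the long branch reads top-to-bottom without extra nesting.

```python
# Before: the long resolve-handling branch is nested under the `if`.
def pex_reqs_nested(enable_resolves: bool) -> str:
    if enable_resolves:
        # ... imagine many lines of resolve-handling logic here ...
        return "subset-of-lockfile"
    return "legacy-requirements"

# After: invert the condition; the short branch returns early and the
# long branch is unindented.
def pex_reqs_early_return(enable_resolves: bool) -> str:
    if not enable_resolves:
        return "legacy-requirements"
    # ... the long branch now reads without nesting ...
    return "subset-of-lockfile"
```

Both versions behave identically; the win is purely readability of the dominant code path.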
See #14923 (comment). I think I'll make this a rule.
Successor: #14944
(Right now this PR targets 2.11.x; when it's ready I'll re-target `main`.)

Background

This PR has 2 facets:

- Going from building a `requirements.pex` (or `repository.pex`) to just passing the requirements to PEX when building the final "tool" venv PEX (generally `tool_runner.pex`).
- Passing the lockfile to PEX (via `--lock`) so PEX doesn't need pip to resolve anything.

The result of this is that:

- Whether you use `[python].run_against_entire_lockfile` or not, it will always be more space and time performant this way (the thing to remember here is PEX will use symlinks (if told) or hardlinks to ensure we aren't copying the giant wheels around).
- `--changed` now will only request the dependencies needed for the operation, and will only download what isn't already cached.

Metrics
On 2.11.x (96d5178) vs. this PR (+ pex-tool/pex#1694)
For cold cache, I run the following commands to warm up.

Notes:
- `[pex].venv_use_symlinks = true`
Full lint

This case models a CI run, which might be on a completely cold machine (case in point: this repo's GHA). It also models upgrades to Pants' version where the PEX args differ, as well as updates to the lockfile, both of which invalidate the cache.

`time ./pants_from_sources lint ::`

On 2.11.x, installing `repository.pex` took 8m24s and acts like a giant tentpole (#11303 and #14886 are very relevant).

Incremental (`--changed`) lint
This case models developer runs.
(Since I'm not changing any files in my branch, I'll just ask to lint a single file. Then ask to lint a completely different file to model the "incremental" nature)
`time ./pants_from_sources lint path/to/a_test_file.py`

followed by

`time ./pants_from_sources lint different-path/to/a_different_file.py`
First file:

Second file:

(note this file has a transitive dependency on `mxnet`, which is very large)