pex_binary sandbox: Traversing dep graph, stopping after package targets #19155
This might be related to #15855 (and may even fix it?). |
As is, this won't fix any issues, but it does lay the groundwork for such fixes. Hopefully those fixes will be as simple as replacing I suspect additional code changes will be required, however, because this changes an assumption that has been around for a while. So there are likely to be other bits of code that need to be adjusted. Using this will be more complex for Python distributions in particular, because of how dist "ownership" of sources is calculated. Hopefully that is not the case for pex binaries. |
Ooh! It looks like the PEX binaries case will be just that simple! pants/src/python/pants/backend/python/util_rules/pex_from_targets.py Lines 540 to 543 in a963b6d
pants/src/python/pants/backend/python/goals/package_pex_binary.py Lines 144 to 148 in a963b6d
I think there's likely a handful of places this would be extremely useful, and we may already be doing something similar. |
…ackagesRequest Next commit adds it to fix it.
…ingPackagesRequest
3c7e4bb to cf71d2f
I have a fix for pex_binaries #15855. I pushed the test to show it fails without the fix, and then I'll push the commit that uses the new
pants/src/python/pants/backend/docker/util_rules/docker_build_context.py Lines 258 to 278 in 5becc13
@@ -538,7 +539,7 @@ async def create_pex_from_targets(
         sources_digests.append(request.additional_sources)
     if request.include_source_files:
         transitive_targets = await Get(
-            TransitiveTargets, TransitiveTargetsRequest(request.addresses)
+            TransitiveTargets, TransitiveTargetsWithoutTraversingPackagesRequest(request.addresses)
This is the change required to fix #15855 (this line, and the rest of the logic behind it).
src/python/pants/backend/python/util_rules/pex_from_targets_test.py
…instead of a rule
Co-authored-by: Joshua Cannon <joshdcannon@gmail.com>
I extracted #19306 from this as well. That PR adds the I might open a separate PR for the |
This predicate solution only helps for explicit deps because, as you pointed out, dep inference does not add packages to the dep graph. For packages that can "own" particular sources--like |
What? The case I described is absolutely ubiquitous. AFAICT it exists in ~100% of the cases where some code depends on a first-party package. |
I feel like I'm misunderstanding something fundamental. When we traverse deps we follow inferred deps, right? And my point is that we will follow them in this case, bypassing the package target entirely. So we will still "see" library B as a "direct" dep of library A, which I thought is what we are trying to avoid (since we intend to pull that code in via the package)? I think I am missing the use case this is intending to fix. |
Ah I see what you're saying now. This whole time I was thinking of runnable binaries (like |
Although to be fair, doing things this way is better than what we have today, albeit not ideal. You'll net fewer dependencies (just not the minimum you might be expecting) |
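A toy illustration of the bypass discussed above, not Pants code: every target name is hypothetical and the graph is a plain dict. It shows that stopping at a package target still leaves the owned source in the closure when an inferred edge points straight at it.

```python
from collections import deque

# Hypothetical graph: lib_a has an explicit dep on the package dist_b
# and an inferred dep on lib_b, the source that dist_b owns.
DEPS = {
    "lib_a": ["dist_b", "lib_b"],
    "dist_b": ["lib_b"],
    "lib_b": [],
}
PACKAGES = frozenset({"dist_b"})


def closure(root, deps, stop_at=frozenset()):
    """Breadth-first closure; targets in `stop_at` are kept but not expanded."""
    seen = [root]
    queue = deque([root])
    while queue:
        current = queue.popleft()
        # The root is always expanded, even if it is itself a package.
        if current in stop_at and current != root:
            continue
        for dep in deps[current]:
            if dep not in seen:
                seen.append(dep)
                queue.append(dep)
    return seen


# With the inferred edge, lib_b is reached without going through dist_b.
with_inferred = closure("lib_a", DEPS, stop_at=PACKAGES)

# With only the explicit edge, dist_b is a leaf and lib_b is not pulled in.
explicit_only = closure("lib_a", {**DEPS, "lib_a": ["dist_b"]}, stop_at=PACKAGES)
```

So the closure shrinks only where the sole path to a source runs through a package target.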
Yes. We follow inferred deps.
Yes and no. Yes we will still My changes cannot replace the
Adding this deps traversal feature turned out to be a very easy win for Ok, here's the rabbit hole on my use case. My key use-case is the nFPM backend which creates system packages (rpm, deb, archlinux, and apk/Alpine). Packaging an
So, I wanted a list of transitive deps from a dep tree walk that does not traverse any package targets. Then I can sort those targets into lists of nFPM packages, other packagable targets, nfpm_content targets, and all other source files. Then I request packages to put in the sandbox with the remaining source files. |
(I made this diagram to help me think through the different pieces of the nfpm dependency tree: https://whimsical.com/pants-deps-U25HpngLy7qD1SDCoLXmdT ) |
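A rough sketch of the sorting step described above, assuming the closure has already been computed without traversing package targets. The `Target` tuple and the `kind` labels are made up for illustration; the real nFPM backend would inspect target types and fields instead.

```python
from typing import Dict, List, NamedTuple

# Hypothetical bucket labels, one per category named in the comment above.
KINDS = ("nfpm_package", "package", "nfpm_content", "source")


class Target(NamedTuple):
    address: str
    kind: str  # hypothetical label, not a Pants field


def sort_targets(targets: List[Target]) -> Dict[str, List[str]]:
    """Group addresses by kind; anything unrecognized counts as a plain source."""
    buckets: Dict[str, List[str]] = {kind: [] for kind in KINDS}
    for target in targets:
        kind = target.kind if target.kind in buckets else "source"
        buckets[kind].append(target.address)
    return buckets


closure = [
    Target("pkg:deb", "nfpm_package"),
    Target("app:bin", "package"),
    Target("pkg:conf", "nfpm_content"),
    Target("app/main.py", "source"),
    Target("docs:readme", "resource"),
]
buckets = sort_targets(closure)
```

The two package buckets would then be built and placed in the sandbox next to the remaining source files.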
Got it, thank you! That is exactly what I was missing. My thinking was too Python-centric. OK, so this makes sense to me now as a feature. And I think the implementation overall makes sense. I can review when this is out of draft. |
…plugins (#19306)

This builds on #19272, adding another `should_traverse_deps_predicate` that stops dependency traversal at any package targets. This is mostly extracted from #19155.

`TraverseIfNotPackageTarget` will be useful whenever a `TransitiveTargetsRequest`, `CoarsenedTargetsRequest`, or `DependenciesRequest` could benefit from treating package targets as leaves.

This PR does not change any `TransitiveTargetsRequest`s because that is probably a change in user-facing behavior (even if it counts as a bugfix) and needs to be documented as such. This PR merely adds the feature, which, on its own, does not impact anything else.

Related:
- #18254
- #17368
- #15855
- #15082

---------

Co-authored-by: Andreas Stenius <andreas.stenius@imanage.com>
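To make "treating package targets as leaves" concrete, here is a minimal standalone sketch of a predicate-driven walk. `StopAtPackages`, `Target`, and `transitive_closure` are hypothetical stand-ins written for this example; they only mimic the idea and are not the Pants classes or rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Target:
    address: str
    is_package: bool = False


@dataclass(frozen=True)
class StopAtPackages:
    """Hypothetical predicate: expand roots always, other packages never."""

    roots: FrozenSet[str]

    def __call__(self, target: Target) -> bool:
        if target.address in self.roots:
            return True
        return not target.is_package


def transitive_closure(
    roots: List[str],
    targets: Dict[str, Target],
    deps: Dict[str, List[str]],
    should_traverse: Callable[[Target], bool],
) -> List[str]:
    """Depth-first walk that records every target but only expands approved ones."""
    ordered: List[str] = []
    seen = set()
    stack = list(reversed(roots))
    while stack:
        address = stack.pop()
        if address in seen:
            continue
        seen.add(address)
        ordered.append(address)
        if should_traverse(targets[address]):
            stack.extend(reversed(deps.get(address, [])))
    return ordered


TARGETS = {
    "app:bin": Target("app:bin", is_package=True),
    "app/main.py": Target("app/main.py"),
    "libs:dist": Target("libs:dist", is_package=True),
    "libs/core.py": Target("libs/core.py"),
}
DEPS = {
    "app:bin": ["app/main.py"],
    "app/main.py": ["libs:dist"],
    "libs:dist": ["libs/core.py"],
}

leaves = transitive_closure(
    ["app:bin"], TARGETS, DEPS, StopAtPackages(frozenset({"app:bin"}))
)
everything = transitive_closure(["app:bin"], TARGETS, DEPS, lambda target: True)
```

The package target itself stays in the result, so a caller can still decide to build it; only its dependencies are skipped.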
This PR only covers the |
@benjyw @stuhood @thejcannon this is out of draft now. :) |
Now that the PR is tiny and does fix the linked bug, I think it's good to go.
I'd love to see this evolve further to continue aligning with user expectations, but we can do that incrementally.
Thanks again for all your hard work here.
(If you don't mind, I'd feel good if one other maintainer approved as well 😉 )
                    roots=request.addresses,
                    union_membership=union_membership,
                ),
            ),
        )
        sources = await Get(PythonSourceFiles, PythonSourceFilesRequest(transitive_targets.closure))
Should we also do the right thing and try and package the package targets for inclusion?
(This is a philosophical question and likely shouldn't be addressed in this PR)
then it's not sources but local dists? which is handled further down, iiuc..
Oh that's even weirder I guess. I'm thinking of other packageables, like a `pex_binary` or an `archive` or some plugin-provided packageable thing.
are you saying to embed other PEXes/archives/etc... inside PEXes?
Sure why not? If someone has declared that dependency (between a Python source and a `pex_binary`) the intent is pretty clear to me... that code depends on that PEX.
This builds on:
- `include_special_cased_deps` flag with `should_traverse_deps_predicate` #19272
- `TraverseIfNotPackageTarget` deps traversal predicate for use in plugins #19306
- `DepsTraversalBehavior(Enum)` and some docstrings #19387

NB: This PR used to include the above PRs, but I split this PR up to make the pieces easier to review.

This PR only updates the `pex_binary` rule to use the new `TraverseIfNotPackageTarget` predicate when requesting the source files that should be included in this pex. Wheels, other pex_binaries, and other package targets are not handled via this code path since those are not Python sources that get included in the pex based on the `include_sources` flag.

Fixes #15855

This fixes #15855 because anything in a `python_distribution` does not need to be included as a source file in the `pex_binary`. Wheels get included via the `LocalDists` rules. In some cases, we might still get more sources in the pex than intended. This might happen if a dependency is inferred between sources, bypassing the logic that looks for wheels that own the relevant things. In any case, this PR provides an improvement and resolves the sample error in #15855.

Related:
- `python_distribution` nets a dependency on all it's files and not just the wheel generated from it. #18254
- `python_distribution` #15082