Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require that a target's interpreter_constraints are a subset of their dependencies' #15241

Closed
stuhood opened this issue Apr 23, 2022 · 5 comments · Fixed by #15373
Closed

Require that a target's interpreter_constraints are a subset of their dependencies' #15241

stuhood opened this issue Apr 23, 2022 · 5 comments · Fixed by #15373
Assignees
Milestone

Comments

@stuhood
Copy link
Member

stuhood commented Apr 23, 2022

(relates to #12652 and #11072)

Currently we do not require that the title of this ticket holds. Instead, the interpreter_constraints of a target are treated as the constraints of the sources of the target, rather than necessarily of the target and its dependencies (docs). In order to calculate the actual constraints of a particular target, we currently merge the interpreter_constraints of a collection of targets (with InterpreterConstraints.create_from_targets): either the target itself, its direct dependencies, or its transitive dependencies.

This is challenging from a few perspectives:

  1. Performance - Although changes like MyPy and Pylint partition inputs via CoarsenedTarget #15141 can avoid target graph walks at the @rule graph level, they cannot avoid re-walking in order to compute the full transitive set of ICs for each root target without additionally doing a memoized graph-aware merge of InterpreterConstraints.
  2. Clarity - Although how interpreter_constraints are merged is well documented, it remains non-intuitive that a field declared in a BUILD file is not necessarily something that your code will run or be linted with: only something that applies to the direct sources.

To apply the requirement from the title, rather than ever computing transitive AND'd constraints (which require memoized set merging to make efficient), we would instead eagerly fail if a target's constraints were not a subset of any of its dependencies' constraints (which only requires memoized pairwise subset checks). It would still be the case though that code which succeeds in some cases (where none, or only some of your dependencies are used) may fail in others (when the full transitive set is used).

The argument in favor of the current behavior of interpreter_constraints is that it allows targets to begin claiming compatibility with newer versions of the language before their dependencies do (described in this post).

@stuhood
Copy link
Member Author

stuhood commented Apr 23, 2022

@Eric-Arellano : I know that we have discussed this a few times, but I wasn't able to find a ticket covering the idea.

@stuhood stuhood changed the title Require that a target's interpreter_constraints are compatible with their dependencies Require that a target's interpreter_constraints are a subset of their dependencies' Apr 23, 2022
@Eric-Arellano
Copy link
Contributor

After all of our discussion this past year, I am now on board with this. I have no idea how we handle our deprecation policy though

stuhood added a commit that referenced this issue Apr 25, 2022
`CoarsenedTarget`s are structure shared, and because they preserve their internal structure, they can service requests for transitive targets for different roots from the same datastructure. Concretely: Mypy and Pylint can consume `CoarsenedTargets` to execute a single `@rule`-level graph walk, and then compute per-root closures from the resulting `CoarsenedTarget` instances.

This does not address #11270 in a general way (and it punts on #15241, which means that we still need per-root transitive walks), but it might provide a prototypical way to solve that problem on a case-by-case basis.

Performance wise, this moves cold `check ::` for ~1k files from:
* `main`: 32s total, and 26s spent in partitioning
*  `branch`: 19s total, and 13s spent in partitioning

The rest of the time is wrapped up in #15241.
@stuhood
Copy link
Member Author

stuhood commented Apr 25, 2022

#15141 brought partitioning time down to 9s for pantsbuild/pants, but it is still graph-exponential. I think that we should probably make this change.

From a deprecation perspective, we already know that all targets have "valid but potentially over-broad" ICs. So I think that we could add the code that would check the edgewise condition as a deprecation that explains "IC is X, but needs to be a subset of Y", and then convert it to an error and remove the set merging in a following version. Because we know that X always intersects Y, we could maybe actually render the computed intersection?

stuhood added a commit to stuhood/pants that referenced this issue Apr 25, 2022
…d#15141)

`CoarsenedTarget`s are structure shared, and because they preserve their internal structure, they can service requests for transitive targets for different roots from the same datastructure. Concretely: Mypy and Pylint can consume `CoarsenedTargets` to execute a single `@rule`-level graph walk, and then compute per-root closures from the resulting `CoarsenedTarget` instances.

This does not address pantsbuild#11270 in a general way (and it punts on pantsbuild#15241, which means that we still need per-root transitive walks), but it might provide a prototypical way to solve that problem on a case-by-case basis.

Performance wise, this moves cold `check ::` for ~1k files from:
* `main`: 32s total, and 26s spent in partitioning
*  `branch`: 19s total, and 13s spent in partitioning

The rest of the time is wrapped up in pantsbuild#15241.
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
stuhood added a commit that referenced this issue Apr 25, 2022
…ck of #15141) (#15244)

`CoarsenedTarget`s are structure shared, and because they preserve their internal structure, they can service requests for transitive targets for different roots from the same datastructure. Concretely: Mypy and Pylint can consume `CoarsenedTargets` to execute a single `@rule`-level graph walk, and then compute per-root closures from the resulting `CoarsenedTarget` instances.

This does not address #11270 in a general way (and it punts on #15241, which means that we still need per-root transitive walks), but it might provide a prototypical way to solve that problem on a case-by-case basis.

Performance wise, this moves cold `check ::` for ~1k files from:
* `main`: 32s total, and 26s spent in partitioning
*  `branch`: 19s total, and 13s spent in partitioning

The rest of the time is wrapped up in #15241.
@stuhood
Copy link
Member Author

stuhood commented May 2, 2022

#15301 is also caused by this, although it is not in a hot path.

IMO, we should fix both this and that one together. It's likely that we can fall back to partitioning only if the condition from this issue does not hold (i.e.: if we render the deprecation warning, then we need to fall back to the slow implementation), so that we get the performance benefit immediately for users who are already following the constraint.

@stuhood
Copy link
Member Author

stuhood commented May 6, 2022

I'll start this today.

@stuhood stuhood added this to the 2.12.x milestone May 6, 2022
stuhood added a commit that referenced this issue May 11, 2022
…ir dependencies' (#15373)

As described in #15241: we currently compute per-target interpreter constraints by doing per-target graph walks, which is both a scalability bottleneck (because you can almost never use the constraints directly on a target: you must compute them), and complex for users to reason about.

This change adds a `ValidateDependenciesRequest` union, which allows backends to validate the computed dependencies of a target. The python backend uses validation to deprecate the condition from #15241. When that deprecation triggers, most/all callsites which currently use `create_from_compatibility_fields`, `create_from_targets`, or the new `compute_from_targets` can instead directly consume the ICs of a root target in the graph.

This change also (temporarily) adds an `InterpreterConstraints.compute_from_targets` method which computes (in transitive-dependency-linear time) that the dependencies of a target have a superset of its own `interpreter_constraints`. This method allows us to (again, temporarily: see above) apply the optimization of avoiding set merging if targets are already valid. 

Reduces the runtime of un-memoized-but-cached `./pants check ::` by 10%.

Fixes #15241, fixes #15301, fixes #11072.
stuhood added a commit to stuhood/pants that referenced this issue May 11, 2022
…ir dependencies' (pantsbuild#15373)

As described in pantsbuild#15241: we currently compute per-target interpreter constraints by doing per-target graph walks, which is both a scalability bottleneck (because you can almost never use the constraints directly on a target: you must compute them), and complex for users to reason about.

This change adds a `ValidateDependenciesRequest` union, which allows backends to validate the computed dependencies of a target. The python backend uses validation to deprecate the condition from pantsbuild#15241. When that deprecation triggers, most/all callsites which currently use `create_from_compatibility_fields`, `create_from_targets`, or the new `compute_from_targets` can instead directly consume the ICs of a root target in the graph.

This change also (temporarily) adds an `InterpreterConstraints.compute_from_targets` method which computes (in transitive-dependency-linear time) that the dependencies of a target have a superset of its own `interpreter_constraints`. This method allows us to (again, temporarily: see above) apply the optimization of avoiding set merging if targets are already valid. 

Reduces the runtime of un-memoized-but-cached `./pants check ::` by 10%.

Fixes pantsbuild#15241, fixes pantsbuild#15301, fixes pantsbuild#11072.
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
stuhood added a commit that referenced this issue May 11, 2022
…ir dependencies' (Cherry-pick of #15373) (#15407)

As described in #15241: we currently compute per-target interpreter constraints by doing per-target graph walks, which is both a scalability bottleneck (because you can almost never use the constraints directly on a target: you must compute them), and complex for users to reason about.

This change adds a `ValidateDependenciesRequest` union, which allows backends to validate the computed dependencies of a target. The python backend uses validation to deprecate the condition from #15241. When that deprecation triggers, most/all callsites which currently use `create_from_compatibility_fields`, `create_from_targets`, or the new `compute_from_targets` can instead directly consume the ICs of a root target in the graph.

This change also (temporarily) adds an `InterpreterConstraints.compute_from_targets` method which computes (in transitive-dependency-linear time) that the dependencies of a target have a superset of its own `interpreter_constraints`. This method allows us to (again, temporarily: see above) apply the optimization of avoiding set merging if targets are already valid. 

Reduces the runtime of un-memoized-but-cached `./pants check ::` by 10%.

Fixes #15241, fixes #15301, fixes #11072.

[ci skip-build-wheels]
[ci skip-rust]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging a pull request may close this issue.

2 participants