Improve performance of `TransitiveTargets`/`CoarsenedTargets` #11270

stuhood · 2020-12-08T20:45:39Z

When #10409 introduced cycle tolerance, it also removed structure sharing of TransitiveTarget instances. This meant that if you made 100 TransitiveTargets requests, much less of their work could be reused, and in particular, cycle detection could not be reused. Cycle detection is cheap enough to do for one concurrent instance, but not to do hundreds of times redundantly.

The text was updated successfully, but these errors were encountered:

stuhood · 2020-12-08T20:46:29Z

Unfortunately, this is blocked by some rule graph craziness that I'd like to lean in to fix via #11269. In the meantime, I'm going to revert the original fix for #11201, and raise the recursion limit.

#11202 introduced a lot of contention on the GIL that was only obvious when larger numbers of targets were being built. #11270 is a holistic fix for that issue, but it is currently blocked on #11269. In the meantime, we will land #11271 to dodge the original issue in #11201 to get us back to a medium-slow-but-working state, and then follow up on #11269 and #11270 to get us to the best place. This reverts commit fabcb45. [ci skip-build-wheels]

…sbuild#11272) In the meantime, we will land pantsbuild#11271 to dodge the original issue in pantsbuild#11201 to get us back to a medium-slow-but-working state, and then follow up on pantsbuild#11269 and pantsbuild#11270 to get us to the best place. This reverts commit fabcb45. [ci skip-build-wheels]

…11272) (#11277) #11202 introduced a lot of contention on the GIL that was only obvious when larger numbers of targets were being built. #11270 is a holistic fix for that issue, but it is currently blocked on #11269. In the meantime, we will land #11271 to dodge the original issue in #11201 to get us back to a medium-slow-but-working state, and then follow up on #11269 and #11270 to get us to the best place. This reverts commit fabcb45. [ci skip-build-wheels]

…11272) (#11275) #11202 introduced a lot of contention on the GIL that was only obvious when larger numbers of targets were being built. #11270 is a holistic fix for that issue, but it is currently blocked on #11269. In the meantime, we will land #11271 to dodge the original issue in #11201 to get us back to a medium-slow-but-working state, and then follow up on #11269 and #11270 to get us to the best place. This reverts commit fabcb45. [ci skip-build-wheels]

stuhood · 2022-02-08T20:09:56Z

I believe that rather than necessarily internally structure-sharing for multiple distinct TransitiveTargets requests, there might be another solution based on adjusting TransitiveTargets to be more like CoarsenedTargets... possibly by becoming a wrapper around CoarsenedTargets, but also possibly by just becoming a fully expanded map of targets similar to _DependencyMapping.

Rather than making a TransitiveTargets request per root target, you would make a request for all root targets, and then use methods of the return type to partition and recurse on any number of roots.

Eric-Arellano · 2022-02-08T20:35:49Z

How hard would that be to do? I suspect it will make a tangible impact on Pylint and MyPy -- the dedicated rule for partitioning seems to take at ~1 second on my very fast M1 in a not very huge repository..

stuhood · 2022-02-08T21:59:07Z

How hard would that be to do?

Not too hard? I suspect that the biggest challenge will be updating all of the relevant callsites to avoid making lots of requests in parallel. But that could be incremental, since TT could continue to expose its existing methods.

`CoarsenedTarget`s are structure shared, and because they preserve their internal structure, they can service requests for transitive targets for different roots from the same datastructure. Concretely: Mypy and Pylint can consume `CoarsenedTargets` to execute a single `@rule`-level graph walk, and then compute per-root closures from the resulting `CoarsenedTarget` instances. This does not address #11270 in a general way (and it punts on #15241, which means that we still need per-root transitive walks), but it might provide a prototypical way to solve that problem on a case-by-case basis. Performance wise, this moves cold `check ::` for ~1k files from: * `main`: 32s total, and 26s spent in partitioning * `branch`: 19s total, and 13s spent in partitioning The rest of the time is wrapped up in #15241.

…d#15141) `CoarsenedTarget`s are structure shared, and because they preserve their internal structure, they can service requests for transitive targets for different roots from the same datastructure. Concretely: Mypy and Pylint can consume `CoarsenedTargets` to execute a single `@rule`-level graph walk, and then compute per-root closures from the resulting `CoarsenedTarget` instances. This does not address pantsbuild#11270 in a general way (and it punts on pantsbuild#15241, which means that we still need per-root transitive walks), but it might provide a prototypical way to solve that problem on a case-by-case basis. Performance wise, this moves cold `check ::` for ~1k files from: * `main`: 32s total, and 26s spent in partitioning * `branch`: 19s total, and 13s spent in partitioning The rest of the time is wrapped up in pantsbuild#15241. # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]

…ck of #15141) (#15244) `CoarsenedTarget`s are structure shared, and because they preserve their internal structure, they can service requests for transitive targets for different roots from the same datastructure. Concretely: Mypy and Pylint can consume `CoarsenedTargets` to execute a single `@rule`-level graph walk, and then compute per-root closures from the resulting `CoarsenedTarget` instances. This does not address #11270 in a general way (and it punts on #15241, which means that we still need per-root transitive walks), but it might provide a prototypical way to solve that problem on a case-by-case basis. Performance wise, this moves cold `check ::` for ~1k files from: * `main`: 32s total, and 26s spent in partitioning * `branch`: 19s total, and 13s spent in partitioning The rest of the time is wrapped up in #15241.

stuhood · 2022-06-08T21:37:06Z

Going to revisit this one.

#15141 fully fixed mypy since it is always monolithic, and not broken up into batches. But when pylint is broken up into batches (due to the default --lint-batch-size=128 option), it ends up making one request for CoarsenedTargets per batch (which is effectively equivalent to having made one TransitiveTargets request per batch: hence the title change for this issue).

I'm going to spend a little bit of time thinking through whether there is a holistic solution to avoid needing to change APIs. But in the absence of that, adjusting both the lint and test APIs to take CoarsenedTargets for their roots would mean that we did a single @rule-graph level graph walk for the entire goal, rather than doing one per lint-batch or test.

stuhood · 2022-06-18T00:01:33Z

I gave this one some thought last week, and came to the conclusion that while structure sharing is still helpful, recursive memoization is just not feasible in the presence of graph cycles. Which is part of why when we added support for cycles, we did so by moving TT/CT calculations from recursion to iteration.

But iteration is also memoizable: usually via a map/dict on the stack of your iteration function. Unlike recursion though, it isn't referentially transparent, and so requires (visible) mutability.

So: I think that by adding a new feature to the Engine/Graph to mark a @rule output as mutable (either by adding a new field to EngineAwareReturnType, or by marking the @rule itself somehow), we can memoize the iteration of transitive dependency computation, by consuming a mutable map of structure-shared TransitiveTarget instances.

Supporting mutable state in the Graph seemingly amounts to adding all of the dependencies of any Graph Node which depends on the mutable state to the mutable state (or, equivalently: putting the consumer Node into a cycle with the mutable state it is consuming, such that either Node being invalidated would invalidate the other). This accounts for the fact that a @rule might put any of its inputs into the mutable state.

stuhood added the performance label Dec 8, 2020

This was referenced Dec 8, 2020

Revert "Move graph cycle detection to rust. (#11202)" #11272

Merged

Target Hitting Recursion Limit During Pants Setup (with workaround) #11201

Closed

stuhood mentioned this issue Dec 8, 2020

Revert "Move graph cycle detection to rust. (#11202)" (cherrypick of #11272) #11275

Merged

stuhood mentioned this issue Dec 8, 2020

Revert "Move graph cycle detection to rust. (#11202)" (cherrypick of #11272) #11277

Merged

Eric-Arellano mentioned this issue Jan 14, 2021

Avoid resolving all targets for first-party module mapping #11459

Closed

Eric-Arellano added the help wanted label Jan 30, 2021

stuhood mentioned this issue Feb 4, 2021

Improve performance of running TransitiveTargets one target at-a-time #10863

Closed

stuhood added blocked and removed help wanted labels Jun 23, 2021

Eric-Arellano mentioned this issue Aug 5, 2021

Add [python-setup].disable_mixed_interpreter_constraints for repositories not using the interpreter_constraints field #12495

Closed

stuhood mentioned this issue Aug 10, 2021

Reintroduce structure-sharing for BUILD dependencies #12539

Closed

benjyw removed the performance label Sep 9, 2021

stuhood added this to 2022 Performance Push Feb 8, 2022

stuhood mentioned this issue Apr 13, 2022

MyPy and Pylint partition inputs via CoarsenedTarget #15141

Merged

stuhood mentioned this issue Apr 25, 2022

MyPy and Pylint partition inputs via CoarsenedTarget (Cherry-pick of #15141) #15244

Merged

stuhood mentioned this issue Jun 8, 2022

Add "escape hatch" to avoid long-running Determine if necessary to partition Pylint input #15784

Closed

stuhood changed the title ~~Re-introduce structure sharing for TransitiveTargets~~ Re-introduce structure sharing for TransitiveTargets/CoarsenedTargets Jun 8, 2022

stuhood self-assigned this Jun 8, 2022

stuhood changed the title ~~Re-introduce structure sharing for TransitiveTargets/CoarsenedTargets~~ Improve performance of TransitiveTargets/CoarsenedTargets Jun 17, 2022

stuhood mentioned this issue Sep 2, 2022

Fix PylintPartition and MyPyPartition to not store the transitive closure #16761

Merged

stuhood linked a pull request Oct 29, 2022 that will close this issue

Introduce "mutable" @rule outputs, and use them to memoize transitive graph walks #17394

Draft

stuhood mentioned this issue Nov 7, 2022

Filter coarsened targets to only those that are relevant in pylint runner rule #17487

Merged

cognifloyd mentioned this issue May 25, 2023

pex_binary sandbox: Traversing dep graph, stopping after package targets #19155

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance of `TransitiveTargets`/`CoarsenedTargets` #11270

Improve performance of `TransitiveTargets`/`CoarsenedTargets` #11270

stuhood commented Dec 8, 2020 •

edited

Loading

stuhood commented Dec 8, 2020

stuhood commented Feb 8, 2022 •

edited

Loading

Eric-Arellano commented Feb 8, 2022

stuhood commented Feb 8, 2022

stuhood commented Jun 8, 2022 •

edited

Loading

stuhood commented Jun 18, 2022 •

edited

Loading

Improve performance of TransitiveTargets/CoarsenedTargets #11270

Improve performance of TransitiveTargets/CoarsenedTargets #11270

Comments

stuhood commented Dec 8, 2020 • edited Loading

stuhood commented Dec 8, 2020

stuhood commented Feb 8, 2022 • edited Loading

Eric-Arellano commented Feb 8, 2022

stuhood commented Feb 8, 2022

stuhood commented Jun 8, 2022 • edited Loading

stuhood commented Jun 18, 2022 • edited Loading

Improve performance of `TransitiveTargets`/`CoarsenedTargets` #11270

Improve performance of `TransitiveTargets`/`CoarsenedTargets` #11270

stuhood commented Dec 8, 2020 •

edited

Loading

stuhood commented Feb 8, 2022 •

edited

Loading

stuhood commented Jun 8, 2022 •

edited

Loading

stuhood commented Jun 18, 2022 •

edited

Loading