Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for cycle-tolerant "weak" Gets #10230

Merged
merged 6 commits into from
Jul 17, 2020

Conversation

stuhood
Copy link
Member

@stuhood stuhood commented Jul 1, 2020

Problem

As covered in #10059: in order to support fine-grained dependency inference on real world graphs (which very frequently involve file-level dependency cycles), we'd like to very carefully introduce a form of cycle-tolerance.

Solution

Introduce the concept of "weak" Gets (exposed via await Get(.., weak=True)) which can return None if a cycle would be formed that would cause a @rule to depend on its own result. A weak Get is intended to be similar in practice to a weak reference, and correspondingly non-weak edges are referred to as "strong" edges.

A few changes were made to the Graph to support this, but the most fundamental was that we now allow cycles in the Graph, as long as they don't involve any running Nodes. As described in the docstring for graph::Graph, we record dependencies for two reasons: 1) invalidation, and 2) deadlock detection. Invalidation is not concerned by cycles, and a deadlock can only occur in running code and thus does not need to walk non-live edges.

Result

Tests demonstrate that @rules are able to successfully consume "weak-weak" cycles, where a graph like:

digraph {
  A -> B [label=weak, style=dashed];
  B -> A [label=weak, style=dashed];
}

... produces the set {A, B} when entered from either direction. Followup work will additionally allow for "strong-weak" cycles, which "mostly" work right now, but which currently have behavior that depends on where the cycle is entered. See #10229 and the ignored test in this change.

@stuhood stuhood changed the title Add support for cycle-resistant "weak" Gets Add support for cycle-tolerant "weak" Gets Jul 1, 2020
@stuhood stuhood marked this pull request as ready for review July 2, 2020 00:12
@stuhood
Copy link
Member Author

stuhood commented Jul 2, 2020

The commits are mostly useful to review independently, but the "Add engine support" and "Allow cycles to exist in the graph" commits overlap a bit.

src/rust/engine/graph/src/tests.rs Outdated Show resolved Hide resolved
src/rust/engine/src/intrinsics.rs Outdated Show resolved Hide resolved
stuhood added a commit that referenced this pull request Jul 16, 2020
### Problem

I spent a few hours debugging why a boolean param was turning into an integer in some cases in the simple test in #10230, and determined that it was due to the behavior that `1 == True` and `0 == False` in Python. This impacts the engine in strange ways. For example: before the fix, the test in this change that attempts to use both a `bool` and an `int` fails with:
```
Exception: Values used as `Params` must have distinct types, but the following values had the same type (`int`):
  1
  1
``` 

### Solution

Always include types in the engine's definition of equality. Although this affects more than just memoization/interning, my expectation is that in any possible position where the engine will use `equals` the default behavior would be undesirable.
* Store the RunToken in the NodeContext to allow it to be used while
  creating edges, and label edges with the RunToken that created them.
* Rename RunId back to SessionId to avoid confusion with RunToken.
* Label edges with either Strong or Weak: not really used yet, but will
  be in pantsbuild#10229.
* Cycle detect only on "live" edges: see `live_edge_predicate`. This
  allows for removing the graph mutation and restarts in `report_cycle`.
@stuhood stuhood merged commit 0ad3a56 into pantsbuild:master Jul 17, 2020
@stuhood stuhood deleted the stuhood/weak-get branch July 17, 2020 16:37
stuhood added a commit that referenced this pull request Jul 17, 2020
### Problem

`retain_edges` is crazy slow, but the simplicity of the API allured me late in the implementation of #10230, after I had finished testing the performance of the change.

### Solution

Switch to `filter_map`, which we use elsewhere to good effect, and which a previous version of #10230 had used.

### Result

When repeatedly re-running tests with light edits, edge garbage collection goes from multiple seconds to single digit milliseconds.
stuhood added a commit that referenced this pull request Jul 22, 2020
…in file-addresses. (#10409)

### Problem

After more deeply investigating the issues with #10230 (demonstrated in #10393 and its revert), it doesn't seem like the right idea to rely on the engine's cycle detection (which the implementation of #10230 showed should primarily be used for deadlock detection) to expose high level cycle tolerance for the build graph.

### Solution

Move to iterative transitive target expansion (à la #10046), and cycle detect afterward using DFS with a stack of visited entries. 

### Result

Fixes #10059, and closes #10229. A followup will back out portions of #10230 (but not all of it, as there were a number of other improvements mixed in).

[ci skip-rust-tests]
stuhood added a commit to stuhood/pants that referenced this pull request Sep 17, 2020
stuhood added a commit to stuhood/pants that referenced this pull request Sep 17, 2020
stuhood added a commit that referenced this pull request Sep 17, 2020
### Problem

"weak" `Get`s were a potential solution for optionally allowing for cycles in the runtime `Graph`, but they were only partially implemented (see #10229), and not particularly trustworthy. We ended up going with a higher level solution in #10409, but the weak-`Get`s code still exists, and increases the complexity of the `graph` crate unnecessarily.

### Solution

Revert #10230. Closes #10229 and closes #10394.

[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants