| 
 | 1 | +# Caching in the new trait solver  | 
 | 2 | + | 
 | 3 | +Caching results of the trait solver is necessary for performance.  | 
 | 4 | +We have to make sure that it is sound. Caching is handled by the  | 
 | 5 | +[`SearchGraph`]  | 
 | 6 | + | 
 | 7 | +[`SearchGraph`]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L102-L117  | 
 | 8 | + | 
 | 9 | +## The global cache  | 
 | 10 | + | 
 | 11 | +At its core, the cache is fairly straightforward. When evaluating a goal, we  | 
 | 12 | +check whether it's in the global cache. If so, we reuse that entry. If not, we  | 
 | 13 | +compute the goal and then store its result in the cache.  | 
 | 14 | + | 
 | 15 | +To handle incremental compilation the computation of a goal happens inside of  | 
 | 16 | +[`DepGraph::with_anon_task`] which creates a new `DepNode` which depends on all queries  | 
 | 17 | +used inside of this computation. When accessing the global cache we then read this  | 
 | 18 | +`DepNode`, manually adding a dependency edge to all the queries used: [source][wdn].  | 
 | 19 | + | 
 | 20 | +### Dealing with overflow  | 
 | 21 | + | 
 | 22 | +Hitting the recursion limit is not fatal in the new trait solver but instead simply  | 
 | 23 | +causes it to return ambiguity: [source][overflow]. Whether we hit the recursion limit  | 
 | 24 | +can therefore change the result without resulting in a compilation failure. This  | 
 | 25 | +means we must consider the remaining available depth when accessing a cache result.  | 
 | 26 | + | 
 | 27 | +We do this by storing more information in the cache entry. For goals whose evaluation  | 
 | 28 | +did not reach the recursion limit, we simply store its reached depth: [source][req-depth].  | 
 | 29 | +These results can freely be used as long as the current `available_depth` is higher than  | 
 | 30 | +its `reached_depth`: [source][req-depth-ck]. We then update the reached depth of the  | 
 | 31 | +current goal to make sure that whether we've used the global cache entry is not  | 
 | 32 | +observable: [source][update-depth].  | 
 | 33 | + | 
 | 34 | +For goals which reach the recursion limit we currently only use the cached result if the  | 
 | 35 | +available depth *exactly matches* the depth of the entry. The cache entry for each goal  | 
 | 36 | +therefore contains a separate result for each remaining depth: [source][rem-depth].[^1]  | 
 | 37 | + | 
 | 38 | +## Handling cycles  | 
 | 39 | + | 
 | 40 | +The trait solver has to support cycles. These cycles are either inductive or coinductive,  | 
 | 41 | +depending on the participating goals. See the [chapter on coinduction] for more details.  | 
 | 42 | +We distinguish between the cycle heads and the cycle root: a stack entry is a  | 
 | 43 | +cycle head if it recursively accessed. The *root* is the deepest goal on the stack which  | 
 | 44 | +is involved in any cycle. Given the following dependency tree, `A` and `B` are both cycle  | 
 | 45 | +heads, while only `A` is a root.  | 
 | 46 | + | 
 | 47 | +```mermaid  | 
 | 48 | +graph TB  | 
 | 49 | +    A --> B  | 
 | 50 | +    B --> C  | 
 | 51 | +    C --> B  | 
 | 52 | +    C --> A  | 
 | 53 | +```  | 
 | 54 | + | 
 | 55 | +The result of cycle participants depends on the result of goals still on the stack.  | 
 | 56 | +However, we are currently computing that result, so its result is still unknown. This is  | 
 | 57 | +handled by evaluating cycle heads until we reach a fixpoint. In the first iteration, we  | 
 | 58 | +return either success or overflow with no constraints, depending on whether the cycle is  | 
 | 59 | +coinductive: [source][initial-prov-result]. After evaluating the head of a cycle, we  | 
 | 60 | +check whether its [`provisional_result`] is equal to the result of this iteration. If so,  | 
 | 61 | +we've finished evaluating this cycle and return its result. If not, we update the provisional  | 
 | 62 | +result and reevaluate the goal: [source][fixpoint]. After the first iteration it does not  | 
 | 63 | +matter whether cycles are coinductive or inductive. We always use the provisional result.  | 
 | 64 | + | 
 | 65 | +### Only caching cycle roots  | 
 | 66 | + | 
 | 67 | +We cannot move the result of any cycle participant to the global cache until we've  | 
 | 68 | +finished evaluating the cycle root. However, even after we've completely evaluated the  | 
 | 69 | +cycle, we are still forced to discard the result of all participants apart from the root  | 
 | 70 | +itself.  | 
 | 71 | + | 
 | 72 | +We track the query dependencies of all global cache entries. This causes the caching of  | 
 | 73 | +cycle participants to be non-trivial. We cannot simply reuse the `DepNode` of the cycle  | 
 | 74 | +root.[^2] If we have a cycle `A -> B -> A`, then the `DepNode` for `A` contains a dependency  | 
 | 75 | +from `A -> B`. Reusing this entry for `B` may break if the source is changed. The `B -> A`  | 
 | 76 | +edge may not exist anymore and `A` may have been completely removed. This can easily result  | 
 | 77 | +in an ICE.  | 
 | 78 | + | 
 | 79 | +However, it's even worse as the result of a cycle can change depending on which goal is  | 
 | 80 | +the root: [example][unstable-result-ex]. This forces us to weaken caching even further.  | 
 | 81 | +We must not use a cache entry of a cycle root, if there exists a stack entry, which was  | 
 | 82 | +a participant of its cycle involving that root. We do this by storing all cycle participants  | 
 | 83 | +of a given root in its global cache entry and checking that it contains no element of the  | 
 | 84 | +stack: [source][cycle-participants].  | 
 | 85 | + | 
 | 86 | +### The provisional cache  | 
 | 87 | + | 
 | 88 | +TODO: write this :3  | 
 | 89 | + | 
 | 90 | +- stack dependence of provisional results  | 
 | 91 | +- edge case: provisional cache impacts behavior  | 
 | 92 | + | 
 | 93 | + | 
 | 94 | +[`with_anon_task`]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L391  | 
 | 95 | +[wdn]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L78  | 
 | 96 | +[overflow]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L276  | 
 | 97 | +[req-depth]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L102  | 
 | 98 | +[req-depth-ck]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L76-L86  | 
 | 99 | +[update-depth]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L308  | 
 | 100 | +[rem-depth]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L124  | 
 | 101 | +[^1]: This is overly restrictive: if all nested goal return the overflow response with some  | 
 | 102 | +availabledepth `n`, then their result should be the same for any depths smaller than `n`.  | 
 | 103 | +We can implement this optimization in the future.  | 
 | 104 | +[chapter on coinduction]: ./coinduction.md  | 
 | 105 | +[`provisional_result`]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L57  | 
 | 106 | +[initial-prov-result]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L366-L370  | 
 | 107 | +[fixpoint]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L425-L446  | 
 | 108 | +[^2]: summarizing the relevant [zulip thread]  | 
 | 109 | +[zulip thread]: https://rust-lang.zulipchat.com/#narrow/stream/364551-t-types.2Ftrait-system-refactor/topic/global.20cache  | 
 | 110 | +[unstable-result-ex]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/tests/ui/traits/next-solver/cycles/coinduction/incompleteness-unstable-result.rs#L4-L16  | 
 | 111 | +[cycle-participants]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L72-L74  | 
0 commit comments