
Provide a cache for Chalk RecursiveSolver #11667

Status: Closed · 2 commits

Conversation

@danielhuang commented Mar 10, 2022

I've tested this with DEFAULT_QUERY_SEARCH_LIMIT set to 400000, and with automatic imports enabled without a minimum length. The first completion request takes around 1000ms, but repeating the same completion takes around 100ms using the cache.

One problem with this implementation is the ChalkCache wrapper: ChalkCache contains an Arc<Cache<...>>, and Cache itself already contains an Arc<...>, resulting in two nested Arcs. This doesn't seem like a major concern, though.
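
For illustration, here is a minimal sketch of that layout; the `Goal`/`Solution` stand-ins and field names are hypothetical, and only the two nested `Arc`s are the point:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins for Chalk's actual goal/solution types:
type Goal = String;
type Solution = String;

// Chalk's cache already keeps its shared state behind an `Arc` internally:
struct Cache {
    shared: Arc<Mutex<HashMap<Goal, Solution>>>,
}

// The `ChalkCache` wrapper adds a second `Arc` on top, so every clone or
// lookup goes through two levels of indirection:
#[derive(Clone)]
struct ChalkCache {
    cache: Arc<Cache>,
}
```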

Also, I'm not sure how best to structure this. For one, calls to chalk_cache currently bounce back and forth between different files.

#7542

@flodiebold (Member) commented Mar 10, 2022

We can't reuse the cache like this, because (1) it will give wrong results when the code changes (e.g. if you remove an impl, the cache might still contain results that use that impl), and (2) it will hide dependencies for the trait_solve query (e.g. solving a certain query would look up trait X, but because that subgoal is already cached, the lookup never happens, so we don't see that dependency and don't recalculate the query when that trait changes).
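
To make the first problem concrete, here is a tiny illustration with stand-in types: a cache keyed only on the goal has no way to notice that an impl was removed.

```rust
use std::collections::HashMap;

fn main() {
    // goal -> cached solution (types simplified to strings)
    let mut cache: HashMap<&str, &str> = HashMap::new();

    // While `impl Display for Foo` exists, the goal is provable and the
    // result gets cached:
    cache.insert("Foo: Display", "provable");

    // The user deletes `impl Display for Foo`. Nothing invalidates the
    // entry, so the next lookup still claims the goal is provable:
    assert_eq!(cache.get("Foo: Display"), Some(&"provable"));
    // A fresh solve would correctly report "no solution".
}
```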

(We actually used to reuse the solver with a Salsa query that recreated it whenever the database changed to get around this, but this resulted in worse performance, IIRC because it made us recalculate trait queries that didn't actually need to be recalculated all the time.)

@flodiebold (Member) commented:

To expand on this a bit:

  • There are some plans in Chalk to allow for fine-grained caching with the recursive solver that integrates with Salsa, but it's complicated because the trait solver needs to handle cycles in a particular way that's currently not possible in Salsa.
  • We're currently relying more on Salsa caching per trait query (see the sketch after this list). This should already cover simple cases like repeating the exact same completion request after a small change, so it's surprising that this would have an effect at all -- if you just changed the code inside a function, we should not recompute any trait queries. If you see that happening, that would be something that should be investigated.
  • Apart from those unwanted recalculations, another thing that we've seen was the Salsa revalidation (i.e. Salsa checking the dependencies of the trait_solve query) taking a lot of time. That would also be something interesting to investigate; there might be something weird happening that causes the query to have a huge amount of dependencies in some cases (e.g. checking all impls for some trait), or there might be some way to make the revalidation in Salsa more efficient.
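
To make that second point concrete, here is a minimal sketch of Salsa's memoization and dependency tracking, assuming the standalone salsa crate's query-group macros (0.16-style API); the query names are made up, and rust-analyzer's real trait_solve query is defined differently:

```rust
use std::sync::Arc;

#[salsa::query_group(DemoStorage)]
trait DemoDatabase {
    // Input query: set explicitly via the generated `set_function_body`.
    #[salsa::input]
    fn function_body(&self, id: u32) -> Arc<String>;

    // Derived query: memoized, and the inputs it reads are recorded as
    // dependencies, so it re-runs only when one of them changes.
    fn body_len(&self, id: u32) -> usize;
}

fn body_len(db: &dyn DemoDatabase, id: u32) -> usize {
    db.function_body(id).len()
}

#[salsa::database(DemoStorage)]
#[derive(Default)]
struct Database {
    storage: salsa::Storage<Database>,
}

impl salsa::Database for Database {}

fn main() {
    let mut db = Database::default();
    db.set_function_body(1, Arc::new("fn f() {}".to_string()));
    assert_eq!(db.body_len(1), 9); // computed
    assert_eq!(db.body_len(1), 9); // memoized, not recomputed
    // Changing the input bumps the revision, so `body_len(1)` re-runs:
    db.set_function_body(1, Arc::new("fn f() { 1 + 1; }".to_string()));
    assert_eq!(db.body_len(1), 17);
}
```

A cache that lives outside this system bypasses exactly this dependency recording, which is what the two problems described above come down to.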

@danielhuang (Author) commented Mar 10, 2022

> We can't reuse the cache like this, because (1) it will give wrong results when the code changes (e.g. if you remove an impl, the cache might still contain results that use that impl), and (2) it will hide dependencies for the trait_solve query (e.g. solving a certain query would look up trait X, but because that subgoal is already cached, the lookup never happens, so we don't see that dependency and don't recalculate the query when that trait changes).

I've looked into this and have worked around the issue by using a different cache keyed on the trait impls and the crate graph. This isn't as effective as fine-grained caching, but it still helps.
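
Roughly, the idea of that workaround looks like the following sketch (all names here are hypothetical, not the PR's actual code): the entire cache is dropped whenever a fingerprint of the inputs it is assumed to depend on changes.

```rust
use std::collections::HashMap;

// Hypothetical fingerprint of what the cache is assumed to depend on:
// the set of trait impls and the crate graph.
#[derive(Clone, PartialEq)]
struct CacheKey {
    impls_hash: u64,
    crate_graph_hash: u64,
}

#[derive(Default)]
struct CoarseCache {
    key: Option<CacheKey>,
    solutions: HashMap<String, String>, // goal -> solution, simplified
}

impl CoarseCache {
    fn get(&mut self, current: &CacheKey, goal: &str) -> Option<String> {
        // Drop everything whenever the fingerprint changes.
        if self.key.as_ref() != Some(current) {
            self.solutions.clear();
            self.key = Some(current.clone());
        }
        self.solutions.get(goal).cloned()
    }
}
```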

> We're currently relying more on Salsa caching per trait query. This should already cover simple cases like repeating the exact same completion request after a small change, so it's surprising that this would have an effect at all -- if you just changed the code inside a function, we should not recompute any trait queries. If you see that happening, that would be something that should be investigated.

This is the main reason I am attempting to implement additional caching. Even when I request the same completion multiple times for unchanged code, some time is spent in infer:wait, which I believe is mostly trait solving (although I'm not completely sure).

@danielhuang (Author) commented Mar 10, 2022

This is the profiling after the cache was warmed up:

   95ms - handle_completion
        7ms - iterate_method_candidates
            0   - crate_def_map:wait (40 calls)
            0   - deref_by_trait (1 calls)
            1ms - render_method (40 calls)
            0   - resolve_obligations_as_possible (2 calls)
            0   - trait_solve::wait (48 calls)
            5ms - ???
       86ms - import_on_the_fly @ 
           85ms - import_assets::search_for_imports
               85ms - import_assets::search_for
                   85ms - import_assets::trait_applicable_items
                       45ms - items_with_name @ Name: , crate: AssocItemsOnly, assoc items: Some("fetch_front_back_end"), limit: Some(400000)
                           45ms - find_items
                               24ms - query_external_importables
                                   24ms - search_dependencies @ Query { query: "", lowercased: "", name_only: true, assoc_items_only: true, search_mode: Fuzzy, case_sensitive: false, limit: 400000, exclude_import_kinds: {} }
                                        2ms - fst_path (5988 calls)
                                        3ms - import_map::Query::import_matches (18145 calls)
                                       18ms - ???
                                0   - crate_symbols (1 calls)
                               20ms - get_name_definition (738 calls)
                       38ms - iterate_method_candidates
                            0   - deref_by_trait (1 calls)
                            2ms - find_path_prefixed (46 calls)
                            0   - resolve_obligations_as_possible (2 calls)
                            3ms - trait_solve::wait (609 calls)
                           32ms - ???
                        0   - applicable_inherent_traits (1 calls)
                        0   - env_traits (1 calls)
                    0   - Semantics::analyze_impl (1 calls)
                    0   - import_assets::scope_definitions (1 calls)
            0   - Semantics::analyze_impl (1 calls)
            1ms - render_resolution (12 calls)
        0   - CompletionContext::new (1 calls)
        0   - Semantics::analyze_impl (2 calls)
        0   - complete_unqualified_path (1 calls)
        0   - crate_def_map:wait (157 calls)
        0   - deref_by_trait (1 calls)
        0   - find_path_prefixed (3 calls)
        0   - item::Builder::build (15 calls)
        0   - resolve_obligations_as_possible (2 calls)
        0   - trait_solve::wait (2 calls)

This testing was done with the limits increased/disabled, and was for completing a method call (`item.`, with the cursor after the `.`).

The main functions taking time are items_with_name and iterate_method_candidates. items_with_name looks like it returns a list of possible items without calculating which ones are valid; I'm not completely sure what iterate_method_candidates does.

@flodiebold (Member) commented:

> We can't reuse the cache like this, because (1) it will give wrong results when the code changes (e.g. if you remove an impl, the cache might still contain results that use that impl), and (2) it will hide dependencies for the trait_solve query (e.g. solving a certain query would look up trait X, but because that subgoal is already cached, the lookup never happens, so we don't see that dependency and don't recalculate the query when that trait changes).

> I've looked into this and have worked around the issue by using a different cache keyed on the trait impls and the crate graph. This isn't as effective as fine-grained caching, but it still helps.

This does not solve the second problem, the missing dependencies for the trait_solve query. Also, the impls in one crate and the crate graph are far from the only things that can affect the results of trait solving. Getting this correct would be extremely hard and very error-prone.
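
For instance (an illustrative snippet, not from the PR), whether a goal like `T: Clone` holds also depends on the where-clauses in scope at the use site, which neither the impl set nor the crate graph captures:

```rust
// Whether the goal `T: Clone` is provable depends on the caller's own
// where-clauses, even though no impl and no crate changed:
fn dup<T: Clone>(x: &T) -> T {
    x.clone()
}

fn caller_a<T: Clone>(x: &T) -> T {
    dup(x) // `T: Clone` follows from caller_a's environment
}

// fn caller_b<T>(x: &T) -> T {
//     dup(x) // same goal fails here: no `Clone` bound in scope
// }
```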

> We're currently relying more on Salsa caching per trait query. This should already cover simple cases like repeating the exact same completion request after a small change, so it's surprising that this would have an effect at all -- if you just changed the code inside a function, we should not recompute any trait queries. If you see that happening, that would be something that should be investigated.

> This is the main reason I am attempting to implement additional caching. Even when I request the same completion multiple times for unchanged code, some time is spent in infer:wait, which I believe is mostly trait solving (although I'm not completely sure).

infer:wait without any entries below it can mean one of two things: 1. the infer query for the function is running in another thread, or 2. this is just time spent validating dependencies, and the query didn't need to run at all afterwards. It's probably 1, so it might help to e.g. disable semantic highlighting to get better profiling results.

I have one theory for why this might actually make things faster: If we do indeed spend a lot of time validating dependencies for trait_solve, this could be reduced by the cache because we're now missing lots of dependencies where the cache was used. This is not the right way to achieve this though, because those dependencies would now just be untracked, which would lead to wrong results.

As a more general note: Salsa is our main way of caching. Adding caches outside of Salsa, like this one for example, is mostly not a good idea because it is very hard to integrate them correctly, and it is very hard to test that they are integrated correctly because bugs will only show up when you change the code in specific ways.

@flodiebold (Member) commented:

Also, I'm not sure how useful it is to disable the flyimport limits when testing this. Allowing flyimports without a prefix is not really a goal right now, and making that faster will not necessarily make normal completions, or even flyimports with a prefix, faster.

@danielhuang (Author) commented Mar 11, 2022

> Allowing flyimports without a prefix

I believe this is how IntelliJ Rust works: first it computes the simple completions and shows them, then it computes all possible completions asynchronously. However, I don't think this is possible with the current LSP.

@Veykril (Member) commented May 27, 2022

Closing this for the time being as this doesn't seem to be the right way to solve the problem here.
