
Split ghcide actions into different descriptors #1857

Merged
berberman merged 7 commits into haskell:master from berberman:ghcide-ca-split on May 25, 2021

Conversation

@berberman (Collaborator)

This PR attempts to split the big code action provider in ghcide. I wanted to minimize trivial changes to the implementation functions (suggestExtendImport, suggestImportDisambiguation, etc.), so I refactored Development.IDE.Plugin.CodeAction.Args to keep the automatic passing of function arguments, staying compatible with the old code. However, if I understand correctly, the big provider should be broken down into different providers, rather than into descriptors as this PR does. One of our initial goals is to make these code actions switchable, but providers are anonymous, whereas descriptors have names, so it's not clear how to generate configuration per provider. Hence this PR introduces new descriptors for the ghcide code actions. This might not be the ideal split, but with the CodeActionArgs abstraction it won't be hard to reorganize them into providers.

For now the code actions are grouped blindly, and I haven't tested them yet.
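To illustrate the argument-passing idea (a hypothetical, heavily simplified model, not the PR's actual code; the real class lives in Development.IDE.Plugin.CodeAction.Args), a ToCodeAction-style class can peel arguments off a suggestion function one at a time, fetching each from a record of available inputs, so the existing functions keep their natural signatures:

{-# LANGUAGE FlexibleInstances #-}

-- Toy stand-ins for CodeActionArgs and its rule fetches.
data Args = Args
  { argDiagnostic :: String            -- stand-in for Diagnostic
  , argContents   :: IO (Maybe String) -- stand-in for a cached rule fetch
  }

class ToCodeAction a where
  toCodeAction :: Args -> a -> IO [String]

-- Base case: a finished result is returned as-is.
instance ToCodeAction [String] where
  toCodeAction _ r = pure r

-- Peel off one argument at a time, fetching it from the record.
instance ToCodeAction a => ToCodeAction (String -> a) where
  toCodeAction args f = toCodeAction args (f (argDiagnostic args))

instance ToCodeAction a => ToCodeAction (Maybe String -> a) where
  toCodeAction args f = argContents args >>= toCodeAction args . f

-- An old-style suggestion function keeps its natural signature...
suggestDemo :: String -> Maybe String -> [String]
suggestDemo diag mContents = ["fix for " <> diag | Just _ <- [mContents]]

-- ...and is adapted to the uniform shape without being rewritten.
runDemo :: IO [String]
runDemo = toCodeAction (Args "unused import" (pure (Just "module M where"))) suggestDemo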

@pepeiborra (Collaborator)

We want descriptors, not providers, so this PR is doing the right thing. I may have said provider where I really meant descriptor.

berberman marked this pull request as ready for review May 23, 2021 02:21
@pepeiborra (Collaborator)

@jneira Does the split have the granularity that you expected? It seems fine to me, but it might not be enough to support disabling specific code actions.

@jneira (Member) commented May 23, 2021

Fine with me too. If this PR opens the way to disabling code actions, even in small groups, we can wait for issues asking for it.

@pepeiborra (Collaborator) left a review comment

Thanks, awesome work @berberman

berberman changed the title [WIP] Split ghcide actions into different descriptors → Split ghcide actions into different descriptors on May 24, 2021
berberman merged commit 13a2cc2 into haskell:master on May 25, 2021
berberman deleted the ghcide-ca-split branch on May 25, 2021 13:33
@pepeiborra (Collaborator)

Out of curiosity I just checked the benchmark results, and they look quite impressive:

  • code actions: total time doesn't change, but userTime drops to roughly 15% of upstream, so the user spends wayyy less time waiting. I'm not sure where this improvement comes from (did we change to use delayed data?) but it's nice.
  • code actions after edit: total time decreases significantly (~35%), which is probably thanks to the extra parallelism.
version    name                             success   samples   startup              setup                   userTime             delayedTime             totalTime            maxResidency   allocatedBytes
upstream   code actions                     True      50        2.7157943            2.1442910000000003e-2   7.908476607000001    2.5013517999999995e-2   7.938987232000001    171MB          6844MB
HEAD       code actions                     True      50        2.676736845          1.5433488e-2            1.183338935          7.195596829000001       8.384399064          170MB          6933MB
upstream   code actions after edit          True      50        2.455734591          0.0                     34.89700874300001    2.9194995000000005e-2   34.932517633         195MB          29005MB
HEAD       code actions after edit          True      50        2.566527158          0.0                     14.851208827999997   8.068489619000003       22.927151044000002   195MB          18714MB

@Ailrun (Member) commented May 25, 2021

What happened to memory allocation (for "code actions after edit" bench)? That's quite impressive.

@berberman (Collaborator, Author)

> I'm not sure where this improvement comes from (did we change to use delayed data?)

Previously, we waited for all the rules used to finish running, including TypeCheck, GetHieAst, GetBindings, etc., and only then computed the code actions. But only suggestHideShadow uses the typechecked AST and the HIE AST, so there is no need for the other code action computations to wait for typechecking to finish. This PR makes rule results shared per group, so only the code actions in bindingsPluginDescriptor wait until typechecking has finished.

If I'm right, I guess moving suggestHideShadow out to a standalone plugin descriptor would make it even faster 🤔
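As a rough sketch of that per-group sharing (a toy model with hypothetical names, not the PR's actual code), each descriptor can wrap its rule fetches in a once-only cache, so the typecheck result is forced only by the group that actually needs it:

-- Toy model: each group builds one set of cached fetches; onceIO (shown
-- later in this thread) makes a fetch run at most once per group, and
-- only when a code action actually demands the result.
data GroupArgs = GroupArgs
  { gaTypeCheck  :: IO (Maybe TcModuleResult) -- cached, demand-driven
  , gaDiagnostic :: Diagnostic
  }

mkGroupArgs :: IO (Maybe TcModuleResult) -> Diagnostic -> IO GroupArgs
mkGroupArgs fetchTypeCheck diag = do
  cached <- onceIO fetchTypeCheck -- groups that never force it never pay
  pure (GroupArgs cached diag)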

@berberman (Collaborator, Author)

We could also enable parallelism within each group, but I'm not sure whether the contention would make it worse, given:

{-# LANGUAGE LambdaCase #-}

import Control.Monad.IO.Class (MonadIO, liftIO)
import Data.IORef (newIORef, readIORef)
import Data.IORef.Extra (writeIORef') -- from the extra package

-- | There's no concurrency in each provider,
-- so we don't need to be thread-safe here
onceIO :: MonadIO m => IO a -> m (IO a)
onceIO io = do
  var <- liftIO $ newIORef Nothing
  pure $
    readIORef var >>= \case
      Just x -> pure x
      _ -> io >>= \x -> writeIORef' var (Just x) >> pure x
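A minimal usage sketch (not from the PR) showing the cache-once behaviour:

-- The wrapped action runs only on the first force; later forces reuse
-- the stored result.
demoOnce :: IO ()
demoOnce = do
  cached <- onceIO (putStrLn "expensive rule fetch" >> pure (42 :: Int))
  a <- cached -- prints "expensive rule fetch" and yields 42
  b <- cached -- silent; reuses the cached 42
  print (a + b)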

@pepeiborra (Collaborator)

Do we need that onceIO at all? Shake already guarantees that these rules will only be run once in the same session, and subsequent reruns will be lookups. So the onceIO can be dropped

@berberman (Collaborator, Author)

But a delayed action will still be enqueued, no? Let's try removing onceIO and see what happens.

@berberman (Collaborator, Author)

After removing onceIO:

version    name                             success   samples   startup              setup                   userTime             delayedTime
upstream   code actions                     True      50        2.079015637          2.4707264000000003e-2   3.6805140430000005   5.06126036
HEAD       code actions                     True      50        2.554838365          0.0                     1.3539628190000002   7.1702762180000015
upstream   code actions after edit          True      50        2.210849848          0.0                     18.546111556         6.303753765000001
HEAD       code actions after edit          True      50        2.427225427          0.0                     24.017535580999997   7.096961165000001

@pepeiborra (Collaborator) commented May 26, 2021

> But a delayed action will still be enqueued, no? Let's try removing onceIO and see what happens.

So you are saying that we will end up with multiple copies of the same redundant delayed action? This is true, but I'm assuming that the actual work will only be done once.

What does the benchmark show? Is upstream the version with onceIO and HEAD the version without?

@berberman (Collaborator, Author) commented May 26, 2021

> What does the benchmark show? Is upstream the version with onceIO and HEAD the version without?

Yes, I think HEAD is the version without onceIO: berberman@6201d94

@pepeiborra (Collaborator)

The numbers in the benchmark look contradictory. Why is userTime ~66% lower in the "code actions" experiment, but ~33% higher in the "code actions after edit" experiment?

@berberman (Collaborator, Author) commented May 26, 2021

I re-ran the benchmark CI. It seems that the gap has narrowed, but HEAD is still slightly slower than upstream overall.

version    name                             success   samples   startup              setup          userTime             delayedTime             totalTime            maxResidency   allocatedBytes
upstream   code actions                     True      50        2.039180623          0.0            1.6012565329999997   5.546552721000003       7.152452846          169MB          6887MB
HEAD       code actions                     True      50        2.318210653          1.2344288e-2   1.1096260069999995   7.146050902000002       8.261057349          170MB          6945MB
upstream   code actions after edit          True      50        2.171567757          0.0            21.922519571000002   5.966397669             27.894771285         196MB          26335MB
HEAD       code actions after edit          True      50        2.4681653640000003   0.0            22.032106895000002   6.8835301719999995      28.921297008000003   195MB          26841MB

@pepeiborra (Collaborator)

Based on those results, I would remove the onceIO. Otherwise, contributors will think that they need to do their own caching and we'll end up duplicating state and introducing subtle bugs.

@berberman (Collaborator, Author) commented May 27, 2021

> Otherwise, contributors will think that they need to do their own caching

I understand your concern, but the situation here is very different, and I'm still a bit worried about the overhead of executing those IO actions many times. For example, ToCodeAction instances help us package the following function:

suggestExtendImport :: ExportsMap -> ParsedSource -> Diagnostic -> [(T.Text, CodeActionKind, Rewrite)]

into something like:

f :: CodeActionArgs -> IO [(T.Text, CodeActionKind, [TextEdit])]
f CodeActionArgs {..} = do
  x <- caaExportsMap
  y <- fmap astA <$> caaAnnSource
  case y of
    Just ps -> do
      let results = suggestExtendImport x ps caaDiagnostic
      concat
        <$> mapM
          ( \(a, b, c) -> do
              d <- caaDf
              e <- fmap annsA <$> caaAnnSource
              case (d, e) of
                (Just df, Just anns)
                  | Right edit <- rewriteToEdit df anns c -> pure [(a, b, edit)]
                _ -> pure []
          )
          results
    _ -> pure []

where

data CodeActionArgs = CodeActionArgs
  { caaExportsMap :: IO ExportsMap,
    caaIdeOptions :: IO IdeOptions,
    caaParsedModule :: IO (Maybe ParsedModule),
    caaContents :: IO (Maybe T.Text),
    caaDf :: IO (Maybe DynFlags),
    caaAnnSource :: IO (Maybe (Annotated ParsedSource)),
    caaTmr :: IO (Maybe TcModuleResult),
    caaHar :: IO (Maybe HieAstResult),
    caaBindings :: IO (Maybe Bindings),
    caaGblSigs :: IO (Maybe GlobalBindingTypeSigsResult),
    caaDiagnostic :: Diagnostic
  }

As you can see, for the N results produced by suggestExtendImport, caaDf is executed N times and caaAnnSource is executed N + 1 times. I don't know whether our experiments reflect this. Although Shake lets us avoid unnecessary runs of rules in defineEarlyCutoff', without the help of onceIO, runAction will enqueue many duplicate DelayedActions into the action queue. Even worse, functions like suggestExtendImport in the same plugin descriptor will not be able to share rule results, ending up with even more runActions. Apart from that, the ExportsMap may also be computed repeatedly.
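For illustration only, here is a hypothetical variant (call it f'; this is not code from the PR) with the shared fetches hoisted out of the per-result loop, so caaDf and caaAnnSource each run once regardless of N; onceIO achieves much the same effect without restructuring every generated wrapper:

f' :: CodeActionArgs -> IO [(T.Text, CodeActionKind, [TextEdit])]
f' CodeActionArgs {..} = do
  x <- caaExportsMap
  mAnn <- caaAnnSource -- fetched once; supplies both the AST and the anns
  d <- caaDf           -- fetched once
  case (fmap astA mAnn, fmap annsA mAnn, d) of
    (Just ps, Just anns, Just df) ->
      pure
        [ (a, b, edit)
        | (a, b, c) <- suggestExtendImport x ps caaDiagnostic
        , Right edit <- [rewriteToEdit df anns c]
        ]
    _ -> pure []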

@pepeiborra (Collaborator)

I see what you mean now, thank you for elaborating on the argument. I agree that this is a bit different, because every provider may execute an Action multiple times. This is odd!
