[Experimental] [WIP] Transparent Compiler #15179

Merged
merged 326 commits into from
Jan 16, 2024

0101
Contributor

@0101 0101 commented May 3, 2023

Transparent Compiler

Synopsis

This is an attempt to replace the current BackgroundCompiler (powered by IncrementalBuilder) with a new system that will be easier to understand and change and that will also be in line with #11976

Issues with current Background Compiler

What bothers me a lot with current IncrementalBuilder / GraphNode / BoundModel system is the opaqueness of the workflow. It's often difficult to follow the code because it suddenly calls a graph node, which contains a computation which was constructed at some prior point but now there's no link to the actual code that will be executed. Hence the name Transparent compiler and the attempt to make this better 😄 (mostly I had no good idea for what to call this).

How is this different

The idea is to replace the GraphNode with a sort of cache/memoization, so that the computation can be accessed from anywhere given the correct key/input, instead of having to hold a reference to a pre-constructed graph that needs to be stored and updated.

So the resulting code should look mostly as if we're calculating everything from scratch for every single request - hopefully simple and with no misdirection - except that some intermediate steps (and the whole result as well) will be cached/memoized.
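As a rough illustration of "compute from scratch, but memoized by key", here is a minimal sketch. The `memoizeBy` helper and the `parseFile` example are hypothetical names invented for this illustration and are not part of the PR; the sketch also ignores cancellation and eviction.

```fsharp
open System
open System.Collections.Concurrent

/// Hypothetical helper: memoize `compute` by an explicit key derived from the input.
let memoizeBy (getKey: 'input -> 'key) (compute: 'input -> 'result) =
    let cache = ConcurrentDictionary<'key, Lazy<'result>>()

    fun (input: 'input) ->
        let key = getKey input
        // Lazy ensures the computation itself runs at most once per key.
        cache.GetOrAdd(key, Func<_, _>(fun _ -> lazy (compute input))).Value

// Counts how many times the "expensive" step really ran.
let parseCount = ref 0

/// Stand-in for an expensive step, keyed by (path, version) only.
let parseFile =
    memoizeBy
        (fun (path: string, version: int, _source: string) -> path, version)
        (fun (_path, _version, source: string) ->
            parseCount.Value <- parseCount.Value + 1
            source.Length)
```

Callers simply call `parseFile` wherever they need the result; whether it was computed earlier is invisible at the call site.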

Will this work?

🤷 I don't know, but I want to try.

Goals

Functional goals

  • Work with editor buffers instead of the filesystem (in a more natural way than the current "live buffers" solution which relies on notifications and a callback function to retrieve file contents)
  • Leverage dependency graph to avoid unnecessary computation and enable parallelization
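To make the dependency graph goal concrete, the following sketch computes which files actually need checking before a given file. The graph shape (`Map<int, int list>`) and the function names are assumptions made for this example; the real compiler's graph types may look different.

```fsharp
/// Hypothetical graph shape: file index -> indices of the files it depends on.
/// Assumes no cycles, since an F# file can only depend on files that precede it.
let rec transitiveDeps (graph: Map<int, int list>) (file: int) : Set<int> =
    graph
    |> Map.tryFind file
    |> Option.defaultValue []
    |> List.fold (fun acc dep -> acc |> Set.add dep |> Set.union (transitiveDeps graph dep)) Set.empty

/// Files that must be checked before `file`; everything else can be skipped.
let filesToCheck graph file = transitiveDeps graph file |> Set.toList

// Two independent chains: 1 depends on 0, and 3 depends on 2.
let exampleGraph = Map.ofList [ 0, []; 1, [ 0 ]; 2, []; 3, [ 2 ] ]
```

With `exampleGraph`, checking file 3 only requires file 2, so files 0 and 1 can be skipped or processed in parallel.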

Non-functional goals

  • Immutable compilation model - avoid state in the background compiler. Get everything we need as input and return a result.
  • Easily understandable code

Possible future benefits

  • When we move to out-of-process/LSP model, caches from parsing and type checking in the IDE can be re-used in building the assemblies

Technical details

Caching / memoization

We need to deal with concurrency and cancellation, and we don't want to repeat the same computations. To do this we use a system similar to the one that was already implemented in GraphNode but never used. The cache will store a Task (or node, still to be decided) and will serialize all requests with an async lock (in place of the MailboxProcessor considered earlier), which makes sure the computation is started only once and keeps track of the request count to see if it should be canceled.
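The request-counting idea can be sketched roughly as follows. This is not the PR's AsyncMemoize: `SketchMemoize` is a hypothetical type, and it uses a plain `lock` where the description talks about an async lock, purely to keep the example short.

```fsharp
open System.Collections.Generic
open System.Threading
open System.Threading.Tasks

/// Hypothetical, simplified stand-in for the caching described above.
type SketchMemoize<'key, 'value when 'key: equality>() =
    let gate = obj ()
    let jobs = Dictionary<'key, Task<'value> * CancellationTokenSource * int ref>()

    /// Returns the shared task for `key`, starting `compute` only if no job exists yet.
    member _.Get(key: 'key, compute: CancellationToken -> Task<'value>, requestToken: CancellationToken) : Task<'value> =
        lock gate (fun () ->
            let job, cts, count =
                match jobs.TryGetValue key with
                | true, existing -> existing
                | false, _ ->
                    let cts = new CancellationTokenSource()
                    let created = (compute cts.Token, cts, ref 0)
                    jobs.[key] <- created
                    created

            count.Value <- count.Value + 1

            // When this requester cancels, drop its interest;
            // cancel the shared job once nobody is waiting for it.
            requestToken.Register(fun () ->
                lock gate (fun () ->
                    count.Value <- count.Value - 1

                    if count.Value = 0 && not job.IsCompleted then
                        jobs.Remove key |> ignore
                        cts.Cancel()))
            |> ignore

            job)
```

Two requests for the same key share one task, and the underlying computation is only canceled when every requester has canceled.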

API

We need a new format for parsing/checking requests that will contain everything needed to perform the operations. The major difference from the current API is that each file in FSharpProjectOptions.SourceFiles, which is now just a string with the path, is replaced by a record that also has a Version (which we will fully rely on when caching) and a GetSource: unit -> Task<ISourceText>, which we will call if we need to get the contents.
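A minimal sketch of how such a record could drive caching is shown below. `SourceFileSketch` and `makeLineCounter` are hypothetical names, `string` stands in for ISourceText so the example is self-contained, and the cache is not thread-safe.

```fsharp
open System.Collections.Generic
open System.Threading.Tasks

/// Hypothetical stand-in for the new source file record.
type SourceFileSketch =
    { FileName: string
      Version: string
      GetSource: unit -> Task<string> }

/// Caches a per-file result by (FileName, Version); GetSource only runs on a cache miss.
let makeLineCounter () =
    let cache = Dictionary<string * string, int>()

    fun (file: SourceFileSketch) ->
        let key = file.FileName, file.Version

        match cache.TryGetValue key with
        | true, lines -> lines
        | false, _ ->
            // Blocking on the task only to keep the sketch short.
            let source = (file.GetSource ()).Result
            let lines = source.Split([| '\n' |]).Length
            cache.[key] <- lines
            lines
```

Because the key is built from Version alone, a caller that changes the contents without changing the version will get a stale result, which is why the description says the version is fully relied upon.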

The output/results should remain the same.

Transition from current background compiler

The new API is a superset of the old one so we can keep all the old code and switch between them with a configuration flag. This way we can offer it to early adopters for testing with a way to switch back if it doesn't work properly.
Also, during development `TransparentCompiler` keeps its own instance of `BackgroundCompiler` and uses it for workflows that are not yet implemented, so it can be tested before everything is complete.

Status

It's a work in progress. Feel free to have a look and give any kind of feedback!

You can also check out this branch, build it and try it out (in VS, it can be enabled in F# advanced options). It should more or less work for F#-only solutions that don't do anything fancy.

Some of the things that still need to be done:

  • Enable tests with TC in CI
  • More benchmarks
  • Support for in-memory C# project references
  • Support for scripts
  • Support for type providers
  • Preserve stack traces in AsyncMemoize
  • Enable requesting 2 levels of checking a file - fastest possible to get diagnostics and semantic classification - and all-inclusive for all IDE features, such as completions that could include things from files we don't depend on
  • Plug in to existing tests for FSharpChecker
  • Figure out merging TcInfos from files processed in parallel
  • Support for itemKeyStore and other stuff from TcInfoExtras
  • Reuse TcIntermediate results in ParseAndCheckFileInProject if available
  • Make it work properly with signature files with NodeToTypeCheck
  • Support for in-memory F# project references
  • Support for on-disk project references
  • Make sure we don't hold on to stuff that should be GC'd in AsyncMemoize
  • Implement all background compiler APIs
  • Allow using a DLL on disk if it's up to date rather than doing an in-memory type check of that project
  • Don't (strongly) keep different versions of the same thing in the caches
  • ? Add some types around AsyncMemoize to prevent accidentally using the wrong key vs. computation input
  • ? Get rid of NodeCode and use just Task (maybe cancellable task builder?): need it for diagnostics scopes
  • More efficient implementation of leafSequence: don't need it
  • ? Figure out merging TcInfos from files processed in parallel without callbacks: too hard

Some preliminary synthetic benchmarks

BenchmarkDotNet=v0.13.2, OS=Windows 10 (10.0.19045.2846)
AMD Ryzen 9 5900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK=7.0.203
  [Host]     : .NET 7.0.5 (7.0.523.17405), X64 RyuJIT AVX2 DEBUG
  Job-EQVOVN : .NET 7.0.5 (7.0.523.17405), X64 RyuJIT AVX2

InvocationCount=1  IterationCount=4  UnrollFactor=1  
WarmupCount=1  
| ProjectType | Mean | Error | StdDev | Completed Work Items | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|
| Incremental Builder Dependency Chain | 3,931.7 ms | 97.63 ms | 15.11 ms | 156 | 14000 | 2000 | 5471.11 MB |
| Incremental Builder Dependent Groups | 3,882.0 ms | 160.07 ms | 24.77 ms | 150 | 14000 | 2000 | 5440.74 MB |
| Incremental Builder Parallel Groups | 4,142.5 ms | 259.68 ms | 40.19 ms | 150 | 14000 | 2000 | 5624.65 MB |
| Transparent Compiler Dependency Chain | 3,954.3 ms | 1,549.37 ms | 239.77 ms | 1308 | 9000 | 2000 | 3466.58 MB |
| Transparent Compiler Dependent Groups | 560.2 ms | 79.75 ms | 12.34 ms | 867 | 1000 | - | 553.02 MB |
| Transparent Compiler Parallel Groups | 1,318.1 ms | 79.63 ms | 12.32 ms | 1164 | 9000 | 3000 | 3518.59 MB |

@0101 0101 changed the title [Experimental][WIP] Transparent Compiler [Experimental] [WIP] Transparent Compiler May 3, 2023
@0101 0101 self-assigned this May 3, 2023
@majocha
Contributor

majocha commented May 4, 2023

This looks awesome and absolutely should work!

@0101 0101 added this to the May-2023 milestone May 4, 2023
@0101
Contributor Author

0101 commented May 19, 2023

I added some basic synthetic project benchmarks (results in the description). Looks promising so far. In the DependencyChain project, where each file depends on the previous one, the performance is similar to IncrementalBuilder, but with fewer allocations and less memory consumed. In the other scenarios, where we can leverage graph checking by either parallelization or skipping unneeded work, we can see the expected time savings.

There are significantly more Completed Work Items. If that has no other side effects then it probably won't bother us, but we might need to look into it...

@TheAngryByrd
Contributor

@0101 let me know when you think this API is pretty stable and I'll give it a run in FSAC.

@0101 0101 modified the milestones: May-2023, June-2023 Jun 1, 2023
@0101
Contributor Author

0101 commented Jun 8, 2023

Status update

After the latest tweaks it seems to work with Giraffe solution, which is the simplest one I found that contains some real-world useful code :) So if someone has some smaller projects like that, you could already try it out.

It doesn't quite work with VisualFSharp solution yet, but I'll be now looking at what are the missing pieces for that...

@auduchinok
Member

It doesn't quite work with VisualFSharp solution yet, but I'll be now looking at what are the missing pieces for that...

There's some special logic around having FSharp.Core as a project in a solution. You may want to try using it on FSharp.Compiler.Service.sln in a fresh working copy, it doesn't contain that project reference.

@auduchinok
Member

auduchinok commented Jun 8, 2023

After the latest tweaks it seems to work with Giraffe solution.

@0101 Would it be possible to run the benchmarks on it too? 🙂

@0101
Contributor Author

0101 commented Jun 8, 2023

@0101 Would it be possible to run the benchmarks on it too? 🙂

Yeah I'm hoping to do that, but not sure exactly how yet. So far my idea was to enable the SyntheticProject / workflow builder to load real projects from disk and then work with them in a similar way.

@0101 0101 marked this pull request as ready for review January 12, 2024 14:15
@0101 0101 requested a review from a team as a code owner January 12, 2024 14:15
@psfinaki
Member

@0101 please post the latest and greatest benchmark results before merging - practice shows it will be helpful for future reference.

@0101
Contributor Author

0101 commented Jan 15, 2024

Current results for the Giraffe benchmark

Note that this is with checking all preceding files - not skipping ones that we don't depend on.


BenchmarkDotNet v0.13.10, Windows 11 (10.0.22631.3007/23H2/2023Update/SunValley3)
11th Gen Intel Core i7-11850H 2.50GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.101
  [Host]     : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2 DEBUG
  Job-RBYAIG : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2

IterationCount=8  WarmupCount=1  

| Method | Use Transparent Compiler | Signature Files | Mean | Error | StdDev | Gen0 | Completed Work Items | Lock Contentions | Gen1 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|
| Some Workflow | False | False | 1,407.8 ms | 29.97 ms | 13.31 ms | 5000 | 68 | - | 2000 | 1169.71 MB |
| Some Workflow | False | True | 801.4 ms | 136.68 ms | 60.69 ms | 2000 | 80 | - | 1000 | 573.97 MB |
| Some Workflow | True | False | 1,184.4 ms | 59.55 ms | 31.14 ms | 4000 | 1146 | - | 1000 | 969.67 MB |
| Some Workflow | True | True | 615.0 ms | 281.38 ms | 147.17 ms | - | 1435 | 1 | - | 197.62 MB |

@nojaf
Contributor

nojaf commented Jan 15, 2024

Thanks for sharing these numbers @0101! Great work!

@vzarytovskii
Member

vzarytovskii commented Jan 16, 2024

@0101 Shall this be merged, or are there any pending changes?

@0101
Contributor Author

0101 commented Jan 16, 2024

@vzarytovskii I want to improve the tests as pointed out by Smaug123, but it can also be done separately.

@vzarytovskii
Member

@vzarytovskii I want to improve the tests as pointed out by Smaug123, but it can also be done separately.

Up to you, but if you ask me, we should merge now to not end up in a never ending cycle of fixing small things and never merging.

@auduchinok
Member

Yeah, you're right. I should figure out how to do it without arbitrary time delays. Probably use the events somehow to progress the test.

I've tried to update our FCS fork and also found some instability, seemingly related to request cancellation. My theory is that #16348 may be to blame, but I haven't had an opportunity to look into it further yet. It seems that the previous approach from #16137 works as expected (though some access to metadata is not guarded with cancellation tokens).

@0101 Could you try to check if reverting #16348 would improve stability of the tests?

@vzarytovskii
Member

@auduchinok fwiw (not really related to this PR per se), once this is merged, I will be rewriting NodeBuilder to state machines (pretty much cancellable task), so cancellation will change a bit there.

@0101
Contributor Author

0101 commented Jan 16, 2024

Yeah, you're right. I should figure out how to do it without arbitrary time delays. Probably use the events somehow to progress the test.

I've tried to update our FCS fork and also found some instability, seemingly related to request cancellation. My theory is that #16348 may be to blame, but I haven't had an opportunity to look into it further yet. It seems that the previous approach from #16137 works as expected (though some access to metadata is not guarded with cancellation tokens).

@0101 Could you try to check if reverting #16348 would improve stability of the tests?

I wouldn't expect that to be a problem, since the tests don't run in parallel. I just didn't spend much effort on writing the tests properly, so I still want to do that first.

@0101 0101 merged commit 2352770 into dotnet:main Jan 16, 2024
28 checks passed
@Smaug123
Contributor

Smaug123 commented Jan 16, 2024

Up to you, but if you ask me, we should merge now to not end up in a never ending cycle of fixing small things and never merging.

I have a counter-opinion, but I commit very rarely to this repo so downweight it accordingly.

I consider flaky tests to be seriously bad (like "drop everything and fix this; if nothing else just delete the test" bad). Their effect is to very rapidly cause people to distrust all the tests, they greatly increase the friction of contributing, and they essentially prevent most automation on the repo (such as the not rocket science rule); and the reputation of a project "having flaky tests" is extremely hard to reverse. To my surprise, Dan Luu has not actually written directly about this, but he has written on broken builds and the normalisation of deviance.


let processStateUpdate post (key: KeyData<_, _>, action: StateUpdate<_>) =
    task {
        do! Task.Delay 0
Contributor

Task.Delay with 0 ms is a special case and just returns Task.CompletedTask.

Task.Yield ()?

Contributor Author

Whoops, yeah, that's what I wanted. Wonder if it's actually necessary.
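The difference raised in this review thread can be checked with a small sketch. The `workWithYield` function is a hypothetical example and is not code from the PR.

```fsharp
open System.Threading.Tasks

// Task.Delay with 0 ms hands back an already completed task,
// so awaiting it continues synchronously without yielding.
let delayZeroIsCompleted = (Task.Delay 0).IsCompleted

// The awaiter returned by Task.Yield never reports completion,
// so awaiting it always schedules the continuation.
let yieldIsCompleted = Task.Yield().GetAwaiter().IsCompleted

/// Hypothetical example of yielding inside a task expression.
let workWithYield () =
    task {
        do! Task.Yield()
        return 1
    }
```

Whether the yield is actually needed in `processStateUpdate` is left open in the thread; the sketch only shows that `Task.Delay 0` does not provide one.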

psfinaki added a commit that referenced this pull request Jan 25, 2024
* Name resolution: keep type vars in subsequent checks (#16456)

* Keep typars produced in name resolution

* Better debug errors

* Unwrap measure type vars

* Undo check declarations change

* Fix reported range

* Undo occurrence change

* Skip path typars

* Add test

* More freshen typar APIs properly

* Fantomas

* Cleanup

* Add release notes

* 123

---------

Co-authored-by: Vlad Zarytovskii <vzaritovsky@hotmail.com>

* Build benchmarks in CI (#16518)

* Remove profiling startpoint project

* Add bench build job

* Up

* up

* up

---------

Co-authored-by: Kevin Ransom (msft) <codecutter@hotmail.com>

* More ValueOption in compiler: part 1 (#16323)

* More ValueOption in compiler: part 1

* release notes

* Update CheckComputationExpressions.fs

* release notes

* `[Experimental]` `[WIP]` Transparent Compiler (#15179)

* Track CheckDeclarations.CheckModuleSignature activity. (#16534)

* Add Computation Expression Benchmarks (#16541)

* add benchmarks for various usages of CEs

* refactor

* move CE source files to dedicated ce folder

* Update Roslyn to a version which uses Immutable v7 (#16545)

* revert #16326 (addition of XliffTasks reference) (#16548)

* updated devcontainer image (#16551)

* Add higher-order-function-based API for working with untyped AST (#16462)

* Add module-based API for working with untyped AST

* Fantomas

* tryPickUntil → tryPickDownTo

* Don't need that

* Thread path while walking

* Update comment

* Simplify

* Expose `Ast.fold` and `Ast.tryPick`.
* Expose `SyntaxNode.(|Attributes|)`.
* Ensure a few more syntax node cases get hit.

* Update FCS release notes

* Update surface area

* Add back `foldWhile`; add `exists`, `tryNode`

* Put `Ast.foldWhile` back in.

* Add `Ast.exists`.

* Add `Ast.tryNode`.

* `SyntaxTraversal.Traverse` → `Ast.tryPick`…

* Replace uses of `SyntaxTraversal.Traverse` in `FSharpParseFileResults`
  with the appropriate function from the `Ast` module: `exists`,
  `tryPick`, `tryNode`.

* Update surface area

* Need that

* Just to be safe

* Add `Ast.tryPickLast`

* Handle multiple args mid-pipeline

* Before, no signature help was offered in a case like this:

  ```fsharp
  [1..10]
  |> List.fold (fun acc _ -> acc) ‸
  |> List.filter (fun x -> x > 3)
  ```

  The service will now offer help for the `state` parameter when the
  cursor ‸ is in that location.

* `*` instead of error

* `FSharpParseFileResults.TryRangeOfFunctionOrMethodBeingApplied` was
  previously returning the range of the (zero-width)
  `SynExpr.ArbitraryAfterError`. It now returns the range of the `*`
  (`op_Multiply`) instead.

* Update surface area

* Fmt

* Missed in merge

* Add VS release notes entry

* # → ###

* Add `tryPick` tests

* Add a few more tests

* \n

* Bump release notes

* Fmt

* `Ast` → `ParsedInput`

* Use `ParsedInput` as the main AST type.

* Move the `position` parameter rightward.

* Update surface area

* Less `function`

* Update untyped AST docs

* Add basic examples for `ParsedInput` module functions.

* Merge the existing `SyntaxVisitorBase` docs into the new file.

* Clean up doc comments

---------

Co-authored-by: Vlad Zarytovskii <vzaritovsky@hotmail.com>

* Move paren entries to appropriate releases (#16561)

* [main] Update dependencies from dotnet/source-build-reference-packages (#16532)

* Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20240115.2

Microsoft.SourceBuild.Intermediate.source-build-reference-packages
 From Version 9.0.0-alpha.1.24059.3 -> To Version 9.0.0-alpha.1.24065.2

* Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20240116.1

Microsoft.SourceBuild.Intermediate.source-build-reference-packages
 From Version 9.0.0-alpha.1.24059.3 -> To Version 9.0.0-alpha.1.24066.1

* Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20240117.1

Microsoft.SourceBuild.Intermediate.source-build-reference-packages
 From Version 9.0.0-alpha.1.24059.3 -> To Version 9.0.0-alpha.1.24067.1

---------

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Vlad Zarytovskii <vzaritovsky@hotmail.com>

* Attempt to make links from single identifier module names. (#16550)

* Add scenarios where parentheses are around module name.

* Address problem tighter to nameof usage.

* Restore missing commit and inline nameof ident check.

* Add release note entry.

* rewrite SizeOfValueInfo in Optimizer.fs to be tail-recursive (#16559)

* rewrite SizeOfValueInfo in Optimizer.fs to be tail-recursive

* use Brians rewrite into one local function

* stringbuilder is not threadsafe (#16557)

* Array postfix notation in fsharp core api (#16564)

* changed array types to postfix form in all signatures
* changed array types to postfix form in the implementation files

* Revert 16348 (#16536)

* Improve AsyncMemoize tests

* relax test condition

* Revert "Cancellable: set token from node/async in features code (#16348)"

This reverts commit d4e3b26.

* remove UsingToken

* remove UsingToken

* test improvement

* relax test condition

* use thread-safe collections when collecting events from AsyncMemoize

* fix flaky test

* release note

* Small code reshuffle for diff minimization (#16569)

* Moving code around

* Small code reshuffle for diff minimization

* wat

* Refactor parens API (#16461)

* Refactor parens API

* Remove `UnnecessaryParentheses.getUnnecessaryParentheses`.

* Expose `SynExpr.shouldBeParenthesizedInContext`.

* Expose `SynPat.shouldBeParenthesizedInContext`.

* Expose `SyntaxTraversal.TraverseAll`.

* Fantomas

* Use `ParsedInput.fold`

* Tests

* Update surface area

* Clean up sigs & comments

* Update release notes

* Remove redundant async

* Remove stubs (no longer needed)

* Preserve original stacktrace in state machines if available (#16568)

* Preserve original stacktrace in state machines if available

* Update release notes

* Automated command ran: fantomas

  Co-authored-by: vzarytovskii <1260985+vzarytovskii@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* check reportErrors and feature support at top level (#16549)

* Align DU case augmentation with previous behavior in EraseUnions (#16571)

* Align DU case augment with previous behavior in EraseUnions

* Update 8.0.300.md

* modify tests

* Refresh debug surface area (#16573)

* Remove superfluous rec keywords and untangle some functions (#16544)

* remove some superfluous rec keywords and untangle two functions that aren't mutually recursive.

* Don't throw on invalid input in Graph construction (#16575)

* More ValueOption in compiler: part 2 (#16567)

* More ValueOption in compiler: part 2

* Update release notes

* extra optimization

* extra optimization 2

* fantomas

* Update dependencies from https://github.com/dotnet/arcade build 20240123.2 (#16579)

Microsoft.DotNet.Arcade.Sdk
 From Version 8.0.0-beta.24060.4 -> To Version 8.0.0-beta.24073.2

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>

* [main] Update dependencies from dotnet/source-build-reference-packages (#16574)

* Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20240122.5

Microsoft.SourceBuild.Intermediate.source-build-reference-packages
 From Version 9.0.0-alpha.1.24067.1 -> To Version 9.0.0-alpha.1.24072.5

* Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20240123.1

Microsoft.SourceBuild.Intermediate.source-build-reference-packages
 From Version 9.0.0-alpha.1.24067.1 -> To Version 9.0.0-alpha.1.24073.1

---------

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Tomas Grosup <tomasgrosup@microsoft.com>

* Improve AsyncMemoize tests (#16580)

---------

Co-authored-by: Eugene Auduchinok <eugene.auduchinok@gmail.com>
Co-authored-by: Vlad Zarytovskii <vzaritovsky@hotmail.com>
Co-authored-by: Petr <psfinaki@users.noreply.github.com>
Co-authored-by: Kevin Ransom (msft) <codecutter@hotmail.com>
Co-authored-by: Petr Pokorny <petrpokorny@microsoft.com>
Co-authored-by: Florian Verdonck <florian.verdonck@outlook.com>
Co-authored-by: dawe <dawedawe@posteo.de>
Co-authored-by: Tomas Grosup <tomasgrosup@microsoft.com>
Co-authored-by: Martin <29605222+Martin521@users.noreply.github.com>
Co-authored-by: Brian Rourke Boll <brianrourkeboll@users.noreply.github.com>
Co-authored-by: dotnet-maestro[bot] <42748379+dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Jakub Majocha <1760221+majocha@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Labels
NO_RELEASE_NOTES Label for pull requests which signals, that user opted-out of providing release notes