-
Notifications
You must be signed in to change notification settings - Fork 587
Concurrent type checking #202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
var nextMergeId atomic.Uint32 | ||
var ( | ||
symbolMergeMutex sync.Mutex | ||
nextMergeId uint32 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With concurrent checkers, are we going to have a problem here? Exhausting int32 globally with parallel checking doesn't seem infeasible at all anymore...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would take 4+ billion merged symbols, and then we'd really only have a problem if some of the symbols from 4+ billion in the past are still around to cause conflicts. I'm not too worried.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the old compiler had a max of 2^52
, that's a lot of headroom we had before that could have been insulating us from this problem; I suspect that someone typing in a large project could actually get up to this limit, honestly...
# Conflicts: # internal/compiler/checker_test.go # internal/compiler/program.go
Will there be a way to run everything else in parallel, but checking in a single-threaded manner? Or to change the split-up count? |
# Conflicts: # internal/compiler/checker_test.go # internal/compiler/program.go # internal/compiler/utilities.go
symbolMergeMutex.Lock() | ||
if symbol.MergeId == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially confused about this, until I realized that the race is on symbol.MergeId
itself (and not necessarily the global), since symbols can be shared between checkers.
Not in this PR there isn't; it's possible we could add some sort of flag to do that, since "concurrent checking" is the most dangerous thing comparatively. |
This PR introduces concurrent type checking. By default, we now create four type checkers and divide programs into four equal parts that are type checked concurrently. Diagnostics are then collected from each of the checkers and merged into a single set of diagnostics. The division into equal parts is determined from the index of each particular source file in the program modulo the number of checkers. This means the partitioning is deterministic for the same set of source files, which in turn means the produced set of diagnostics remains stable across compilations. The set of diagnostics may however differ from a single-threaded run of the compiler since the compiler may choose different locations for elaborations.
Concurrent type checking isn't as perfectly parallelizable as parsing and binding because checking often resolves types across source file boundaries, potentially duplicating work between the four checkers. Indeed, concurrent type checking always consumes more memory (because of the duplicated types) and in the worst case is no faster than single-threaded checking. However, in practice, concurrent type checking appears to have a significant positive effect. For example, with single-threaded parsing, binding, and checking, the compile time for VSCode is around 12s, and with concurrent parsing and binding, but single threaded checking (as was the case before this PR), the compile time is 8.5s. With full concurrent parsing, binding, and type checking, the compile time drops to 4s at a cost of consuming 20% more memory. (Measured on an 8-core Intel Core i9.)
Note that four type checkers were experimentally chosen as a good balance between performance and memory consumption. With eight type checkers, VSCode compiles about 10% faster, but memory consumption further increases by 20%.
Also note that #200 is a prerequisite for this PR. Without #200, types are ordered by type IDs, which become non-deterministic with concurrent checking.