Skip to content

Concurrent type checking #202

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jan 8, 2025
Merged

Concurrent type checking #202

merged 10 commits into from
Jan 8, 2025

Conversation

ahejlsberg
Copy link
Member

@ahejlsberg ahejlsberg commented Jan 1, 2025

This PR introduces concurrent type checking. By default, we now create four type checkers and divide programs into four equal parts that are type checked concurrently. Diagnostics are then collected from each of the checkers and merged into a single set of diagnostics. The division into equal parts is determined from the index of each particular source file in the program modulo the number of checkers. This means the partitioning is deterministic for the same set of source files, which in turn means the produced set of diagnostics remains stable across compilations. The set of diagnostics may however differ from a single-threaded run of the compiler since the compiler may choose different locations for elaborations.

Concurrent type checking isn't as perfectly parallelizable as parsing and binding because checking often resolves types across source file boundaries, potentially duplicating work between the four checkers. Indeed, concurrent type checking always consumes more memory (because of the duplicated types) and in the worst case is no faster than single-threaded checking. However, in practice, concurrent type checking appears to have a significant positive effect. For example, with single-threaded parsing, binding, and checking, the compile time for VSCode is around 12s, and with concurrent parsing and binding, but single threaded checking (as was the case before this PR), the compile time is 8.5s. With full concurrent parsing, binding, and type checking, the compile time drops to 4s at a cost of consuming 20% more memory. (Measured on an 8-core Intel Core i9.)

Note that four type checkers were experimentally chosen as a good balance between performance and memory consumption. With eight type checkers, VSCode compiles about 10% faster, but memory consumption further increases by 20%.

Also note that #200 is a prerequisite for this PR. Without #200, types are ordered by type IDs, which become non-deterministic with concurrent checking.

Comment on lines -45 to +49
var nextMergeId atomic.Uint32
var (
symbolMergeMutex sync.Mutex
nextMergeId uint32
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With concurrent checkers, are we going to have a problem here? Exhausting int32 globally with parallel checking doesn't seem infeasible at all anymore...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would take 4+ billion merged symbols, and then we'd really only have a problem if some of the symbols from 4+ billion in the past are still around to cause conflicts. I'm not too worried.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the old compiler had a max of 2^52, that's a lot of headroom we had before that could have been insulating us from this problem; I suspect that someone typing in a large project could actually get up to this limit, honestly...

# Conflicts:
#	internal/compiler/checker_test.go
#	internal/compiler/program.go
@DanielRosenwasser
Copy link
Member

DanielRosenwasser commented Jan 7, 2025

Will there be a way to run everything else in parallel, but checking in a single-threaded manner? Or to change the split-up count?

# Conflicts:
#	internal/compiler/checker_test.go
#	internal/compiler/program.go
#	internal/compiler/utilities.go
Comment on lines +54 to 55
symbolMergeMutex.Lock()
if symbol.MergeId == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially confused about this, until I realized that the race is on symbol.MergeId itself (and not necessarily the global), since symbols can be shared between checkers.

@jakebailey
Copy link
Member

Will there be a way to run everything else in parallel, but checking in a single-threaded manner? Or to change the split-up count?

Not in this PR there isn't; it's possible we could add some sort of flag to do that, since "concurrent checking" is the most dangerous thing comparatively.

@ahejlsberg ahejlsberg merged commit a172481 into main Jan 8, 2025
14 checks passed
@jakebailey jakebailey deleted the concurrent-checking branch January 13, 2025 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants