Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Put file writing behind IO semaphore #557

Merged
merged 6 commits into from
Mar 19, 2025
Merged

Conversation

jakebailey
Copy link
Member

@jakebailey jakebailey commented Mar 13, 2025

Fixes #525

Apparently, file writing also consumes locks on macOS; stick it behind the semaphore too.

@jakebailey
Copy link
Member Author

Yeah, I think no effect on Windows. But I also only see like a 5% speedup on macOS in a microbenchmark so IDK.

@DanielRosenwasser
Copy link
Member

When I tried this out with Windows, I also didn't see an effect. Do you think it's worth adding the semaphore for all platforms, or only MacOS?

Also, I would definitely rename it to osOpenSema and leave a comment on why it's useful on the ReadFile side

@DanielRosenwasser
Copy link
Member

@hrueger any chance you can try out this branch?

@hrueger
Copy link

hrueger commented Mar 13, 2025

grafik
Sure, here you go 👍

@jakebailey
Copy link
Member Author

jakebailey commented Mar 13, 2025

In your before case, do you mind running with --pprofDir=. and then upload the profiles to https://pprof.host? Or just uploading it to GitHub here?

@hrueger
Copy link

hrueger commented Mar 13, 2025

I assume you need the cpuprofile? https://pprof.host/jr4g/
Do you need the memory profile, too?

@jakebailey
Copy link
Member Author

CPU is great. Yeah, this is wild:

image

I'd be interested in the after CPU profile, too. Hopefully syscall isn't there.

@hrueger
Copy link

hrueger commented Mar 13, 2025

This was done with d94208e checked out: https://pprof.host/jw4g/

@jakebailey
Copy link
Member Author

Hmm, better, but the syscalls are still there a lot more than I would have expected.

If you're really bored, I'd be interested if you could try changing openFileSema's size from 128 to something smaller and see if there's a number which appears to work better on your machine.

@hrueger
Copy link

hrueger commented Mar 13, 2025

Haha 😅 I just played around a bit and it is really strange. Unsure if I'm doing something completely wrong:

hannes@MacBook-Pro-von-Hannes-312 src % echo "128 (default)"
128 (default)
hannes@MacBook-Pro-von-Hannes-312 src % ../../typescript-go/built/local/tsgo --pprofDir=.
Files:             4809
Types:          1657223
Parse time:      0.533s
Bind time:       0.037s
Check time:      2.533s
Emit time:       2.264s
Total time:      5.368s
Memory used:   3007793K
Memory allocs: 28509597
CPU profile: 26897-cpuprofile.pb.gz
Memory profile: 26897-memprofile.pb.gz
hannes@MacBook-Pro-von-Hannes-312 src % echo 64
64
hannes@MacBook-Pro-von-Hannes-312 src % ../../typescript-go/built/local/tsgo --pprofDir=.
Files:             4809
Types:          1657223
Parse time:      0.547s
Bind time:       0.038s
Check time:      2.430s
Emit time:       1.115s
Total time:      4.131s
Memory used:   3004652K
Memory allocs: 28438633
CPU profile: 27272-cpuprofile.pb.gz
Memory profile: 27272-memprofile.pb.gz
hannes@MacBook-Pro-von-Hannes-312 src % echo 32
32
hannes@MacBook-Pro-von-Hannes-312 src % ../../typescript-go/built/local/tsgo --pprofDir=.
Files:             4809
Types:          1657223
Parse time:      0.565s
Bind time:       0.036s
Check time:      2.323s
Emit time:       0.557s
Total time:      3.482s
Memory used:   3006260K
Memory allocs: 28514345
CPU profile: 27585-cpuprofile.pb.gz
Memory profile: 27585-memprofile.pb.gz
hannes@MacBook-Pro-von-Hannes-312 src % echo 16
16
hannes@MacBook-Pro-von-Hannes-312 src % ../../typescript-go/built/local/tsgo --pprofDir=.
Files:             4809
Types:          1657223
Parse time:      0.549s
Bind time:       0.088s
Check time:      2.387s
Emit time:       0.431s
Total time:      3.456s
Memory used:   3006638K
Memory allocs: 28517224
CPU profile: 28009-cpuprofile.pb.gz
Memory profile: 28009-memprofile.pb.gz
hannes@MacBook-Pro-von-Hannes-312 src % echo 8                                           
8
hannes@MacBook-Pro-von-Hannes-312 src % ../../typescript-go/built/local/tsgo --pprofDir=.
Files:             4809
Types:          1657223
Parse time:      0.524s
Bind time:       0.038s
Check time:      2.440s
Emit time:       0.405s
Total time:      3.408s
Memory used:   3005708K
Memory allocs: 28456760
CPU profile: 28361-cpuprofile.pb.gz
Memory profile: 28361-memprofile.pb.gz
hannes@MacBook-Pro-von-Hannes-312 src % echo 4                                           
4
hannes@MacBook-Pro-von-Hannes-312 src % ../../typescript-go/built/local/tsgo --pprofDir=.
Files:             4809
Types:          1657223
Parse time:      0.504s
Bind time:       0.038s
Check time:      2.416s
Emit time:       0.414s
Total time:      3.372s
Memory used:   3003958K
Memory allocs: 28462450
CPU profile: 28885-cpuprofile.pb.gz
Memory profile: 28885-memprofile.pb.gz
hannes@MacBook-Pro-von-Hannes-312 src % echo 1
1
hannes@MacBook-Pro-von-Hannes-312 src % ../../typescript-go/built/local/tsgo --pprofDir=.
Files:             4809
Types:          1657223
Parse time:      0.486s
Bind time:       0.065s
Check time:      2.433s
Emit time:       0.496s
Total time:      3.479s
Memory used:   3002193K
Memory allocs: 28333024
CPU profile: 29302-cpuprofile.pb.gz
Memory profile: 29302-memprofile.pb.gz

26897-128_cpuprofile.pb.gz
27272-64_cpuprofile.pb.gz
27585-32_cpuprofile.pb.gz
28009-16_cpuprofile.pb.gz
28361-8_cpuprofile.pb.gz
28885-4_cpuprofile.pb.gz
29302-1_cpuprofile.pb.gz

It seems like its not getting faster nor slower between 32 and 1...

@jakebailey
Copy link
Member Author

Wow, it's incredible that setting it so low makes us faster. We should set it low if that's what macOS works best with, of course.

The scaling to me means:

  • Jarred's post was correct, and macOS has a big mutex on file IO.
  • That mutex stops scaling at around 32 users.

I couldn't reproduce this problem on my M1, so I don't really know how to evaluate the right option here.

@hrueger
Copy link

hrueger commented Mar 13, 2025

I can see if I can get two friends with an M2 and an M4 to test. I can try on a Mac Mini M1 tomorrow.

@jakebailey jakebailey changed the title Put file writing behind IO semaphore on macOS Put file writing behind IO semaphore Mar 13, 2025
@hrueger
Copy link

hrueger commented Mar 15, 2025

Sorry for the delay. I just tried it on my M1 Mac Mini (lowest spec) and it behaves quite differently. It does not seem to have as big of a problem without the sephamore (current main branch):

runner@Mini-von-GitHub src % ../../typescript-go/built/local/tsgo 
Files:             4846
Types:          1661932
Parse time:      0.919s
Bind time:       0.131s
Check time:      4.299s
Emit time:       3.495s
Total time:      8.845s
Memory used:   3019654K
Memory allocs: 28656079

but it still benefits from the changes in this branch:

runner@Mini-von-GitHub src % ../../typescript-go/built/local/tsgo
Files:             4846
Types:          1661932
Parse time:      0.927s
Bind time:       0.092s
Check time:      3.685s
Emit time:       1.256s
Total time:      5.959s
Memory used:   3015549K
Memory allocs: 28622953

@bep
Copy link

bep commented Mar 16, 2025

A comment in from the side line here (but I do have a fair amount experience with concurrency in Go).

I suspect that a better fix would be to add some upper bound (numWorkers) to the workgroup you use for emitting (

func NewWorkGroup(singleThreaded bool) WorkGroup {
)

@jakebailey
Copy link
Member Author

jakebailey commented Mar 16, 2025

Hi @bep! Huge fan of hugo 😄

Early on, we found that limiting the work itself was slower than just letting Go schedule everything; maybe that's no longer true, though.

FWIW this IO semaphore trick is something gopls does, though it doesn't have to worry about writing out to disk or anything, only frequent file reads (which I introduced early on here too since it helped so much).

@sandersn
Copy link
Member

I reproduced this by compiling vscode on my dual-corp-enrolled M3 Pro (running both Microsoft and Github security software):

on main (plugged in)
23.7, 34.7, 36.5, 23.2, 5.2, 35.8, 5.3, 30.2, 31.7, 18.5, 28.8

on this branch (plugged in)
3.3, 0.8, 1.3, 2.0, 0.8, 1.5, 1.5, 0.8, 1.1, 1.3, 1.2

@jakebailey jakebailey added this pull request to the merge queue Mar 19, 2025
Merged via the queue into main with commit ffa0582 Mar 19, 2025
21 checks passed
@jakebailey jakebailey deleted the jabaile/macos-read-sema branch March 19, 2025 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Emit Time Pretty Slow on M3 Max MacBook
6 participants