-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: make timers faster #6239
Comments
Optimization of the timer heap can give up to 10%. On some of our prod servers we see 5-8% of time in siftdown. But the real problem is the global mutex. We need:
Here is a prototype for 1 and 2: Distribution is somewhat tricky. We will need hierarchical epoll descriptors, and kqueue/IOCP does not support hierarchical descriptors AFAICT. For timers we will need to somehow block on global minimum time, rather than on per-P minimum time. |
@dvyukov How does CL 5760 "get rid of the timer thread"? There is a |
timerproc blocks in notetsleepg, which consumes a thread, so timerproc is also always a thread. |
Thanks for the quick explanation. If I am reading it correctly, at least |
Yes, it hands off P, but it still consumes thread and causes 2 context switches per timer. |
Change https://golang.org/cl/171883 mentions this issue: |
Change https://golang.org/cl/171829 mentions this issue: |
Change https://golang.org/cl/171827 mentions this issue: |
Change https://golang.org/cl/171826 mentions this issue: |
Change https://golang.org/cl/171821 mentions this issue: |
Change https://golang.org/cl/171828 mentions this issue: |
Change https://golang.org/cl/171884 mentions this issue: |
This new facility will be used by future CLs in this series. Change the only blocking call to netpoll to do the right thing when netpoll returns an empty list. Updates #6239 Updates #27707 Change-Id: I58b3c2903eda61a3698b1a4729ed0e81382bb1ed Reviewed-on: https://go-review.googlesource.com/c/go/+/171821 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Add support to the main scheduler loop for handling timers on P's. This is not used yet, as timers are not yet put on P's. Updates #6239 Updates #27707 Change-Id: I6a359df408629f333a9232142ce19e8be8496dae Reviewed-on: https://go-review.googlesource.com/c/go/+/171826 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
When we put timers on P's, the wasm code will not be able to rely on the timer goroutine. Use the beforeIdle hook to schedule a wakeup. Updates #6239 Updates #27707 Change-Id: Idf6309944778b8c3d7178f5d09431940843ea233 Reviewed-on: https://go-review.googlesource.com/c/go/+/171827 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Updates #6239 Updates #27707 Change-Id: I52cab8bf3dc8c552463725fc1d9e4e6b12230b03 Reviewed-on: https://go-review.googlesource.com/c/go/+/171828 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Add new fields to runtime.timer, and adjust the various timer functions in preparation for adding timers to P's. This continues to use the old timer code. Updates #6239 Updates #27707 Change-Id: I9adb3814f657e083ec5e22736c4b5b52b77b6a3f Reviewed-on: https://go-review.googlesource.com/c/go/+/171829 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Change https://golang.org/cl/203890 mentions this issue: |
Change https://golang.org/cl/204045 mentions this issue: |
This fixes the Plan 9 support for the new timer code. Updates #6239 Updates #27707 Change-Id: Ia498c399b8924910b97fcde07545fae3588aad47 Reviewed-on: https://go-review.googlesource.com/c/go/+/204045 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Updates #6239 Updates #27707 Change-Id: I0a62c1374db485dd830bf02e59625997d9247fc3 Reviewed-on: https://go-review.googlesource.com/c/go/+/203890 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
No big changes in the runtime package benchmarks. Changes in the time package benchmarks: name old time/op new time/op delta AfterFunc-12 1.57ms ± 1% 0.07ms ± 1% -95.42% (p=0.000 n=10+8) After-12 1.63ms ± 3% 0.11ms ± 1% -93.54% (p=0.000 n=9+10) Stop-12 78.3µs ± 3% 73.6µs ± 3% -6.01% (p=0.000 n=9+10) SimultaneousAfterFunc-12 138µs ± 1% 111µs ± 1% -19.57% (p=0.000 n=10+9) StartStop-12 28.7µs ± 1% 31.5µs ± 5% +9.64% (p=0.000 n=10+7) Reset-12 6.78µs ± 1% 4.24µs ± 7% -37.45% (p=0.000 n=9+10) Sleep-12 183µs ± 1% 125µs ± 1% -31.67% (p=0.000 n=10+9) Ticker-12 5.40ms ± 2% 0.03ms ± 1% -99.43% (p=0.000 n=10+10) Sub-12 114ns ± 1% 113ns ± 3% ~ (p=0.069 n=9+10) Now-12 37.2ns ± 1% 36.8ns ± 3% ~ (p=0.287 n=8+8) NowUnixNano-12 38.1ns ± 2% 37.4ns ± 3% -1.87% (p=0.020 n=10+9) Format-12 252ns ± 2% 195ns ± 3% -22.61% (p=0.000 n=9+10) FormatNow-12 234ns ± 1% 177ns ± 2% -24.34% (p=0.000 n=10+10) MarshalJSON-12 320ns ± 2% 250ns ± 0% -21.94% (p=0.000 n=8+8) MarshalText-12 320ns ± 2% 245ns ± 2% -23.30% (p=0.000 n=9+10) Parse-12 206ns ± 2% 208ns ± 4% ~ (p=0.084 n=10+10) ParseDuration-12 89.1ns ± 1% 86.6ns ± 3% -2.78% (p=0.000 n=10+10) Hour-12 4.43ns ± 2% 4.46ns ± 1% ~ (p=0.324 n=10+8) Second-12 4.47ns ± 1% 4.40ns ± 3% ~ (p=0.145 n=9+10) Year-12 14.6ns ± 1% 14.7ns ± 2% ~ (p=0.112 n=9+9) Day-12 20.1ns ± 3% 20.2ns ± 1% ~ (p=0.404 n=10+9) Updates #6239 Updates #27707 Change-Id: I51e25a90f941574f1a9cf83a22e84ac8c678537d Reviewed-on: https://go-review.googlesource.com/c/go/+/171883 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Change https://golang.org/cl/204280 mentions this issue: |
The addAdjustedTimers function was a late addition, and it got some of the state machine wrong, leading to failures like https://storage.googleapis.com/go-build-log/930576b6/windows-amd64-2016_53d0319e.log Updates #6239 Updates #27707 Change-Id: I9e94e563b4698ff3035ce609055ca292b9cab3df Reviewed-on: https://go-review.googlesource.com/c/go/+/204280 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Change https://golang.org/cl/204717 mentions this issue: |
Change https://golang.org/cl/204800 mentions this issue: |
If multiple goroutines call time.(*Timer).Reset then the timer will go from timerWaiting to timerDeleted to timerModifying to timerModifiedLater. The timer can be on a different P, meaning that simultaneously cleantimers could change it from timerDeleted to timerRemoving to timerRemoved. If Reset sees timerRemoved, it was doing an atomic.Store of timerWaiting, meaning that it did not necessarily see the other values set in the timer, so the timer could appear to be in an inconsistent state. Use atomic.Cas to avoid that possibility. Updates #6239 Updates #27707 Fixes #35272 Change-Id: I1d59a13dc4f2ff4af110fc6e032c8c9d59cfc270 Reviewed-on: https://go-review.googlesource.com/c/go/+/204717 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
When dropping a P, if it has any timers, and if some thread is sleeping in the netpoller, wake the netpoller to run the P's timers. This mitigates races between the netpoller deciding how long to sleep and a new timer being added. In sysmon, if all P's are idle, check the timers to decide how long to sleep. This avoids oversleeping if no thread is using the netpoller. This can happen in particular if some threads use runtime.LockOSThread, as those threads do not block in the netpoller. Also, print the number of timers per P for GODEBUG=scheddetail=1. Before this CL, TestLockedDeadlock2 would fail about 1% of the time. With this CL, I ran it 150,000 times with no failures. Updates #6239 Updates #27707 Fixes #35274 Fixes #35288 Change-Id: I7e5193e6c885e567f0b1ee023664aa3e2902fcd1 Reviewed-on: https://go-review.googlesource.com/c/go/+/204800 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Change https://golang.org/cl/206938 mentions this issue: |
In the discussion of CL 171828 we decided that it was not necessary to acquire timersLock around the call to moveTimers, because the world is stopped. However, that is not correct, as sysmon runs even when the world is stopped, and it calls timeSleepUntil which looks through the timers. timeSleepUntil acquires timersLock, but that doesn't help if moveTimers is running at the same time. Updates #6239 Updates #27707 Updates #35462 Change-Id: I346c5bde594c4aff9955ae430b37c2b6fc71567f Reviewed-on: https://go-review.googlesource.com/c/go/+/206938 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Change https://golang.org/cl/207348 mentions this issue: |
Dan Scales pointed out a theoretical deadlock in the runtime. The timer code runs timer functions while holding the timers lock for a P. The scavenger queues up a timer function that calls wakeScavenger, which acquires the scavenger lock. The scavengeSleep function acquires the scavenger lock, then calls resetTimer which can call addInitializedTimer which acquires the timers lock for the current P. So there is a potential deadlock, in that the scavenger lock and the timers lock for some P may both be acquired in different order. It's not clear to me whether this deadlock can ever actually occur. Issue 35532 describes another possible deadlock. The pollSetDeadline function acquires pd.lock for some poll descriptor, and in some cases calls resettimer which can in some cases acquire the timers lock for the current P. The timer code runs timer functions while holding the timers lock for a P. The timer function for poll descriptors winds up in netpolldeadlineimpl which acquires pd.lock. So again there is a potential deadlock, in that the pd lock for some poll descriptor and the timers lock for some P may both be acquired in different order. I think this can happen if we change the deadline for a network connection exactly as the former deadline expires. Looking at the code, I don't see any reason why we have to hold the timers lock while running a timer function. This CL implements that change. Updates #6239 Updates #27707 Fixes #35532 Change-Id: I17792f5a0120e01ea07cf1b2de8434d5c10704dd Reviewed-on: https://go-review.googlesource.com/c/go/+/207348 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Updates #6239 Updates #27707 Change-Id: I65e6471829c9de4677d3ac78ef6cd7aa0a1fc4cb Reviewed-on: https://go-review.googlesource.com/c/go/+/171884 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
The ideas suggested here have been implemented for 1.14. Any further timer speedups can be done in new issues. |
Change https://golang.org/cl/209580 mentions this issue: |
We were using the race context of the P that held the timer, but since we unlock the P's timers while executing a timer that could lead to a race on the race context itself. Updates #6239 Updates #27707 Fixes #35906 Change-Id: I5f9d5f52d8e28dffb88c3327301071b16ed1a913 Reviewed-on: https://go-review.googlesource.com/c/go/+/209580 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>
Yes. Interruptible sleep is often modelled using select, kqueue, or epoll, all of which started life as ways to poll a set of non blocking file descriptors, of which a network fd was the most common as cross platform async file io is a unicorn. |
The text was updated successfully, but these errors were encountered: