Description
This bug report was migrated from our old Bugzilla tracker.
These attachments are available in the static archive:
- Benchmark results of Condition Variables implementation (not this patchset) (srwcond.png, image/png, 2020-12-10 09:55:30 +0000, 40418 bytes)
- Benchmark results of Atomics+WaitOnAddr implementation (this patchset) (atomwaitaddr.png, image/png, 2020-12-10 09:56:44 +0000, 42679 bytes)
- 0001 Semaphore test: Put test into separate function. (0001-Semaphore-test-Put-test-into-separate-function.patch, text/plain, 2020-12-10 09:57:59 +0000, 4694 bytes)
- 0002 Semaphore test: Add overhead tests. (0002-Semaphore-test-Add-overhead-tests.patch, text/plain, 2020-12-10 09:59:09 +0000, 5761 bytes)
- 0003 Atomic test: Fix use after free (0003-Atomic-test-Fix-use-after-free.patch, text/plain, 2020-12-10 09:59:38 +0000, 2488 bytes)
0004 Add SDL_sem implementation using Atomics and WaitOnAddress API. (0004-Add-SDL_sem-implementation-using-Atomics-and-WaitOnA.patch, text/plain, 2020-12-10 10:00:22 +0000, 11758 bytes)Add SDL_sem implementation using Atomics and WaitOnAddress API v2 (0004-Add-SDL_sem-implementation-using-Atomics-and-WaitOnA.patch, text/plain, 2020-12-13 12:37:37 +0000, 11851 bytes)- 0004 Add SDL_sem impl using Atomics and WaitOnAddress API v3 (0004-Add-SDL_sem-implementation-using-Atomics-and-WaitOnA.patch, text/plain, 2020-12-17 20:45:20 +0000, 12282 bytes)
Reported in version: 2.0.13
Reported for operating system, platform: Windows 10, All
Comments on the original bug report:
On 2020-12-10 09:54:06 +0000, Joel Linn wrote:
Before this, the SDL_sem implementation on Windows uses the
CreateSemaphore()
API. The returned handles however are (inter-process) Kernel Objects and require a context switch on every interaction.I propose an implementation based on Atomics and the
WaitOnAddress
API. (I made an implementation using Windows Condition Variables as well but it didn't perform as well) Though I am not the first one to think about this: https://devblogs.microsoft.com/oldnewthing/20170612-00/?p=96375I expanded the semaphore test case to measure the overhead of a hot semaphore and attached the results.
testsem
was run 40k times, changing the implementation back and forth between every run. Blue always is the current implementation using Kernel Objects. Mind the y axis is logarithmic so a gauss distribution will look like an inverted parabola.
On 2020-12-10 09:55:30 +0000, Joel Linn wrote:
Created attachment 4565
Benchmark results of Condition Variables implementation (not this patchset)
On 2020-12-10 09:56:44 +0000, Joel Linn wrote:
Created attachment 4566
Benchmark results of Atomics+WaitOnAddr implementation (this patchset)
On 2020-12-10 09:57:59 +0000, Joel Linn wrote:
Created attachment 4567
0001 Semaphore test: Put test into separate function.
On 2020-12-10 09:59:09 +0000, Joel Linn wrote:
Created attachment 4568
0002 Semaphore test: Add overhead tests.
On 2020-12-10 09:59:38 +0000, Joel Linn wrote:
Created attachment 4569
0003 Atomic test: Fix use after free
On 2020-12-10 10:00:22 +0000, Joel Linn wrote:
Created attachment 4570
0004 Add SDL_sem implementation using Atomics and WaitOnAddress API.
On 2020-12-13 03:23:16 +0000, Sam Lantinga wrote:
Thanks, we'll add this after 2.0.14 release.
On 2020-12-13 12:37:37 +0000, Joel Linn wrote:
Created attachment 4577
Add SDL_sem implementation using Atomics and WaitOnAddress API v2v2: - Fix mixed int/LONG types
- Reorder definitions
- Add missing include
On 2020-12-17 20:45:20 +0000, Joel Linn wrote:
Created attachment 4584
0004 Add SDL_sem impl using Atomics and WaitOnAddress API v3v3: - Use
GetModuleHandle()
to load the API SetI wasn't sure before what the correct way to dyn load an API Set is. It is not documented by MS but I found a discussion about it on the MS C++ STL issue tracker. microsoft/STL#593 (comment)
Two things need to be satisfied to be able to legally load an API Set dynamically using
GetModuleHandle()
:
- It needs to be legal to call the function on an API Set in the first place. It's nowhere documented explicitly but a senior from MS confirmed it's legal and they do it as well in their STL implementation (see github link, they discussed it in length).
- The module that implements the API set (currently KernelBase.dll but that is an implementation detail of W10 Desktop and may change.) needs to be loaded already. In our case this is guaranteed because we already link to other functions of that API Set statically (like WaitForSingleObject, see https://docs.microsoft.com/en-us/uwp/win32-and-com/win32-apis#apis-from-api-ms-win-core-synch-l1-2-0dll for a list).
On 2020-12-23 21:45:11 +0000, Sam Lantinga wrote:
These patches are in, thanks!
https://hg.libsdl.org/SDL/rev/bf345b3e71cc
https://hg.libsdl.org/SDL/rev/392cf406c64f
https://hg.libsdl.org/SDL/rev/df6cefaccc75
https://hg.libsdl.org/SDL/rev/2ed27b25bff2