Make command buffer/descriptor set allocators `Sync` again #2046
Made `StandardCommandBufferAllocator` and `StandardDescriptorSetAllocator` thread-safe using TLS.

At first I wanted to implement the TLS myself, using a simple `RwLock<HashMap<ThreadId, T>>`, but I noticed that Windows kept getting into deadlocks with that for some reason. So instead I opted for an established library, `thread_local`, which was added to the private dependencies. It's a very lightweight library and also more optimized than what I had in mind. For one, it doesn't need to acquire any read locks, and the entries in the TLS can be reused. This would not have been the case with what I tried, as threads would always drop their allocator on exit, and a new thread would create a brand-new allocator for itself when it first allocates. This way, threads that come and go can reuse the existing allocators.

Another good thing that came from trying this library is that, after benchmarking, there's almost no overhead to using this TLS over none at all. It's 6 ns of overhead on my CPU, and granted, my CPU is very dated. Therefore, I think it makes no sense to have a non-`Sync` version at all if we're going to use this library. The only performance implication is when a lot of threads need to access the TLS concurrently, which leads to CPU-level contention on the `AtomicPtr`s that the TLS uses. But even with the unreasonably high contention I tested it with (all of my threads going at it), allocation times only doubled. If this is an issue, the user still has the option to simply not share the allocators between threads.