Vulkan threading tweaks and minor #10049
Conversation
This at least allows us to detect that the backend failed to init. Happens when switching backends with the debugger attached (probably a driver bug?)
We end up with a second thread starting at frame 1, so the thread needs to start at frame 1 too.
Cool. I'll look at this in detail later and try it on mobile, too.
From some quick testing this seems very promising. Good job finding my bugs :) There are some things that we'll need to fix. For example, when running in debug mode I run into this validation error on startup:
This is because the main thread calls g_Vulkan->WaitUntilQueueIdle() from WindowsVulkanContext::Resize, while the submit thread is submitting work to the queue. We need to delegate that to the thread as well somehow. God of War asserts:
for some reason; needs looking into. Anyway, very cool. Will try on mobile soon, where this should really help a lot.
Works on Nexus 5X; very unscientific quick testing seems to indicate something like a 20-30% speed boost! Time to rip out the old multithreading code... and maybe I should try something like this for GL too, though I'm not sure I can be bothered. Regarding the old multithreading code, I'm thinking of ripping out the option, the synchronization, and the event queue stuff in the GPU, but keeping the code to spin up a separate CPU thread, because that might have to be the way to go for any sort of GL deferred drawing: we must draw on the thread we spin up from Java, while Vulkan is entirely thread-agnostic in a much nicer way.
We don't want a dependency on the thread state, of course.
We'll need to refcount framebuffers or have them deleted on the render side - GoW is hitting that. Opinion on using RefCountedObject, or? Part of the issue is the ... I'm thinking we might as well just mutex WaitUntilQueueIdle. Yeah, the problem with the old option is that at the time I didn't realize games could rely on drawsync timing so heavily, and wanted to move off the overhead of list decoding if possible. Agreed on removing, except the actual thread separation, although that'll need changes to target a queue runner rather than the GPU. Removing it might also improve performance a very tiny bit, since there's some overhead in places for the threadsync in the GPU... -[Unknown]
Refcount seems fine, but we'll have to make sure to use atomic increments and decrements. Or we could add a deleter queue to the thin3d VulkanContext. So many abstractions... I'm fine with mutexing WaitUntilQueueIdle. Yup, I didn't realize that either. And DL decoding is a cost, but it turns out that on mobile, queueing up GL or Vulkan operations is more expensive, so that's the part we want on a thread the most anyway. I don't expect much improvement from getting rid of the overhead, but I guess every bit helps...
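For illustration, a minimal sketch of the "just mutex it" idea discussed here, assuming a shared mutex guarding every touch of the queue (the GuardedQueue wrapper and its names are hypothetical, not the actual PPSSPP code):

```cpp
// Hypothetical sketch only. Vulkan requires external synchronization on a VkQueue,
// so every submit and every wait-idle from any thread goes through the same mutex.
#include <mutex>
#include <vulkan/vulkan.h>

class GuardedQueue {
public:
	explicit GuardedQueue(VkQueue queue) : queue_(queue) {}

	// Submit thread: the record-and-submit path.
	VkResult Submit(const VkSubmitInfo &info, VkFence fence) {
		std::lock_guard<std::mutex> lock(mutex_);
		return vkQueueSubmit(queue_, 1, &info, fence);
	}

	// Main thread: e.g. a resize path that wants the queue drained.
	void WaitUntilIdle() {
		std::lock_guard<std::mutex> lock(mutex_);
		vkQueueWaitIdle(queue_);
	}

private:
	VkQueue queue_;
	std::mutex mutex_;
};
```

The point is simply that vkQueueSubmit and vkQueueWaitIdle both touch the same VkQueue, so serializing them behind one lock removes the race the validation layer complains about.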
Fixes crash in GoW when using a thread.
Needed when the last rendered FB needs to be downloaded.
Otherwise it's only done after destroying Draw, so no need to mutex.
Force-pushed from df11f34 to 5802747 (Compare)
@@ -218,6 +218,8 @@ void VulkanRenderManager::DestroyBackbuffers() {
		}
		thread_.join();
	}
	vulkan_->WaitUntilQueueIdle();
Yeah, this should be a better place for this.
auto &frameData = frameData_[i];
frameData.readyForRun = false;
if (!shutdown && !frameData.readyForFence) {
	vkDestroyFence(vulkan_->GetDevice(), frameData.fence, nullptr);
What about vkResetFences? Unless we zap the whole device, fences should be reusable.
(Hm, I made this comment before but it got lost?)
EDIT: No wait, we want them signalled. The comment above should be something like "Pre-signal the fences for next time around." in that case.
Oops, yeah. Typed that when I had only moved the flags so far...
-[Unknown]
When resizing or similar, we may end up with frames we never ran. This also happens on startup. We need them signaled at start so we can wait on them, or we may deadlock.
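As a hedged illustration of the pre-signaled fence approach (the helper names below are made up for the sketch): Vulkan lets a fence be created already signaled via VK_FENCE_CREATE_SIGNALED_BIT, so waiting on a fence for a frame that never ran returns immediately instead of deadlocking.

```cpp
// Sketch of the "pre-signal the fences" idea, not the actual patch.
// A fence created signaled can be waited on safely even if it was never submitted;
// vkResetFences un-signals it right before it is handed to vkQueueSubmit again.
#include <cstdint>
#include <vulkan/vulkan.h>

VkFence CreateFrameFence(VkDevice device) {
	VkFenceCreateInfo info{};
	info.sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO;
	info.flags = VK_FENCE_CREATE_SIGNALED_BIT;  // start out signaled
	VkFence fence = VK_NULL_HANDLE;
	vkCreateFence(device, &info, nullptr, &fence);
	return fence;
}

void WaitAndRecycle(VkDevice device, VkFence fence) {
	// Safe even if no submit ever referenced this fence, because it began signaled.
	vkWaitForFences(device, 1, &fence, VK_TRUE, UINT64_MAX);
	vkResetFences(device, 1, &fence);  // un-signal before reuse
}
```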
Force-pushed from 49109c1 to ead4c5f (Compare)
Now there's just the matter of whether this should be an option, and whether it should be on by default perhaps...
@unknownbrackets Having a strange issue now where in Release mode it hangs right on startup, while it works in debug. Do you see that too? EDIT: It's some race when recreating the backbuffers due to the initial Resize event. Looking into it.
Fixed it, I think.
Cool, yeah, never hit that race - I only hit the fence one intermittently, so maybe I didn't catch it after fixing the fence. -[Unknown]
Messed with this a bit - didn't test many games afterward (and it's still off by default), but it seems to work in the few I tested and improve performance (including ones using readback, I think).
Also fixes an issue with switching backends with the debugger attached (quick backend switch mode).
Oh, forgot to note - I moved FlushSync to the thread because I didn't want to worry about previous pending frames (this was causing issues, it seemed like).
Open to better names than VKRRunType. A bit tired, not sure what's better. -[Unknown]
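For what it's worth, a purely hypothetical sketch of the distinction such a run type seems to capture, based on the FlushSync discussion above (the names and values are illustrative, not the actual enum):

```cpp
// Illustrative only -- the real VKRRunType values may differ.
// The distinction implied by the discussion: a normal end-of-frame run that presents,
// versus a synchronous run (FlushSync) where the emulator thread needs the queued GPU
// work completed before it can continue, e.g. for a readback.
enum class RunType {
	Present,  // normal frame: submit, then present the backbuffer
	Sync,     // FlushSync: submit and wait so results are readable immediately
};
```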