-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HLE: Slice the very slow memset/memcpy variants #18560
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, code looks good. Though since the PSP is doing cooperative multitasking, this only applies to interrupt-caused thread rescheduling, it shouldn't trigger in other circumstances, right?
Just requesting an explanatory comment.
Also, will help out with testing.
@@ -231,6 +248,9 @@ static int Replace_memcpy_jak() { | |||
} | |||
} | |||
|
|||
if (sliced) { | |||
return 5 + bytes * -8 + 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a comment that a negative value is used to signal Comp_ReplacementFunc that the function call is to be repeated.
No, you misunderstand the PSP's threading model. You actually told me this years ago, and I thought we talked about how it was not really the case. CPUs of course execute interrupts and the PSP firmware's interrupt handlers reschedule threads. That's enough for me to consider it incorrect to call the threading model strictly "cooperative". Those interrupts can be vblanks, timers, whatever. It's basically impossible to have a better priority thread that is in a runnable state but isn't running - it will always get switched to, because whatever put it in that state would've been an interrupt handler on a timer or dmac or something, and will switch it. When I made this change in PPSSPP it fixed a bunch of bugs. I did it after finally deciding to test whether it really was strictly cooperative. It's easy to test your hypothesis if you want to. Simply create a low-priority background thread that spins on Keep in mind these memcpy and memset implementations are SLOW. So a -[Unknown] |
Yeah, you're right, we have talked about this, and it's true that if you have vblank interrupts or other interrupts enabled, then it's really quite close to a fully preemptive system, just not with a defined rescheduling interval like Windows or Linux have. But it still isn't completely pre-emptive. Let's say you have only two threads with the same priority, if you don't have any interrupts enabled and both just have their PC in a spin function and don't call anything related to interrupts or scheduling, the one running will keep running and the other will never end up being scheduled. This is unlike Linux and Windows and other pre-emptive systems, and this fact sticks in my mind a little too much.. because as you say, in reality various interrupts that are there for other purposes than purely rescheduling do cause rescheduling. (Also what I'm requesting a comment about it not this at all, just the negative value thing) |
When they take an especially long time, this allows thread switches meanwhile. Important for cases where they might consume more than a total frame worth of cycles in a background thread.
a098846
to
e1eecb4
Compare
At least according to JPCSP and pspsdk, there are fields on the thread info struct called preempts: I would say it is a preemptive system (when considering that the firmware/kernel reschedules in its interrupt handlers which are not optional), and it's just that the thread priority is very different from traditional thread priority. So yes, the best-priority thread at the top of that priority queue is always running, and there's no slicing or "fairness" among other threads. But that's all about priority. As soon as a better-priority thread could run, it preempts. It's simply not a cooperative threading model. At its core, it may arguably be - if you ripped out the rest of the kernel. But that's not a valid way to think of the system. -[Unknown] |
Alright, I'm convinced - hereby updating my mental model with push -f :) I'll test this in a bit on all the archs except RISC-V and then merge. |
Hm, this is broken on ARM64 at least - Daxter freaks out completely. I'll take a look later. |
Actually, rebasing on master, it's no longer broken on ARM64. So I guess this was branched at an unfortunate time? I'm going to test it rebased for ARM32 as well. |
ARM32/ARM64 both confirmed working after rebase. Also checked that these functions actually got called in some games I tested :) |
I've been busy lately with some family/real life stuff, but had a little time to write this. I found that sometimes, these slow memcpy/memset variants can be called with very large sizes (megabytes) and cause stutters and performance issues by consuming such a large amount of cycles all at once.
This makes them execute in slices so thread scheduling and vsync interrupts can still properly happen.
I didn't test ARM or anything at all so hopefully it's not broken. Also have not checked against recent commits.
-[Unknown]