-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Possible speedups #53
Conversation
This pull request fixes 3 alerts when merging 1e881fa into 2069160 - view on LGTM.com fixed alerts:
|
Both Wolfenstein 3D and Doom don't start up for me with this PR on Windows 10 (64bit build). |
I don't understand why audio_batch_cb is commented out too. This means there will be no sound at all in the libretro core. if this is something you ported from Provenance and did not test in RetroArch with the libretro core, that isn't going to do us much good. We'd need a backport of the changes you made then to the core and to ensure things still work. |
That being said, there are some good ideas in here. |
Thanks! This was being used in Provenance and we have our own video and audio pipelines. This PR was never intended for mass consumption but rather for me to get it to work well in Prov and back port to upstream. But as you know, there really isn't a true upstream (my original work wasn't from retroarch's fork but the original .Zip source). So any things that don't make sense are due to that history. I later back ported to your fork but a lot of hacks and such for my specific purposes still exist, which is why i wanted to state that in the original description. I felt bad keeping all these changes to myself after seeing other wanting this core to make progress so as much as I myself hate when people do a dump of code "here's my amazing but difficult to read changes, bye", at the same time, I have very little time to focus on this as much as I would love to so I figured it was better to submit in the present state and hopefully someone else more talented and with the time can take some value from this work. |
I'm shocked this builds on Windows and you got it to load at least! Some ROMs work better with internal or provided BIOSs and other flags in my experience, you may already probably know that if you're familiar with the issues with this core already. I think my hacks on the blitter broke the ID Roms though because they were coded better than most and used some hardware hacks for performance so I'm not suprised those are broken. I bet a little debugging and I or someone else could figure out what's wrong there. This core could use some unit tests to figure out what about the cpu emu is different with my changes but that's probably a stretch at this point. |
@@ -2031,7 +2061,7 @@ static void DSP_jr(void) | |||
}//*/ | |||
dsp_pc += 2; // For DSP_DIS_* accuracy | |||
DSPOpcode[pipeline[plPtrExec].opcode](); | |||
dsp_opcode_use[pipeline[plPtrExec].opcode]++; | |||
// dsp_opcode_use[pipeline[plPtrExec].opcode]++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is a cause of breaking @twinaphex
i'll re-add and repush
Alright, thanks. |
also i'm using pre-processor macros in XCode, |
Pushed update to un-delete this. |
The bit shifting and masking is expensive on ARM64 for some reason. The unions seem to greatly reduce the perfomance hit of these common calls.
The values for offset0 and offset1 were coming out to 63 when they should be no more than 3. I think the devide should have beena modulus? I wrote out the code with more vars to figure ouit what was going on
looking at the offical shamusworld repo the division was moved after the 0xFF checks
GPU_RUNNING running macro was pretty slow on ARM for some reason. Bitswise structs are faster in my testing
Signed-off-by: Joe Mattiello <mail@joemattiello.com>
Signed-off-by: Joe Mattiello <mail@joemattiello.com>
Signed-off-by: Joe Mattiello <mail@joemattiello.com>
Re-based and force pushed off of upstream b44e2e2 |
This pull request fixes 2 alerts when merging ada5c15 into b44e2e2 - view on LGTM.com fixed alerts:
|
FYI, libretro wiki says Wolf 3D already doesn't boot, Doom needs a hack which I don't have a menu UI for in Prov, not sure if you tested with that in retroarch. Another issue is that the wiki for retroarch doesn't mention Jaguar BIOS, but it's been known since the old forums on this core's original author that some games only run or run better with the emulation bios or the Atari official bios. The wiki for Jaguar makes no mention of BIOS so that should probably be amended and the core option for the bios be mentioned. virtualjaguar-libretro/libretro.c Line 58 in b44e2e2
|
Closing for #55 with rebase |
I'm making a PR for some old optimizations I made for an iOS fork.
This PR is more for the devs to test and cherry-pick, I don't expect this code to be "merge quality", but certainly buildable/testable at least.
For instance, probably don't want these hacks that were specific to iOS loading,
811e73f
The blitter really is the slowest part. I wanted to write something with SIMD or better vectorization or improved memcpy or something. That was by far the slowest part when I profiled this pretty heavily a while ago (at least it was excruciatingly slow on ARM without my C struct hacks to help the compiler optimize better.
Update:
Forgot to mention I have these macros defined in my build (in XCode) that need to be added to a header or C compile flags
C
flags
Inline macro may need to be customized based on arch. Mac OS is using Clang/C99/gnu++11 syntax for inline.
I also have
unroll loops
aka-qunroll=no
but i forget why.