No memmove #36

bsmiles32 · 2021-01-18T10:56:48Z

This PR is based on #35 (but might be rebasable on master if needed).
The intent was to avoid the "memmove" operation during each consume_cbuff_data call.
Now we expose the fact that the buffer might be split in 2 chunks and it's up to the caller to plan accordingly.
Some care has been taken to round accesses to a multiple of the sample size (a sample cannot be splitted between the 2 buffer chunks).

I'd like some feedback on low-end systems to see if that helps a bit with performance.

Jj0YzL5nvJ · 2021-02-16T20:59:35Z

Win7 x86

 __  __                         __   _  _   ____  _
|  \/  |_   _ _ __   ___ _ __  / /_ | || | |  _ \| |_   _ ___
| |\/| | | | | '_ \ / _ \ '_ \| '_ \| || |_| |_) | | | | / __|
| |  | | |_| | |_) |  __/ | | | (_) |__   _|  __/| | |_| \__ \
|_|  |_|\__,_| .__/ \___|_| |_|\___/   |_| |_|   |_|\__,_|___/
             |_|         https://mupen64plus.org/
Mupen64Plus Console User-Interface Version 2.5.9

UI-Console: attached to core library 'Mupen64Plus Core' version 2.5.9
UI-Console:             Includes support for Dynamic Recompiler.
Core: Using full mem base
Core: Goodname: Donkey Kong 64 (U) [!]
Core: Name: DONKEY KONG 64
Core: MD5: 9EC41ABF2519FC386CADD0731F6E868C
Core: CRC: EC58EABF AD7C7169
Core: Imagetype: .z64 (native)
Core: Rom size: 33554432 bytes (or 32 Mb or 256 Megabits)
Core: Version: 1449
Core: Manufacturer: Nintendo
Core: Country: USA
UI-Console Status: Cheat codes disabled.
UI-Console: using Video plugin: 'angrylion's RDP Plus v1.6' v0.1.0
UI-Console: using Audio plugin: 'Mupen64Plus SDL Audio Plugin' v2.5.9
Input: Using auto-config file at: 'InputAutoCfg.ini'
UI-Console: using Input plugin: 'Mupen64Plus SDL Input Plugin' v2.5.9
UI-Console: using RSP plugin: 'Static Interpreter' v0.1.1
Core: input plugin did not specify a render callback; there will be no on screen display by the input plugin.
Input: 0 SDL joysticks were found.
Input: N64 Controller #1: Forcing default keyboard configuration
Input: Using auto-config file at: 'InputAutoCfg.ini'
Input: 1 controller(s) found, 1 plugged in and usable in the emulator
Input Warning: Couldn't open rumble support for joystick #1
Input Warning: Couldn't open rumble support for joystick #2
Input Warning: Couldn't open rumble support for joystick #3
Input Warning: Couldn't open rumble support for joystick #4
Input: Mupen64Plus SDL Input Plugin version 2.5.9 initialized.
Core: Using video capture backend: dummy
Core: Game controller 0 (Standard controller) has a Memory pak plugged in
Core: Game controller 1 (Standard controller) has a Memory pak plugged in
Core: Game controller 2 (Standard controller) has a Memory pak plugged in
Core: Game controller 3 (Standard controller) has a Memory pak plugged in
Core: Using CIC type X105
Core: Setting video mode: 640x480
Audio: Using resampler trivial
Audio: Initializing SDL audio subsystem...
Input Warning: Couldn't open rumble support for joystick #1
Input Warning: Couldn't open rumble support for joystick #2
Input Warning: Couldn't open rumble support for joystick #3
Input Warning: Couldn't open rumble support for joystick #4
Core Warning: OSD not compatible with OpenGL core context. OSD deactivated.
Core: Initializing 4 RDRAM modules for a total of 8 MB
Core: Starting R4300 emulator: Dynamic Recompiler
Audio: Initializing SDL audio subsystem...
Audio Warning: sdl_push_samples: pushing 2216 bytes, but only 160 available !
Audio Warning: sdl_push_samples: pushing 608 bytes, but only 192 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 1368 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 660 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 1104 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 1548 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 1992 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 656 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 472 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 2080 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 744 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 36 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 480 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 924 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 1368 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 20 available !
Audio Warning: sdl_push_samples: pushing 2224 bytes, but only 464 available !
Audio Warning: sdl_push_samples: pushing 2232 bytes, but only 152 available !
Audio Warning: sdl_push_samples: pushing 2232 bytes, but only 1128 available !
Audio Warning: sdl_push_samples: pushing 2232 bytes, but only 940 available !
Audio Warning: sdl_push_samples: pushing 2232 bytes, but only 752 available !
Audio Warning: sdl_push_samples: pushing 2232 bytes, but only 564 available !
Audio Warning: sdl_push_samples: pushing 2232 bytes, but only 376 available !
Audio Warning: sdl_push_samples: pushing 2232 bytes, but only 188 available !
Audio Warning: sdl_push_samples: pushing 2240 bytes, but only 512 available !
Audio Warning: sdl_push_samples: pushing 2240 bytes, but only 972 available !
Audio Warning: sdl_push_samples: pushing 2240 bytes, but only 792 available !
Audio Warning: sdl_push_samples: pushing 2240 bytes, but only 612 available !
Audio Warning: sdl_push_samples: pushing 2240 bytes, but only 432 available !
Audio Warning: sdl_push_samples: pushing 2240 bytes, but only 252 available !
Audio Warning: sdl_push_samples: pushing 2240 bytes, but only 72 available !
Audio Warning: sdl_push_samples: pushing 2240 bytes, but only 1044 available !
Audio Warning: sdl_push_samples: pushing 2248 bytes, but only 864 available !
Audio Warning: sdl_push_samples: pushing 2248 bytes, but only 692 available !
Audio Warning: sdl_push_samples: pushing 2248 bytes, but only 520 available !
Audio Warning: sdl_push_samples: pushing 2248 bytes, but only 860 available !
Audio Warning: sdl_push_samples: pushing 2248 bytes, but only 688 available !
Audio Warning: sdl_push_samples: pushing 2248 bytes, but only 388 available !
Audio Warning: sdl_push_samples: pushing 2248 bytes, but only 216 available !
Audio Warning: sdl_push_samples: pushing 2256 bytes, but only 44 available !
Audio Warning: sdl_push_samples: pushing 2256 bytes, but only 392 available !
Audio Warning: sdl_push_samples: pushing 2256 bytes, but only 228 available !
Audio Warning: sdl_push_samples: pushing 2256 bytes, but only 704 available !
Audio Warning: sdl_push_samples: pushing 2256 bytes, but only 540 available !
Audio Warning: sdl_push_samples: pushing 2256 bytes, but only 376 available !
Audio Warning: sdl_push_samples: pushing 2256 bytes, but only 212 available !
Audio Warning: sdl_push_samples: pushing 2256 bytes, but only 48 available !
Audio Warning: sdl_push_samples: pushing 2256 bytes, but only 524 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 1000 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 204 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 48 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 1044 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 376 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 220 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 704 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 548 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 392 available !
Audio Warning: sdl_push_samples: pushing 2264 bytes, but only 876 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 732 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 596 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 2240 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 2092 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 164 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 16 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 508 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 1000 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 340 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 832 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 172 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 664 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 4 available !
Audio Warning: sdl_push_samples: pushing 2272 bytes, but only 496 available !
Core Status: Stopping emulation.
Core: R4300 emulator finished.
Core Status: Rom closed.

mupen64plus/mupen64plus-core#415?

richard42 · 2021-02-20T04:47:53Z

it looks okay to me, though the complexity of the circular buffer has increased quite a bit and I didn't really try to think through every possible corner case. Is the memmove really a big enough performance concern to justify the increase in complexity?

This allows to avoid memmove after each consume_cbuff_data. Some extra care was necessary to round queue transactions to a multiple of the sample size. I also did some refactoring on the cbuff to allow it to grow and on the resamplers to allw them to report not only how many bytes were consumed, but also how many bytes were produced, so we can move cbuff pointers accordingly. tail/head methods reports the available sizes in 2 parameters : "available" for the first chunk size, and "extra" for the optional second chunk (which starts at the buffer beginning). It's up to the caller to split it's contiguous accesses around these two chunks. Another method that I initially wanted to try was the virtual memory trick to mirror the beggining of the buffer at the end of itself. But this trick requires some code specific platform and can be tricky when we need to grow the buffer. So I went this explicit 2 chunk approach which is simpler and doesn't require platform specific code.

bsmiles32 · 2021-02-21T19:51:03Z

@richard42 It's a fair point about complexity/performance. And I didn't really measure the performance of my changes, just thought about the theoretical benefits (and didn't observe any obvious change in my testings). That's why I asked if it helps for lower-end machines. Any how, I've rebased my changes, so it's easier to see what has changed.

I don't mind if this PR doesn't make it because of added complexity vs performance increase, I proposed it mostly because I could.

bsmiles32 requested review from Gillou68310 and richard42 January 18, 2021 10:56

bsmiles32 force-pushed the no_memmove branch from 1c76454 to f0ac812 Compare February 21, 2021 19:31

Jj0YzL5nvJ mentioned this pull request Mar 30, 2023

Mupen64plus's audio popping (Windows 10) mupen64plus/mupen64plus-core#571

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

No memmove #36

No memmove #36

bsmiles32 commented Jan 18, 2021

Jj0YzL5nvJ commented Feb 16, 2021 •

edited

Loading

richard42 commented Feb 20, 2021

bsmiles32 commented Feb 21, 2021

No memmove #36

Are you sure you want to change the base?

No memmove #36

Conversation

bsmiles32 commented Jan 18, 2021

Jj0YzL5nvJ commented Feb 16, 2021 • edited Loading

richard42 commented Feb 20, 2021

bsmiles32 commented Feb 21, 2021

Jj0YzL5nvJ commented Feb 16, 2021 •

edited

Loading