@tmc (Contributor) commented Oct 31, 2025

Add benchmarks comparing SyscallN, RegisterFunc, and callback performance across different argument counts.
This helps measure and compare the overhead of different calling approaches in purego's function invocation system.

Closes #362

Current output:

goos: linux
goarch: arm64
pkg: github.com/ebitengine/purego
BenchmarkCallingMethods/RegisterFunc/Callback/1args-4            3150840               345.5 ns/op            96 B/op          6 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/2args-4            3140649               415.8 ns/op           152 B/op          7 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/3args-4            1962769               622.9 ns/op           224 B/op          8 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/5args-4            2104015               549.3 ns/op           336 B/op         10 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/10args-4           1280574              1162 ns/op             600 B/op         15 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/14args-4            787353              1330 ns/op             888 B/op         19 allocs/op
BenchmarkCallingMethods/RegisterFunc/Callback/15args-4           1000000              1087 ns/op             928 B/op         20 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/1args-4               7312062               179.1 ns/op            40 B/op          3 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/2args-4               4575436               259.8 ns/op            72 B/op          4 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/3args-4               5702834               203.9 ns/op           112 B/op          5 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/5args-4               4748464               357.9 ns/op           176 B/op          7 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/10args-4              2852462               384.0 ns/op           328 B/op         12 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/14args-4              2587800               679.0 ns/op           472 B/op         16 allocs/op
BenchmarkCallingMethods/RegisterFunc/CFunc/15args-4              2343153               482.9 ns/op           512 B/op         17 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/1args-4                5052274               268.1 ns/op            56 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/2args-4                3734990               331.6 ns/op            80 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/3args-4                3882723               302.8 ns/op           112 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/5args-4                3444187               405.1 ns/op           160 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/10args-4               1766395               624.9 ns/op           272 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/14args-4               2003835               598.1 ns/op           416 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/Callback/15args-4               1750140               848.6 ns/op           416 B/op          3 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/1args-4                  23856542                49.10 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/2args-4                  25922942                46.47 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/3args-4                  25489951                49.48 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/5args-4                  24297174                51.78 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/10args-4                 23921144                48.11 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/14args-4                 24247138                52.55 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/SyscallN/CFunc/15args-4                 22587331                58.45 ns/op            0 B/op          0 allocs/op
BenchmarkCallingMethods/RoundTrip/1args-4                        4651420               222.7 ns/op            56 B/op          3 allocs/op
BenchmarkCallingMethods/RoundTrip/2args-4                        4831054               257.3 ns/op            80 B/op          3 allocs/op
BenchmarkCallingMethods/RoundTrip/3args-4                        3868870               380.7 ns/op           112 B/op          3 allocs/op
BenchmarkCallingMethods/RoundTrip/5args-4                        3571844               327.6 ns/op           160 B/op          3 allocs/op
BenchmarkCallingMethods/RoundTrip/10args-4                       2216022               608.3 ns/op           272 B/op          3 allocs/op
PASS
ok      github.com/ebitengine/purego    57.446s
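
To read relative overhead off output like the above, the ns/op column can be compared across sub-benchmarks. The following is a minimal sketch, assuming the standard `go test -bench` line layout shown here. The helper names `parseNsPerOp`, `collect`, and `slowdown` are hypothetical and not part of purego.

```go
package main

import (
	"strconv"
	"strings"
)

// parseNsPerOp extracts the benchmark name and the ns/op value from one
// line of `go test -bench` output. It returns ok=false for lines that are
// not benchmark results (for example "PASS" or "goos: linux").
func parseNsPerOp(line string) (name string, ns float64, ok bool) {
	fields := strings.Fields(line)
	if len(fields) < 4 || !strings.HasPrefix(fields[0], "Benchmark") {
		return "", 0, false
	}
	for i := 2; i < len(fields); i++ {
		if fields[i] != "ns/op" {
			continue
		}
		v, err := strconv.ParseFloat(fields[i-1], 64)
		if err != nil {
			return "", 0, false
		}
		name = fields[0]
		// Drop a trailing numeric suffix such as "-4" (the GOMAXPROCS value).
		if j := strings.LastIndex(name, "-"); j > 0 {
			if _, err := strconv.Atoi(name[j+1:]); err == nil {
				name = name[:j]
			}
		}
		return name, v, true
	}
	return "", 0, false
}

// collect maps each benchmark name in the output to its ns/op value.
func collect(output string) map[string]float64 {
	results := make(map[string]float64)
	for _, line := range strings.Split(output, "\n") {
		if name, ns, ok := parseNsPerOp(line); ok {
			results[name] = ns
		}
	}
	return results
}

// slowdown reports how many times slower benchmark a is than benchmark b.
func slowdown(results map[string]float64, a, b string) float64 {
	return results[a] / results[b]
}
```

For the 1-argument rows above, dividing RegisterFunc/CFunc (179.1 ns/op) by SyscallN/CFunc (49.10 ns/op) gives a factor of about 3.6. These figures come from a single run, so small differences between rows may be noise.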

@hajimehoshi (Member) commented Oct 31, 2025

The benchmark doesn't use C functions. Using C functions in a dynamic library would be more meaningful to test actual cases in the real world.

@tmc (Contributor, Author) commented Oct 31, 2025

> The benchmark doesn't use C functions. Using C functions in a dynamic library would be more meaningful to test actual cases in the real world.

Will extend the callbacks so they are C calls rather than Go no-ops.

Add benchmark tests comparing RegisterFunc, SyscallN, and callback performance
across different argument counts (1, 2, 3, 5, 10, 14, 15 args). The benchmarks
measure:

- RegisterFunc with Go callbacks vs C functions
- SyscallN with Go callbacks vs C functions
- Round-trip calls (Go → C → Go callback)

Includes corresponding C library with sum functions and callback wrappers
to enable realistic performance comparisons between different calling
approaches in purego.
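
The sub-benchmark layout described above (one run per argument count) can be sketched with the standard `testing` package. This is only an illustration of the structure, not the code in this PR: `sumArgs` is a pure-Go stand-in for the C sum functions, `makeArgs` and the `StandIn` name are hypothetical, and no purego call is made.

```go
package main

import (
	"fmt"
	"testing"
)

// sumArgs is a pure-Go stand-in for the C sum functions mentioned above.
// It is NOT a purego call; it only gives the benchmark body something to do.
func sumArgs(args ...uintptr) uintptr {
	var total uintptr
	for _, a := range args {
		total += a
	}
	return total
}

// argCounts mirrors the argument counts listed in the commit message.
var argCounts = []int{1, 2, 3, 5, 10, 14, 15}

// makeArgs builds the argument slice 1, 2, ..., n.
func makeArgs(n int) []uintptr {
	args := make([]uintptr, n)
	for i := range args {
		args[i] = uintptr(i + 1)
	}
	return args
}

// BenchmarkCallingMethods runs one sub-benchmark per argument count, which
// yields names of the form BenchmarkCallingMethods/StandIn/5args.
func BenchmarkCallingMethods(b *testing.B) {
	for _, n := range argCounts {
		args := makeArgs(n)
		b.Run(fmt.Sprintf("StandIn/%dargs", n), func(b *testing.B) {
			b.ReportAllocs() // report B/op and allocs/op alongside ns/op
			var sink uintptr
			for i := 0; i < b.N; i++ {
				sink += sumArgs(args...)
			}
			_ = sink
		})
	}
}
```

Saved in a `_test.go` file, a benchmark shaped like this would run with `go test -bench=CallingMethods`. A real comparison would replace `sumArgs` with the calling method under test.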

@tmc (Contributor, Author) commented Oct 31, 2025

done.

note: this uses long+uintptr because of the bug addressed in #360

Replace direct purego.Dlopen/Dlclose calls with load.OpenLibrary/CloseLibrary
from the internal load package for consistency with other test code. Includes
proper error handling in the defer cleanup function.