Releases: ARM-software/synchronization-benchmarks
v2025.07
Makefile
- add TAG variable for distinguished build directory names
- new targets for generating disassembly using objdump -d
- templatize recipes for the specific tests in /ext
Test changes
multiple tests
- include/atomics.h: fix asm positional constraints
pthread_mutex_lock and pthread_mutex_trylock
- new tests
tbb_spin_rw_mutex
- fix volatile cast of state variable
- support USE_BUILTIN on aarch64
- use a CPU list instead of bitmask for pure readers to allow more than 64 CPUs to be specified
- use the lock pointer from the test harness instead of a global state variable for improved reproducibility
- parameterize pause count for atomic_backoff_pause
- initialize per-thread counter to a unique value so that non-pure readers do not try to write at the same time
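
The move from a bitmask to a CPU list for pure readers can be illustrated with a small sketch. It is written in Python for brevity and is not the test's C code; the names cpus_to_mask, is_pure_reader, and MASK_BITS are hypothetical, and the 64-bit mask width is an assumption taken from the "more than 64 CPUs" wording above.

```python
MASK_BITS = 64  # assumed width of the old bitmask (one bit per CPU)

def cpus_to_mask(cpus):
    """Pack CPU numbers into a fixed-width bitmask; fails for CPU >= MASK_BITS."""
    mask = 0
    for cpu in cpus:
        if cpu >= MASK_BITS:
            raise ValueError(f"CPU {cpu} does not fit in a {MASK_BITS}-bit mask")
        mask |= 1 << cpu
    return mask

def is_pure_reader(cpu, reader_cpus):
    """With a list (here a set), any CPU number can be marked as a pure reader."""
    return cpu in reader_cpus

# A list-style specification has no upper bound on the CPU number.
reader_cpus = {0, 63, 64, 130}
print(is_pure_reader(130, reader_cpus))
```

A fixed-width mask has no bit for CPU64 and above, while a list only stores the CPU numbers that were actually given.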
Lockhammer test harness
Command line flags
- -T duration_sec: renamed from -D duration_sec
- -T tag: stores a tag string in the results JSON
- -v -v: increase verbosity to more verbose
- -o pinorder: support CPU ranges, e.g. -o 10-20:2 for even-numbered CPUs in CPU10-20, inclusive
- -i interleave: remove, and make interleave a subparameter to -t threads[:interleave]
- -h: print help only if -h is used
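
As an illustration of the CPU range form shown for -o pinorder, the sketch below expands a START-END:STEP string into CPU numbers. It is written in Python and is not the harness's own parser; the function name expand_cpu_range is hypothetical, and it handles only the single-range form from the example above, assuming the step is optional.

```python
def expand_cpu_range(spec: str) -> list[int]:
    """Expand 'START-END[:STEP]' into CPU numbers, with END inclusive."""
    range_part, _, step_part = spec.partition(":")
    start_text, dash, end_text = range_part.partition("-")
    start = int(start_text)
    end = int(end_text) if dash else start    # a bare number is a one-CPU range
    step = int(step_part) if step_part else 1  # assumed default step of 1
    if step < 1 or end < start:
        raise ValueError(f"invalid CPU range: {spec!r}")
    return list(range(start, end + 1, step))

cpus = expand_cpu_range("10-20:2")
print(cpus)  # [10, 12, 14, 16, 18, 20]
```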
Measurement
- always measure the TSC frequency on x86
- blackhole iterations chosen from median of 5 searches
- add lock_acquire and lock_release callsite labels in disassembly
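
The idea of taking the median of several calibration searches can be sketched as follows. This is a Python illustration, not the harness code; median_of_searches and the sample values are hypothetical, and the search itself is replaced by a stand-in callable.

```python
import statistics

def median_of_searches(search, runs: int = 5) -> int:
    """Call search() `runs` times and return the median result.

    With an odd number of runs the median is one of the measured values,
    so a single outlier search cannot become the chosen iteration count.
    """
    results = [search() for _ in range(runs)]
    return int(statistics.median(results))

# Hypothetical results of five searches, one of them disturbed by noise.
samples = iter([1010, 998, 1450, 1003, 1001])
chosen = median_of_searches(lambda: next(samples))
print(chosen)  # 1003
```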
Reporting/JSON
- add a test_type_name string for test-specific variant
- store cpuorder_filename and pinorder_string in the JSON
- store the -T tag in the JSON
- store the -o pinorder and -t threads[:interleave] argument as pinorder_string in the JSON
Support scripts
view-results-json.sh
- handle missing keys
- handle non-numerical results, isnormal when using older jq versions
- show the new pinorder/num_threads string
- add -T tag to select results only of that tag
- add -a additional_key flag to also show data from that key
run-tests.sh
- show summary of configuration, and require -n or -N to proceed
- estimate runtime based on tests to run
- add flags to override the built-in lists of permuted parameters
- add -T tag to run only from the build dirs containing the tag
- add -- support to pass additional flags directly to the lh_$test program
- add -M hugepage_size to specify the hugepage size to use instead of the default 1GB
- rename the -T duration_sec flag to -D duration_sec
json-to-command.sh
- new; reconstructs the lh_$testname command line from a JSON's measurement record
v2025.05
Improve run-to-run reproducibility
- add a time-based measurement mode so that no thread finishes early and leaves the remaining threads with less competition
- reuse the same physical address run-to-run by allocating the lock memory from a persistent hugetlb page
- detect cpufreq driver and governors that could affect performance under load
- improve thread cleanup in case of crash or test timeout
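
The time-based measurement mode can be pictured with the sketch below, in which every worker runs until a shared deadline instead of stopping after a fixed iteration count. It is a Python illustration of the control flow only, not the harness implementation; run_timed is a hypothetical name, and the loop body stands in for a lock acquire and release.

```python
import threading
import time

def run_timed(num_threads: int, duration_sec: float) -> list[int]:
    """Run workers until a shared deadline and return per-thread iteration counts."""
    counts = [0] * num_threads
    deadline = time.monotonic() + duration_sec  # same stop time for every thread

    def worker(index: int) -> None:
        local = 0
        while time.monotonic() < deadline:
            local += 1  # stand-in for one lock acquire/release
        counts[index] = local

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts

counts = run_timed(num_threads=4, duration_sec=0.1)
print(counts)
```

Because all workers stop at the same time, the level of contention stays the same for the whole measurement; the per-thread counts then vary instead of the per-thread run time.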
Usability for scaling studies
- any-cpu pinorder assignment; CPU0 does not need to be used explicitly/implicitly
- multiple measurements per run, permuting iterations, critical/parallel durations, and num_threads/pinorders
- results capture to JSON, with a script provided to display/compare multiple JSONs
- per-thread fairness and execution duration data
- aarch64: runtime support to disable LSE in outline atomics
Improve maintainability
- separate compilation of measurement code so that the test harness does not use the same target optimizations as the lock implementation; faster compilation
- rewritten Makefile with automatic dependencies, separate build directories, parallel make, and phony targets for building all variants in one make command
- implement a __cpu_relax() macro to centralize its implementation across tests
- long options support and help screen
- vim modelines for per-file whitespace type and indentation spacing
Test-specific changes
- osq_lock: support smp_cond_load_relaxed macro, provide more control over relax and backoff durations
- jvm_objectmonitor: more accurately represent jdk-9, fix a use-after-free, reduce the presence of malloc
- cas_event_mutex: allow use of __atomic intrinsic on aarch64
- various: quell compiler warnings, support compilation with clang, document limitations/TODO