Introduce (mini) unit test framework #1734
Nice! I looked at existing unit test frameworks in the past, but nothing seemed appropriate for us. There are not that many for C, and they were either overkill or too simple (just a handful of ifdefs), so they didn't add any functionality. I thought writing our own would be too annoying (or I was just lazy). But the framework is ~300 lines, which seems fine to me.
See also #1211.
Thanks @furszy. I played around with this a little bit. It reduces the execution time on my machine from 26 seconds to 10 seconds (`-jobs=16`). Very nice! Some observations:
- It would be helpful if a help text were output when `tests` is run with `-h` or `--help`.
- I think showing all the tests that have passed is a bit overkill. I'm already assuming that the tests pass if they do not show up in the output. I only need to see tests that don't pass.
- Maybe a future PR can autodetect the number of cores and set `-jobs` automatically by default?
- There's an `-iter` command line flag, but the test output shows "test count". It would be better if we were consistent.
Thanks for the review jonasnick!
Awesome :). I think we can actually do even better, will do some changes.
The help message was actually already there, but for
Sure. Will hide the logging behind a
I'm not sure we want that. Sequential execution is usually "standard" on any system because we don’t know what else the user might be running. Picking a number of parallel tasks automatically (even if it is a low number) could hang the CPU or even make it run slower than sequential if the system is overloaded.
Sure 👍🏼. That was carried over from the previous code; will improve it.
Yeah. Just reworked the framework to support registering and running groups of tests in a generic manner. This means we can now run specific tests and/or specific groups of tests via the On top of that, made the framework reusable across binaries and improved the overall API (we can now easily connect the Other than that, the A simple usage example: |
Thanks for the feedback, hebasto and theStack.
ACK fed7e38
Tested a bit more and verified that all existing tests have been adapted to the new framework. Didn't review the autotools and CMake changes in depth as I'm not too familiar with build systems. Left only a few non-blocking nits below. Also, dumping some nice-to-have follow-up ideas that came to my mind (not sure if introducing additional complexity is worth it though):
- show a summary at the end (number of executed/passed/failed tests)
- if a single test fails, still run the others (would need some deeper changes though, as right now we just crash on a failed condition)
- allow running tests specified by the index numbers shown in `--print_tests`
nit: unit_test.c is still in UTF-8 according to the `file` utility.
nit: unit_test.c is still in UTF-8 according to the file utility.
Fixed now. Thanks.
#define CASE(name) { #name, run_##name }
#define CASE1(name) { #name, name }
Agreed, having this in a follow-up PR is fine. (But also here is fine.)
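For readers unfamiliar with the pattern, here is a minimal sketch of how the `CASE` and `CASE1` macros quoted above could populate a name-to-function registry. Only the two macro definitions come from the PR; `struct test_entry`, `demo_registry`, `run_by_name`, and the two demo tests are hypothetical names invented for this illustration and are not the framework's actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*test_fn)(void);

/* Hypothetical registry entry: a printable name plus the function to call. */
struct test_entry {
    const char *name;
    test_fn func;
};

/* Macros as quoted in the review thread. */
#define CASE(name) { #name, run_##name }
#define CASE1(name) { #name, name }

/* Hypothetical demo tests; the counters only make the effect observable. */
int alpha_runs = 0;
int beta_runs = 0;
void run_alpha(void) { alpha_runs++; }
void beta_tests(void) { beta_runs++; }

const struct test_entry demo_registry[] = {
    CASE(alpha),       /* expands to { "alpha", run_alpha } */
    CASE1(beta_tests)  /* expands to { "beta_tests", beta_tests } */
};
#define DEMO_COUNT (sizeof(demo_registry) / sizeof(demo_registry[0]))

/* Runs the first entry whose name matches; returns 1 if found, 0 otherwise. */
int run_by_name(const struct test_entry *reg, size_t n, const char *name) {
    size_t i;
    assert(reg != NULL && name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(reg[i].name, name) == 0) {
            reg[i].func();
            return 1;
        }
    }
    return 0;
}
```

With a table like this, adding a test is a one-line change, and selecting tests by name is a plain string lookup.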
To fix Autotools:
diff --git a/Makefile.am b/Makefile.am
index d379c3f..dc79857 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -123,7 +123,7 @@ if USE_TESTS
TESTS += noverify_tests
noinst_PROGRAMS += noverify_tests
noverify_tests_SOURCES = src/tests.c
-noverify_tests_CPPFLAGS = $(SECP_CONFIG_DEFINES) -DSUPPORTS_CONCURRENCY=$(SUPPORTS_CONCURRENCY)
+noverify_tests_CPPFLAGS = $(SECP_CONFIG_DEFINES) $(TEST_DEFINES)
noverify_tests_LDADD = $(COMMON_LIB) $(PRECOMPUTED_LIB)
noverify_tests_LDFLAGS = -static
if !ENABLE_COVERAGE
diff --git a/configure.ac b/configure.ac
index 55eebdf..6028ee2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -447,8 +447,8 @@ fi
if test "x$enable_tests" != x"no"; then
AC_CHECK_HEADERS([sys/types.h sys/wait.h unistd.h])
AS_IF([test "x$ac_cv_header_sys_types_h" = xyes && test "x$ac_cv_header_sys_wait_h" = xyes &&
- test "x$ac_cv_header_unistd_h" = xyes], [SUPPORTS_CONCURRENCY=1])
- AC_SUBST(SUPPORTS_CONCURRENCY)
+ test "x$ac_cv_header_unistd_h" = xyes], [TEST_DEFINES="-DSUPPORTS_CONCURRENCY=1"], TEST_DEFINES="")
+ AC_SUBST(TEST_DEFINES)
fi
###
Thanks @hebasto! Updated with autotools patch.
ACK modulo #1734 (comment) (sorry for overlooking that earlier).
This discussion can be continued in #1724.
This comment can be addressed in a follow-up PR.
ACK b209c65.
UPD. See #1734 (comment)
Concept ACK
Two overall comments:
- I saw some discussion about it, but I'm unconvinced about ignoring unknown arguments. I think the risk of silently ignoring a misspelled option is worse than the complexity of dealing with testing across versions.
- A big feature I'm missing is integration into the build framework, such that e.g. test invocations get automatically split by module, and have `ctest --test-dir build -j16` run the `tests` binary once for each module. This would automatically give well-balanced parallelism that's automatically integrated into CI, even for projects that include it (Bitcoin Core). Is something like that planned?
src/unit_test.c
}
}

/* Now that we have all sub-processes, distribute workload in round-robin */
In commit "test: introduce (mini) unit test framework"
As long as there are no more than 256 tests (or some other metric to break them up by), you can use the `make` parallelism trick.
Have a single pipe, which the master process writes to, and all child workers read from. The master just writes single-byte test identifiers to the pipe, and the workers read from it. Every byte will be read by exactly one worker process, which executes the test, and then reads another byte until done.
This gives you a super cheap and portable way of actually distributing the jobs equitably (round robin risks giving all long-running tests to the same process for example).
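The single-pipe scheme described above can be sketched roughly as follows. This is an illustration under assumptions, not the code that landed in the PR: `run_parallel`, `worker_loop`, `MAX_TESTS`, and the two demo tests are hypothetical names, and the sketch assumes a POSIX system with `fork()`, `pipe()`, and `wait()`.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_TESTS 256 /* one byte per test identifier */

typedef void (*test_fn)(void);

/* Hypothetical demo tests: one passes, one makes its worker exit non-zero. */
void demo_pass(void) { }
void demo_fail(void) { _exit(1); }

/* Worker: claim one byte at a time; each byte reaches exactly one reader. */
static int worker_loop(int read_fd, const test_fn *tests, int n_tests) {
    unsigned char id;
    ssize_t r;
    while ((r = read(read_fd, &id, 1)) == 1) {
        if (id < n_tests) tests[id]();
    }
    return r == 0 ? 0 : -1; /* r == 0 means EOF: no work left */
}

/* Returns 0 if every worker exited cleanly, -1 otherwise. */
int run_parallel(const test_fn *tests, int n_tests, int n_workers) {
    int fds[2], i, status, spawned = 0, ok = 0;

    assert(n_tests >= 0 && n_tests <= MAX_TESTS);
    if (n_workers < 1 || pipe(fds) != 0) return -1;
    /* Assumption of this sketch: get EPIPE from write() instead of being
       killed by SIGPIPE if every worker has already died. */
    signal(SIGPIPE, SIG_IGN);

    for (i = 0; i < n_workers; i++) {
        pid_t pid = fork();
        if (pid < 0) { ok = -1; break; }
        if (pid == 0) {
            close(fds[1]); /* otherwise the child never sees EOF */
            _exit(worker_loop(fds[0], tests, n_tests) == 0 ? 0 : 1);
        }
        spawned++;
    }
    close(fds[0]);

    /* The parent only enqueues identifiers; the kernel hands them out. */
    for (i = 0; ok == 0 && i < n_tests; i++) {
        unsigned char id = (unsigned char)i;
        if (write(fds[1], &id, 1) != 1) ok = -1;
    }
    close(fds[1]); /* signals EOF to all workers */

    for (i = 0; i < spawned; i++) {
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
            ok = -1;
        }
    }
    return ok;
}
```

Because an idle worker simply blocks in `read()` until the next byte arrives, a long-running test occupies only one worker while the others keep draining the queue. That is the balancing property round-robin assignment lacks.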
Super neat! Thanks for the suggestion. It’s pretty amazing that the kernel can handle the pipe synchronization for us.
I see you've implemented it already. It's indeed neat. We currently have 89 tests [1], so there's some room for the foreseeable future (unless we start splitting the cases up too granularly).
[1] It is probably a tribute to the wonderful C89, which popped up here multiple times. 😄
We currently have 89 tests [1] ...
At the high risk of totally ruining the C joke, I'm only seeing 88 tests on the latest push (2f4546c):
$ ./build/bin/tests -l | head -n 5
Available tests (16 modules):
========================================
Module: general (5 tests)
[ 1] selftest_tests
$ ./build/bin/tests -l | tail -n 7
[ 87] secp256k1_byteorder_tests
[ 88] cmov_tests
----------------------------------------
Run specific module: ./tests -t=<module_name>
Run specific test: ./tests -t=<test_name>
Did you indeed have 89 earlier (or, my best guess, had the same output, but assumed the test list starts counting at zero?). Just checking that all tests are covered, though I'm pretty certain they are, based on manual verification from an earlier review round.
I probably just miscounted. I had counted manually using `grep [-c] CASE(` and `grep [-c] CASE1(`, trying to ignore false positives.
src/tests.c
free(STATIC_CTX);
secp256k1_context_destroy(CTX);

testrand_finish();
In commit "test: introduce (mini) unit test framework"
The `testrand_finish()` call in here, when in parallel mode, isn't really useful, because the actually used randomness is in the child processes, without communication to the parent where this runs.
The point of this "random run =" output line is verifying if two test runs are actually identical (not just the seed was the same, but all output of the RNG was the same). I don't think it's all that important, so if it breaks with this, it might be worth just removing.
That's a great catch. Concept ACK on removing the line.
But this reminds me of the purpose of outputting the initial seed, namely reproducibility. With that in mind, the output should also contain the target parameter. I think seed, target, and jobs together should make it possible to reproduce the run exactly. Perhaps it's nicer to simply output an entire command line like `To reproduce, run ./tests --jobs ... --target ... --seed ...` or something like that.
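One way such a reproduction line could be assembled is sketched below. The function name `format_repro`, its signature, and the example values are assumptions made for this illustration. Only the `--jobs`, `--seed`, and `--target` option names come from the PR.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "./tests --jobs=N --seed=HEX [--target=T ...]" into buf.
   Returns 0 on success, -1 if buf is too small. */
int format_repro(char *buf, size_t cap, int jobs, const char *seed_hex,
                 const char *const *targets, size_t n_targets) {
    size_t used, i;
    int n;

    assert(buf != NULL && seed_hex != NULL);
    n = snprintf(buf, cap, "./tests --jobs=%d --seed=%s", jobs, seed_hex);
    if (n < 0 || (size_t)n >= cap) return -1;
    used = (size_t)n;

    /* Append one --target per selected test or module. */
    for (i = 0; i < n_targets; i++) {
        n = snprintf(buf + used, cap - used, " --target=%s", targets[i]);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }
    return 0;
}
```

The caller would print the resulting buffer after a `To reproduce, run` prefix at startup, next to the existing seed output.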
See #1734 (comment).
Lightweight unit testing framework, providing a structured way to define, execute, and report tests. It includes a central test registry, a flexible command-line argument parser of the form "--key=value" / "-k=value" / "-key=value" (facilitating future framework extensions), ability to run tests in parallel and accumulated test time logging reports.
So far the supported command-line args are:
- "--jobs=<num>" or "-j=<num>" to specify the number of parallel workers.
- "--seed=<hex>" to specify the RNG seed (random if not set).
- "--iterations=<num>" or "-i=<num>" to specify the number of iterations.
Compatibility Note: To stay compatible with previous versions, the framework also supports the two original positional arguments: the iterations count and the RNG seed (in that order).
This not only provides a structural improvement but also allows us to (1) specify individual tests to run and (2) execute each of them concurrently.
Add a help message for the test suite, documenting available options, defaults, and backward-compatible positional arguments.
Add support for specifying single tests or modules to run via the "--target" or "-t" command-line option. Multiple targets can be provided; only the specified tests or all tests in the specified module/s will run instead of the full suite.
Examples:
-t=<test name> runs a specific test.
-t=<module name> runs all tests within the specified module.
Both options can be provided multiple times.
Useful option to avoid opening the large tests.c file just to find the test case you want to run.
When enabled (--log=1), shows test start, completion, and execution time.
Updated per feedback. Thanks!
re-ACK 2f4546c
Sequential vs. parallel test runs on my arm64 workstation with 12 cores, seeing a nice ~2.7x speedup (I guess much more is possible in the future if the longest-running tests are split up further):
$ uname -m -o -s -r -v
Linux 6.17.0-8-qcom-x1e #8-Ubuntu SMP PREEMPT_DYNAMIC Sun Aug 31 21:03:54 UTC 2025 aarch64 GNU/Linux
$ ./build/bin/tests --jobs=0
Tests running silently. Use '-log=1' to enable detailed logging
iterations = 16
jobs = 0. Sequential execution.
random seed = 09e628902f7e42bc5be35e96ebf5edee
Total execution time: 24.329 seconds
$ ./build/bin/tests --jobs=$(nproc)
Tests running silently. Use '-log=1' to enable detailed logging
iterations = 16
jobs = 12. Parallel execution.
random seed = 08636e4354f2ffa33260a6506827c00d
Total execution time: 9.002 seconds
(EDIT: noted after posting that I didn't compile in the recovery module, i.e. its 4 tests were not executed. But this doesn't change the numbers significantly.)
Early Note:
Don’t be scared by the PR’s line changes count — most of it’s just doc or part of the test framework API.
Context:
Currently, all tests run single-threaded sequentially and the library lacks the ability to specify which test (or group of tests) you would like to run. This is not only inconvenient as more tests are added but also time consuming during development and affects downstream projects that may want to parallelize the workload (such as Bitcoin-Core CI).
PR Goal:
Introduce a lightweight, extensible C89 unit test framework with no dynamic memory allocations, providing a structured way to register, execute, and report tests. The framework supports named command-line arguments in `-key=value` form, parallel test execution across multiple worker processes, granular test selection (selecting tests either by name or by module name), and time accumulation reports.
The introduced framework supports:
- `-help` or `-h`: display list of available commands along with their descriptions.
- `-jobs=<num>`: distribute tests across multiple worker processes (default: sequential if 0).
- `-target=<name>` or `-t=<name>`: run only specific tests by name; can be repeated to select multiple tests.
- `-target=<module name>`, `-t=<module>`: run all tests within a specific module (can be provided multiple times).
- `-seed=<hex>`: set a specific RNG seed (defaults to random if unspecified).
- `-iterations=<n>`: specify the number of iterations.
- `-list_tests`: display list of available tests and modules you can run.
- `-log=<0|1>`: enable or disable test execution logging (default: 0 = disabled).
Beyond these features, the idea is to also make future developments smoother, as adding new tests requires only a single entry in the central test registry, and new command-line options can be introduced easily by extending the framework’s `parse_arg()` function.
Compatibility Note:
The framework continues accepting the two positional arguments previously supported (iterations and seed), ensuring existing workflows remain intact.
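To make the `-key=value` convention concrete, here is a small sketch of how one such argument might be split and dispatched. It is not the PR's `parse_arg()`: `demo_parse_arg`, `struct demo_opts`, and `key_is` are hypothetical names, only two of the listed options are handled, and rejecting unknown keys is just one possible policy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option set; field names are illustrative only. */
struct demo_opts {
    long jobs;
    long iterations;
};

/* Returns 1 if key (of length key_len) equals the long or short spelling. */
static int key_is(const char *key, size_t key_len,
                  const char *long_name, const char *short_name) {
    return (strlen(long_name) == key_len && strncmp(key, long_name, key_len) == 0)
        || (strlen(short_name) == key_len && strncmp(key, short_name, key_len) == 0);
}

/* Parses one "-key=value" or "--key=value" argument into opts.
   Returns 1 on success, 0 on malformed input or an unknown key. */
int demo_parse_arg(const char *arg, struct demo_opts *opts) {
    const char *key, *eq;
    char *end;
    long num;
    size_t key_len;

    assert(arg != NULL && opts != NULL);
    if (arg[0] != '-') return 0;
    key = arg + 1;
    if (key[0] == '-') key++; /* accept one or two leading dashes */
    eq = strchr(key, '=');
    if (eq == NULL || eq == key || eq[1] == '\0') return 0;
    key_len = (size_t)(eq - key);

    /* Both options handled here take a non-negative decimal number. */
    num = strtol(eq + 1, &end, 10);
    if (*end != '\0' || num < 0) return 0;

    if (key_is(key, key_len, "jobs", "j")) { opts->jobs = num; return 1; }
    if (key_is(key, key_len, "iterations", "i")) { opts->iterations = num; return 1; }
    return 0; /* unknown key: reported to the caller rather than ignored */
}
```

A caller would loop over `argv`, falling back to the positional iterations and seed handling for arguments that do not start with a dash.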
Testing Notes:
Have fun. You can quickly try it through `./tests -j=<workers_num>` for parallel execution or `./tests -t=<test_name>` to run a specific test (call `./tests -print_tests` to display all available tests and modules).
Extra Note:
I haven't checked the exhaustive tests file so far, but I will soon. For now, this only runs all tests declared in the `tests` binary.
Testing Results: (Current master branch vs PR in seconds)