SortChecker is a tool for detecting violations
of ordering axioms
in comparison functions passed to qsort
(also bsearch
, lfind
, etc.). For complex data structures it's very
easy to violate one of the requirements. Such violations cause
undefined behavior and may lead to all sorts of runtime
errors (including unexpected results,
inconsistent results across different platforms
or even aborts) (also here, see this answer for explanations).
The tool works by intercepting qsort
and friends through LD_PRELOAD
and performing various checks prior to passing control to libc.
It could be applied to both C and C++ programs although for the
latter std::sort
and std::binary_search
are more typical
(use my SortChecker++ tool
to diagnose errors in them).
The tool is quite robust - I've successfully booted stock Ubuntu 14, Fedora 22 and Debian chroot and bootstrapped GCC 4.9.
The project is MIT-licensed. It has no fancy dependencies, just Glibc and Bash.
I've done some basic testing of Ubuntu 14.04 and Fedora 22 distro under SortChecker (open file/web browsers, navigate system menus, install various apps, etc.).
The tool has found errors in many programs. Here are some trophies:
- Libxt6: Invalid comparison function
- Libharfbuzz: Invalid comparison function (fixed)
- Libharfbuzz: Unsorted array used in bsearch (fixed)
- Cpio: HOL_ENTRY_PTRCMP triggers undefined behavior
- GCC: reload_pseudo_compare_func violates qsort requirements (fixed)
- GCC: libbacktrace: bsearch over unsorted array in unit_addrs_search (intentional)
- GCC: Fix intransitive comparison in dr_group_sort_cmp (was already fixed on trunk)
- GCC: Fix qsort ordering violation in tree-vrp.c (confirmed)
- dpkg: pkg_sorter_by_listfile_phys_offs violates qsort requirements (fixed)
- Fontforge: line_pt_cmp violates qsort ordering axioms
- Flexible I/O Tester: Invalid comparison function (fixed)
- Infernal: Inconsistent results from qsort callback (fixed)
- Graphicsmagick: Inconsistent results from qsort callback (fixed)
- PostGIS: Inconsistent results from qsort callback (fixed)
- Grass: Inconsistent results from qsort callback in g.mkfontcap (fixed)
I haven't seen a noticeable slowdown when working in a fully checked distro or building C++ projects with a checked compiler.
You do not need to rebuild your app to test it under SortChecker.
Just run with preloaded libsortcheck.so
:
$ LD_PRELOAD=libsortcheck.so myapp ...
(you'll probably want to combine this with some kind of regression or random/fuzz testing to achieve good coverage).
You could also use a helper script sortcheck
to do this for you:
$ sortcheck myapp ...
To debug the issue, you can run with
$ SORTCHECK_OPTIONS=raise=1 sortcheck myapp ...
and then examine generated coredump in gdb
.
By default SortChecker enables a set of common checks which should
be enough for most users. You can also customize it's behavior
through SORTCHECK_OPTIONS
environment variable which is
a colon-separated list of option assignments e.g.
$ export SORTCHECK_OPTIONS=debug=1:max_errors=10
You can also put option string to /SORTCHECK_OPTIONS
file
(this is particularly useful for testing of daemon processes).
Supported options are
max_errors
- maximum number of errors to report (default 10)debug
- print debug info (default false)print_to_file
- print warnings to specified file (rather than default stderr)print_to_syslog
- print warnings to syslog instead of stderr (default false)do_report_error
- print reports (only used for benchmarking, default true)raise
- raise signal on detecting violation (useful for inspecting issues in debugger)sleep
- sleep for N seconds before printing error and continuing (may be useful for attaching with gdb and examining the situation)check
- comma-separated list of checks to perform; available options aredefault
- default set of checks (see below)basic
- check that comparison functions return stable results and does not modify inputs (enabled by default)sorted
- check that arrays passed to bsearch are sorted (enabled by default)symmetry
- check that cmp(x,y) == -cmp(y,x) (enabled by default)transitivity
- check that if x < y && y < z, then x < z (enabled by default)reflexivity
- check that cmp(x,x) == 0 (usually not very important so disabled by default, on the other hand may trigger on otherwise undetected asymmetry bugs)unique
- check that cmp does not compare different objects as equal (to avoid random orderings on different platforms)good_bsearch
- bsearch uses a restricted (non-symmetric) form of comparison function so some checks are not generally applicable; this option tells SortChecker that it should test bsearch more aggressively (unsafe so disabled by default). Note that this option may cause runtime errors or crashes if applied inappropriately.- for each option
XYZ
there's a dualno_XYZ
(which disables corresponding check)
start
- check thestart
-th group of 32 leading elements (default 0); a value ofrand
will select random group.
You can run full Linux distro under SortChecker:
-
add full path to
libsortcheck.so
to/etc/ld.so.preload
-
create a global config:
$ echo print_to_syslog=1:check=default:start=rand | sudo tee /SORTCHECK_OPTIONS $ sudo chmod a+r /SORTCHECK_OPTIONS
-
reboot
Due to randomized order of checks it makes sense to check for errors and reboot several times to detect more errors.
Disclaimer: in this mode libsortcheck.so will be preloaded to all your processes so any malfunction may permanently break your system. It's highly recommended to backup the disk or make VM snapshot.
To build the tool, simply run make from project top directory.
Makefile supports various candies (e.g. AddressSanitizer,
debug build, etc.) - run make help
for mode details.
If you enable AddressSanitizer you'll need to add libasan.so
to LD_PRELOAD
(before libsortcheck.so
).
To test the tool, run make check
. Note that I've myself only
tested SortChecker on Ubuntu and Fedora.
- SortChecker is not fully thread-safe yet (should be easy to fix though)
- SortChecker is currently Linux-only (relies on
LD_PRELOAD
)
The tool only supports C now which rules out most of C++ code
because it uses (inline) std::sort
and std::binary_search
(and other similar APIs). For those see another tool
SortChecker++
which does a simple compile-time instrumentation via Clang.
It would be great to make SortChecker a part of standard debuggin tool like UBsan. Here's a discussion in LLVM mailing list which unfortunately didn't go too far.
It may also make sense to check other popular sorting APIs:
qsort_s
,bsearch_s
(are they availabile/used?)fts_open
,scandir
- Berkeley DB's
set_bt_compare
,set_dup_compare
, etc. - Glib2's
g_qsort_with_data
and other users of GCompareFunc/GCompareDataFunc - Gnulib's
gl_listelement_compar_fn
and friends - Libiberty's
splay_tree
API - OpenSSL's
objects.h
API - etc.
Here's less high-level stuff (sorted by priority):
- ensure that code is thread-safe (may need lots of platform-dependent code for atomics...)
- print complete backtrace rather than just address of caller (libunwind?)
- print array elements which triggered errors (i.e. hex dumps)
- use random array subsets for testing
- other minor TODO/FIXME are scattered all over the codebase