We split this into three steps:
- Getting the software environment.
- Running the benchmarks.
- Running the analysis.
The Nix package manager^[See https://nixos.org/guides/how-nix-works] is a user-level package manager available on many platforms. Nix combines build-from-source with a binary cache; this is more resilient than Docker Hub because, if the binary cache goes down or removes packages for any reason, the client can always build them from source, so long as the upstream projects don't disappear from GitHub. We considered creating a Docker image, but BenchExec, the tool we use to get consistent running times, manipulates cgroups and systemd, and we did not have enough time to figure out how to run it in Docker or Podman.
Install Nix with:
$ curl -f -L https://install.determinate.systems/nix | sh -s -- install
This installer also enables "flakes" and the "nix-command". If you installed Nix by another method, see this page to enable flakes and the nix-command.
Next, clone the repository and change into the benchmark directory:
$ git clone https://github.com/charmoniumQ/PROBE
$ cd PROBE/benchmark
Then use Nix to activate the development environment; this may take ten minutes, as it builds the necessary dependencies. The command starts a new interactive shell in the proper environment, from which you will have all the dependencies needed to run the steps below.
$ nix develop
- Test `nix shell '.#rr' --command rr record result/bin/ls`. If this issues an error regarding `kernel.perf_event_paranoid`, follow its advice and confirm that resolves the error.
- AMD Zen CPUs may require extra setup.
- Note that `rr` may require extra setup for virtual machines.
- Finally, AMD CPUs are hit-or-miss with `rr`, due to its low-level machine-code operations. This is a known weakness of `rr`. If you can't get `rr` to work, and you can't get an Intel CPU, you may disable `rr` by removing it from `working` at the bottom of `prov_collectors.py`. You will still be able to collect statistics on the other (more portable) provenance collectors.
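Before recording with `rr`, it can help to inspect the kernel setting directly. The following is a minimal sketch and is not part of the repository: the helper names `parse_paranoid`, `rr_can_record`, and `check` are hypothetical, and the default threshold of 1 is an assumption based on the value `rr` commonly asks for. The error message printed by `rr` itself remains the authoritative guidance.

```python
from pathlib import Path

# Standard Linux procfs location of the sysctl kernel.perf_event_paranoid.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def parse_paranoid(text: str) -> int:
    """Parse the contents of the procfs file into an integer level."""
    return int(text.strip())


def rr_can_record(level: int, max_level: int = 1) -> bool:
    """Return True if the level is permissive enough (assumed threshold: 1)."""
    return level <= max_level


def check(path: Path = PARANOID_PATH) -> str:
    """Return a human-readable status line for the given sysctl file."""
    if not path.exists():
        return f"{path} not found; is this a Linux system?"
    level = parse_paranoid(path.read_text())
    if rr_can_record(level):
        return f"kernel.perf_event_paranoid={level}: OK"
    return f"kernel.perf_event_paranoid={level}: too restrictive, follow rr's advice"


if __name__ == "__main__":
    print(check())
```

The check only reads the setting; changing it still requires the `sysctl` command that `rr` suggests.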
Note that we use the Python from our software environment, not from the system, to improve determinism.
We wrote a front-end for running the scripts, called `runner.py`. Run it with `--help` for more information. Briefly, it takes `--collectors`, `--workloads`, and `--iterations`, which specify what to run.
For the paper, we ran:
$ ./runner.py \
--collectors run-for-usenix \
--workloads run-for-usenix \
--iterations 5 \
--verbose
Multiple `--collectors` and `--workloads` can be given, for example:
$ ./result/bin/python runner.py \
--collectors noprov \
--collectors strace \
--workloads lmbench \
--workloads postmark
See the bottom of `prov_collectors.py` and `workloads.py` for the name-mapping.
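To illustrate how repeated `--collectors` and `--workloads` flags could be accumulated and then resolved through a name-mapping, here is a minimal sketch using Python's `argparse`. It is not the actual `runner.py`: the mapping contents and the `resolve` and `parse_args` helpers are hypothetical placeholders, and the real mappings may be structured differently.

```python
import argparse

# Hypothetical name-mappings; the real ones live at the bottom of
# prov_collectors.py and workloads.py and may look different.
COLLECTOR_GROUPS = {
    "noprov": ["noprov"],
    "strace": ["strace"],
    "fast": ["noprov", "strace"],
}
WORKLOAD_GROUPS = {
    "lmbench": ["lmbench"],
    "postmark": ["postmark"],
}


def resolve(names, groups):
    """Expand each name into its members, keeping order and dropping duplicates."""
    resolved = []
    for name in names:
        if name not in groups:
            raise SystemExit(f"unknown name: {name!r}")
        for member in groups[name]:
            if member not in resolved:
                resolved.append(member)
    return resolved


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Sketch of a benchmark front-end")
    # action="append" lets each flag be given several times.
    parser.add_argument("--collectors", action="append", default=[])
    parser.add_argument("--workloads", action="append", default=[])
    parser.add_argument("--iterations", type=int, default=1)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args([
        "--collectors", "noprov", "--collectors", "strace",
        "--workloads", "lmbench", "--workloads", "postmark",
    ])
    print(resolve(args.collectors, COLLECTOR_GROUPS))
    print(resolve(args.workloads, WORKLOAD_GROUPS))
    print(args.iterations)
```

Resolving names through a dictionary lets a single flag value stand either for one item or for a whole group.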
The analysis is written in a Jupyter notebook, `notebooks/cross-val.ipynb`. It begins by checking for anomalies in the data; we have automated this as much as possible, but please sanity-check the graphs before proceeding.
The notebook can be launched from our software environment with:
$ cd notebooks
$ jupyter notebook
To add a new benchmark item:
- Go to `workloads.py`.
- Write a new workload class that implements the `Workload` interface.
- Add an instance of your workload class to `WORKLOAD_GROUPS`.
- Call `./result/bin/runner` (see above) with `--workloads workload_name`, where `workload_name` is the lowercased name of your workload class or the name of a group containing your workload class.
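As a rough illustration of the steps above, the sketch below defines a stand-in `Workload` base class, a hypothetical `PythonNoop` workload, and a registry keyed by lowercased class name. The real `Workload` interface and `WORKLOAD_GROUPS` in the repository may have different methods and structure; the method names `setup` and `run` and the helper `group_by_name` are assumptions made purely for illustration.

```python
import abc
import subprocess
import sys


class Workload(abc.ABC):
    """Stand-in for the repository's Workload interface (assumed shape)."""

    # Assumed attribute: flake package references needed by this workload.
    nix_packages: list[str] = []

    def setup(self) -> None:
        """Prepare inputs before timing starts; a no-op by default."""

    @abc.abstractmethod
    def run(self) -> tuple[str, ...]:
        """Return the command line to benchmark."""


class PythonNoop(Workload):
    """Hypothetical workload: start the interpreter and exit immediately."""

    def run(self) -> tuple[str, ...]:
        return (sys.executable, "-c", "pass")


def group_by_name(workloads: list[Workload]) -> dict[str, list[Workload]]:
    """Map the lowercased class name of each workload to a one-element group."""
    return {type(w).__name__.lower(): [w] for w in workloads}


# Hypothetical registry, mirroring the role WORKLOAD_GROUPS plays in the text.
WORKLOAD_GROUPS = group_by_name([PythonNoop()])

if __name__ == "__main__":
    for workload in WORKLOAD_GROUPS["pythonnoop"]:
        workload.setup()
        completed = subprocess.run(workload.run(), check=True)
        print(completed.returncode)
```

Under these assumptions, the new workload would be selected with `--workloads pythonnoop`.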
To add a new provenance collector:
- Go to `prov_collectors.py`.
- Write a new class that implements the `Collector` interface.
- Add an instance of your collector class to `COLLECTORS`.
- Call `./result/bin/runner` (see above) with `--collectors collector_name`, where `collector_name` is the lowercased name of your collector class or the name of a group containing your collector class.
Note that, in both cases, the attribute `nix_packages` contains a list of strings that reference packages defined in the package outputs of `flake.nix` for the current architecture.
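To make the role of `nix_packages` more concrete, the sketch below shows one way such strings could be turned into a single `nix build` invocation. It assumes each entry is a flake reference such as `.#rr`, the form used in the `rr` test command earlier. The `Collector` stand-in, the `RrLike` class, and the `nix_build_command` helper are hypothetical and are not taken from the repository, which may consume the attribute quite differently.

```python
import shlex


class Collector:
    """Stand-in for the repository's Collector interface (assumed shape)."""

    # Assumed to hold flake references such as ".#rr".
    nix_packages: list[str] = []


class RrLike(Collector):
    """Hypothetical collector that needs the flake's rr package."""

    nix_packages = [".#rr"]


def nix_build_command(collectors: list[Collector]) -> list[str]:
    """Build one `nix build` command covering every referenced package."""
    refs: list[str] = []
    for collector in collectors:
        for ref in collector.nix_packages:
            if ref not in refs:  # keep order, drop duplicates
                refs.append(ref)
    return ["nix", "build", "--no-link", "--print-out-paths", *refs]


if __name__ == "__main__":
    command = nix_build_command([RrLike(), RrLike()])
    # Print rather than execute, so the sketch runs without Nix installed.
    print(shlex.join(command))
```

Collecting all references into one command lets Nix build or fetch the packages together rather than once per collector.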
Using Nix to build our software environment ensures that all architectures and POSIX platforms can reproducibly build the relevant software environments.
./runner.py --workloads small-calib --collectors fast --iterations 1 --warmups 30
./runner.py --workloads big-calib --collectors noprov --iterations 1 --warmups 5 --append
./runner.py --workloads fast --collectors fast --iterations 3 --warmups 3 --append