After you have configured your test environment and written your test files, you can autogenerate a Makefile using `flit update`. At this point, the easiest thing to do is to use the `flit make` tool to run all of your tests and generate the sqlite database. However, you can also run the individual steps manually to gain more control.
Before talking about the executable generated for running the tests, let's briefly talk about the generated Makefile. There are a few targets worth mentioning. This is not a comprehensive list. A comprehensive list can be obtained by calling `make help` after generating the Makefile using `flit update`.
- `help`: Shows some documentation of available Makefile targets.
- `dev`: Create the development executable called `devrun`. This is intended to help you develop and debug your tests. It is much quicker to compile and run this one target than the hundreds of executables generated from a full flit run.
- `gt`: Similar to the `dev` target, but compiled with the ground-truth compilation, called `gtrun`. This executable is used in the full run to compare against the results of the other compiled executables.
- `runbuild`: Only do all of the building necessary for doing the run. Separated from doing the run because you may not want to execute tests in parallel in case (1) there is contention between them, or (2) they would interfere with proper timing measurements. The compiled executables will be placed in the `bin` directory. You may build in parallel using this target and then run sequentially using the `run` target.
- `run`: Build and execute all of the tests under all combinations of compilers, optimization levels, and flags. If the `runbuild` target has already been invoked, then this only runs the tests and generates results into csv files located in the `results` directory.
- `clean`: Remove all intermediate files such as object files. Does not remove executables or results.
- `distclean`: Remove everything including executables and results (but not the results database).
The output from `make` is very brief by default. If you want to see all of the compilation details, define `VERBOSE=1` or `VERBOSE=true` either as an argument to `make` or as an environment variable (e.g., `make VERBOSE=1 ...`).
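The parallel-build, sequential-run workflow described for the `runbuild` and `run` targets can be scripted. The following is a minimal Python sketch and is not part of FLiT: the helper names `flit_make_commands` and `build_then_run` and their parameters are hypothetical, and `-j` is make's standard parallel-jobs option. Only the target names and the `VERBOSE=1` variable come from the generated Makefile.

```python
import subprocess


def flit_make_commands(jobs=4, verbose=False):
    """Return the two make invocations: parallel build, then sequential run.

    Hypothetical helper for illustration only.
    """
    extra = ["VERBOSE=1"] if verbose else []
    return [
        ["make", "runbuild", f"-j{jobs}", *extra],  # compile in parallel
        ["make", "run", *extra],                    # run tests with one job
    ]


def build_then_run(jobs=4, verbose=False, cwd="."):
    """Execute the commands in order, stopping at the first failure."""
    for cmd in flit_make_commands(jobs, verbose):
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `build_then_run(jobs=8)` from the test directory would build with eight jobs and then run the tests one at a time, so that timing measurements are not disturbed by concurrent tests.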
The test executables that are generated all have the same command-line interface. To get the comprehensive documentation, pass the `--help` option to any of them. For example:
./devrun --help
One thing to note is that you can execute a single test or a sequence of specified tests simply by specifying them on the command-line. For example,
$ flit init --directory flit-litmus --litmus-tests
$ cd flit-litmus
$ make dev -j10
$ ./devrun Paranoia
name,host,compiler,optl,switches,precision,score_hex,score,resultfile,comparison_hex,comparison,file,nanosec
Paranoia,bihexal,g++,-O2,-funsafe-math-optimizations,d,0x4002a000000000000000,10,NULL,NULL,NULL,devrun,1000028414
Paranoia,bihexal,g++,-O2,-funsafe-math-optimizations,e,0x4002a000000000000000,10,NULL,NULL,NULL,devrun,1000030686
Paranoia,bihexal,g++,-O2,-funsafe-math-optimizations,f,0x4002a000000000000000,10,NULL,NULL,NULL,devrun,1000043012
The above creates a new directory called `flit-litmus` that is populated with the flit litmus tests. It then compiles the `dev` build and executes only the `Paranoia` test instead of running all of them. You can list all available tests with the `--list-tests` option.
$ ./devrun --list-tests
DistributivityOfMultiplication
DoHariGSBasic
DoHariGSImproved
DoMatrixMultSanity
DoOrthoPerturbTest
DoSimpleRotate90
DoSkewSymCPRotationTest
Empty
FMACancel
FtoDecToF
InliningProblem
KahanSum
Paranoia
ReciprocalMath
RotateAndUnrotate
RotateFullCircle
ShewchukSum
SinInt
TrianglePHeron
TrianglePSylv
aPbPc
aXbDivC
aXbXc
addSub
addTOL
divc
dotProd
langCompDot
langCompDotFMA
langDotFMA
negAdivB
negAminB
negAplusB
simpleReduction
subnormal
xDivNegOne
xDivOne
xMinusX
xMinusZero
xPc1EqC2
xPc1NeqC2
zeroDivX
zeroMinusX
In addition to executing only a particular test, you can limit which precision to execute instead of running all of them. This is done with the `--precision` option. There are four possible values for this option:
- `float`: 32-bit floating-point only
- `double`: 64-bit floating-point only
- `long double`: 80-bit floating-point only
- `all` (default): run all of the above precisions
In the following example, the TrianglePHeron example is executed only for 32-bit floats:
$ ./devrun --precision float TrianglePHeron
name,host,compiler,optl,switches,precision,score_hex,score,resultfile,comparison_hex,comparison,file,nanosec
TrianglePHeron,bihexal,g++,-O2,-funsafe-math-optimizations,f,0x3ff3e400000000000000,0.00043487548828125,NULL,NULL,NULL,devrun,6137
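The CSV printed by the test executables is straightforward to post-process. The sketch below is not part of FLiT. It parses the sample output shown above with Python's `csv` module and decodes `score_hex` under the assumption that the field holds an 80-bit extended-precision bit pattern (1 sign bit, 15 exponent bits with bias 16383, and a 64-bit significand with an explicit integer bit). That assumption agrees with the sample rows on this page, but it is an inference rather than documented behavior. The function names are hypothetical.

```python
import csv
import io
from fractions import Fraction

# Sample output copied from the run shown above.
SAMPLE = """\
name,host,compiler,optl,switches,precision,score_hex,score,resultfile,comparison_hex,comparison,file,nanosec
TrianglePHeron,bihexal,g++,-O2,-funsafe-math-optimizations,f,0x3ff3e400000000000000,0.00043487548828125,NULL,NULL,NULL,devrun,6137
"""


def parse_results(text):
    """Parse test-executable CSV output into a list of dicts keyed by column."""
    return list(csv.DictReader(io.StringIO(text)))


def decode_extended(hex_str):
    """Decode an assumed 80-bit extended-precision bit pattern.

    Only normal finite values are handled; subnormals, infinities,
    and NaNs are outside the scope of this sketch.
    """
    bits = int(hex_str, 16)
    sign = -1 if (bits >> 79) & 1 else 1
    exponent = ((bits >> 64) & 0x7FFF) - 16383      # remove the bias
    significand = Fraction(bits & ((1 << 64) - 1), 1 << 63)
    return float(sign * significand * Fraction(2) ** exponent)


row = parse_results(SAMPLE)[0]
decoded = decode_extended(row["score_hex"])
print(row["name"], row["precision"], decoded, int(row["nanosec"]))
```

Under this assumption, the decoded value matches the `score` column of the same row, which is a quick sanity check when comparing results across compilations.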
When writing your tests, you can use the `flit::info_stream` stream to output useful debug information, such as intermediate answers and expected results. By default, everything sent into `flit::info_stream` is suppressed and ignored. To print these messages to the console, pass the `--verbose` option to the test executable:
$ ./devrun --no-timing --verbose --precision double SinInt
SinInt: score = 1
SinInt: score - 1.0 = 0
SinInt-d: # runs = 1
name,host,compiler,optl,switches,precision,score_hex,score,resultfile,comparison_hex,comparison,file,nanosec
SinInt,bihexal,g++,-O2,-funsafe-math-optimizations,d,0x3fff8000000000000000,1,NULL,NULL,NULL,devrun,0
Each test is timed in order to profile the performance benefits of each of the compilations. This is turned on by default, but can be turned off with the `--no-timing` command-line option.
The timing functionality is implemented in the same way as the `timeit` module from Python. In essence, it runs the test in a loop until the total time is at least 0.2 seconds, then computes the average runtime per iteration. It repeats this procedure three times and reports the lowest of the three averages. Taking the minimum reduces the influence of background activity on the measurement.
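Because the procedure mirrors Python's `timeit`, that module itself can illustrate it. In this sketch, `Timer.autorange()` picks a loop count whose total time is at least 0.2 seconds, and `Timer.repeat()` then times that loop count three times. The smallest total divided by the loop count gives the reported average. The `work` function is a hypothetical stand-in for a test body, and `timeit` grows its loop count through 1, 2, 5, 10, and so on, which may differ from the progression FLiT uses.

```python
import timeit


def work():
    """Hypothetical stand-in for the body of a test."""
    return sum(i * i for i in range(1000))


timer = timeit.Timer(work)

# Find a loop count whose total runtime is at least 0.2 seconds.
loops, total = timer.autorange()

# Time that many loops three times and keep the best average.
totals = timer.repeat(repeat=3, number=loops)
best_avg = min(totals) / loops

print(f"{loops} loops, best average {best_avg * 1e9:.0f} ns per call")
```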
Because timing can be influenced if many things are executing together, when you execute your tests with `flit make`, the compilations will be done in parallel, but the tests will be executed sequentially so that they do not interfere with timing measurements. This behavior can be overridden.
Instead of having the number of loops for your tests be automatically determined or the number of times to repeat the timing be set to the default of three, you can specify exactly how many times you would like to run them. This is controlled with the following options:
- `-l LOOPS`, `--timing-loops LOOPS`: specify how many times to loop over running tests instead of automatically determining how many loops will make it take at least 0.2 seconds.
- `-r REPEATS`, `--timing-repeats REPEATS`: specify how many times to repeat the timing. The final time is the smallest average time returned from the looping.
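When scripting runs with explicit timing parameters, the command line can be assembled programmatically. This Python sketch is not part of FLiT. `devrun_args` is a hypothetical helper, and it uses only options mentioned on this page.

```python
def devrun_args(tests=(), precision=None, loops=None, repeats=None,
                timing=True, verbose=False, exe="./devrun"):
    """Build an argument list for a test executable (hypothetical helper)."""
    args = [exe]
    if not timing:
        args.append("--no-timing")
    if verbose:
        args.append("--verbose")
    if precision is not None:
        args += ["--precision", precision]
    if loops is not None:
        args += ["--timing-loops", str(loops)]
    if repeats is not None:
        args += ["--timing-repeats", str(repeats)]
    # Test names go last, as in the examples on this page.
    return args + list(tests)


# Example: 100 loops per timing, repeated 5 times, for a single test.
cmd = devrun_args(["Paranoia"], precision="double", loops=100, repeats=5)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` from the directory that contains the executable.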
Here is some pseudo-code of how this works:
if timing:
    avg_times = []
    for _ in range(repeats):
        if loops_are_specified:
            start = time()
            for _ in range(loops):
                run_test()
            end = time()
            avg_times.append((end - start) / loops)
        else:
            loops = 1
            while True:
                start = time()
                for _ in range(loops):
                    run_test()
                end = time()
                if end - start >= 0.2:  # total time of at least 0.2 seconds
                    break
                loops *= 10
            avg_times.append((end - start) / loops)
    min_avg_time = min(avg_times)
else:
    run_test()
    min_avg_time = 0.0  # seconds