Test Executable

Prev | Table of Contents | Next

After you have configured your test environment and written your test files, you can autogenerate a Makefile using flit update. At this point, the easiest approach is to use the flit make tool to run all of your tests and generate the SQLite database. However, you can also perform these steps manually, which gives you more control over how the tests are built and executed.

Autogenerated Makefile

Before discussing the executable generated for running the tests, let's briefly look at the generated Makefile. A few targets are worth mentioning. This is not a comprehensive list; you can obtain one by calling make help after generating the Makefile using flit update.

  • help: Shows some documentation of available Makefile targets
  • dev: Create the development executable called devrun. This is intended to help you develop and debug your tests. It is much quicker to compile and run this one target than the hundreds of executables generated from a full flit run.
  • gt: Similar to the dev target, but creates an executable called gtrun using the ground-truth compilation. This executable is used in the full run to compare against the results of the other compiled executables.
  • runbuild: Build everything needed for the run without executing any tests. Building is separated from running because you may not want to execute tests in parallel, either because (1) there is contention between them, or (2) they would interfere with proper timing measurements. The compiled executables will be placed in the bin directory. You may build in parallel using this target and then run sequentially using the run target.
  • run: Build and execute all of the tests under all combinations of compilers, optimization levels, and flags. If the runbuild target has already been invoked, then this only runs the tests and generates results into csv files located in the results directory.
  • clean: Remove all intermediate files such as object files. Does not remove executables or results.
  • distclean: Remove everything including executables and results (but not the results database).

The output from make is very brief by default. If you want to see all of the compilation details, define VERBOSE=1 or VERBOSE=true either as an argument to make or as an environment variable (e.g., make VERBOSE=1 ...).
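The build-in-parallel, run-sequentially workflow described for the runbuild and run targets can be scripted. The following is a minimal sketch in Python, assuming make is on the PATH and the current directory holds a Makefile generated by flit update. The helper names flit_make_commands and build_then_run are hypothetical and are not part of FLiT.

```python
import os
import subprocess


def flit_make_commands(jobs):
    """Return the two make invocations: a parallel build, then a sequential run."""
    return [
        ["make", "runbuild", f"-j{jobs}"],  # compile all executables in parallel
        ["make", "run"],                    # no -j, so tests run one at a time
    ]


def build_then_run(directory=".", jobs=4, verbose=False):
    """Run both make steps in `directory`, stopping at the first failure."""
    env = dict(os.environ)
    if verbose:
        env["VERBOSE"] = "1"  # ask the Makefile to show full compilation details
    for command in flit_make_commands(jobs):
        subprocess.run(command, cwd=directory, env=env, check=True)


# Show the commands without executing them.
for command in flit_make_commands(10):
    print(" ".join(command))
```

Calling build_then_run() performs the actual build and run; it is left out of the example above so that the snippet can be executed without a FLiT project present.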

Test Executable Details

The generated test executables all have the same command-line interface. To see the comprehensive documentation, pass the --help option. For example:

./devrun --help

Execute Only Particular Tests

You can execute a single test or a sequence of specified tests simply by naming them on the command line. For example,

$ flit init --directory flit-litmus --litmus-tests
$ cd flit-litmus
$ make dev -j10
$ ./devrun Paranoia
name,host,compiler,optl,switches,precision,score_hex,score,resultfile,comparison_hex,comparison,file,nanosec
Paranoia,bihexal,g++,-O2,-funsafe-math-optimizations,d,0x4002a000000000000000,10,NULL,NULL,NULL,devrun,1000028414
Paranoia,bihexal,g++,-O2,-funsafe-math-optimizations,e,0x4002a000000000000000,10,NULL,NULL,NULL,devrun,1000030686
Paranoia,bihexal,g++,-O2,-funsafe-math-optimizations,f,0x4002a000000000000000,10,NULL,NULL,NULL,devrun,1000043012

The above creates a new directory called flit-litmus that is populated with the flit litmus tests. It then compiles the dev build and executes only the Paranoia test instead of running all of them. You can list all available tests with the --list-tests option.

$ ./devrun --list-tests
DistributivityOfMultiplication
DoHariGSBasic
DoHariGSImproved
DoMatrixMultSanity
DoOrthoPerturbTest
DoSimpleRotate90
DoSkewSymCPRotationTest
Empty
FMACancel
FtoDecToF
InliningProblem
KahanSum
Paranoia
ReciprocalMath
RotateAndUnrotate
RotateFullCircle
ShewchukSum
SinInt
TrianglePHeron
TrianglePSylv
aPbPc
aXbDivC
aXbXc
addSub
addTOL
divc
dotProd
langCompDot
langCompDotFMA
langDotFMA
negAdivB
negAminB
negAplusB
simpleReduction
subnormal
xDivNegOne
xDivOne
xMinusX
xMinusZero
xPc1EqC2
xPc1NeqC2
zeroDivX
zeroMinusX

Execute Only Particular Precisions

In addition to executing only particular tests, you can limit which precisions are run instead of running all of them. This is done with the --precision option, which accepts four values:

  • float: 32-bit floating-point only
  • double: 64-bit floating-point only
  • long double: 80-bit floating-point only
  • all (default): run all of the above precisions

In the following example, the TrianglePHeron test is executed only for 32-bit floats:

$ ./devrun --precision float TrianglePHeron
name,host,compiler,optl,switches,precision,score_hex,score,resultfile,comparison_hex,comparison,file,nanosec
TrianglePHeron,bihexal,g++,-O2,-funsafe-math-optimizations,f,0x3ff3e400000000000000,0.00043487548828125,NULL,NULL,NULL,devrun,6137
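The CSV rows printed by the test executable can be read with a standard CSV parser. The sketch below parses the sample output shown above and decodes the score_hex column. It assumes that score_hex holds the bit pattern of an x87 80-bit extended-precision value (1 sign bit, 15 exponent bits, 64 significand bits with an explicit integer bit); this layout is inferred from the sample values and is not stated by the documentation. The function names are hypothetical, and infinities, NaNs, and values outside the range of a Python float are not handled.

```python
import csv
import io
import math


def decode_score_hex(text):
    """Decode a hex string as an 80-bit extended float (assumed layout)."""
    bits = int(text, 16)
    sign = -1.0 if (bits >> 79) & 1 else 1.0
    exponent = (bits >> 64) & 0x7FFF
    significand = bits & 0xFFFFFFFFFFFFFFFF  # includes the explicit integer bit
    if exponent == 0x7FFF:
        raise ValueError("infinity and NaN are not handled in this sketch")
    # value = significand * 2**(exponent - 16383 - 63); subnormals use exponent 1
    return sign * math.ldexp(float(significand), max(exponent, 1) - 16383 - 63)


def parse_results(csv_text):
    """Return one dictionary per result row, keyed by the CSV header names."""
    return list(csv.DictReader(io.StringIO(csv_text)))


sample = (
    "name,host,compiler,optl,switches,precision,score_hex,score,"
    "resultfile,comparison_hex,comparison,file,nanosec\n"
    "TrianglePHeron,bihexal,g++,-O2,-funsafe-math-optimizations,f,"
    "0x3ff3e400000000000000,0.00043487548828125,NULL,NULL,NULL,devrun,6137\n"
)

rows = parse_results(sample)
for row in rows:
    decoded = decode_score_hex(row["score_hex"])
    print(row["name"], row["precision"], row["nanosec"], decoded)
```

For the sample rows in this document, the decoded value agrees with the decimal score column, which is what motivates the assumed layout.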

Verbose Output

When writing your tests, you can use the flit::info_stream stream to output useful debug information, such as intermediate answers and expected results. By default, everything sent to flit::info_stream is suppressed. To print these messages to the console, pass the --verbose option to the test executable:

$ ./devrun --no-timing --verbose --precision double SinInt
SinInt: score       = 1
SinInt: score - 1.0 = 0
SinInt-d: # runs = 1
name,host,compiler,optl,switches,precision,score_hex,score,resultfile,comparison_hex,comparison,file,nanosec
SinInt,bihexal,g++,-O2,-funsafe-math-optimizations,d,0x3fff8000000000000000,1,NULL,NULL,NULL,devrun,0
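With --verbose, the debug messages and the CSV results share the same output, so a script that consumes the results may want to separate them. The following is a minimal sketch, assuming that the debug lines come before the CSV header and that the header begins with name,host,compiler, as in the sample output above. The function name split_verbose_output is hypothetical.

```python
CSV_HEADER_PREFIX = "name,host,compiler,"  # assumed from the sample output


def split_verbose_output(output):
    """Split captured output into (debug_lines, csv_lines)."""
    lines = output.splitlines()
    for index, line in enumerate(lines):
        if line.startswith(CSV_HEADER_PREFIX):
            # Everything before the header is debug text; the rest is CSV.
            return lines[:index], lines[index:]
    return lines, []  # no CSV header found


sample_output = """SinInt: score       = 1
SinInt: score - 1.0 = 0
SinInt-d: # runs = 1
name,host,compiler,optl,switches,precision,score_hex,score,resultfile,comparison_hex,comparison,file,nanosec
SinInt,bihexal,g++,-O2,-funsafe-math-optimizations,d,0x3fff8000000000000000,1,NULL,NULL,NULL,devrun,0
"""

debug_lines, csv_lines = split_verbose_output(sample_output)
print(len(debug_lines), "debug lines,", len(csv_lines) - 1, "result rows")
```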

Timing

Each test is timed in order to profile the performance benefits of each of the compilations. This is turned on by default, but can be turned off with the --no-timing command-line option.

The timing functionality is implemented in the same way as the timeit module from Python. In essence, it runs the test in a loop until the total time is at least 0.2 seconds, then computes and returns the average runtime. It repeats this procedure three times and reports the lowest of the three averages. Taking the lowest average reduces the influence of noise from other activity on the machine.

Because timing can be influenced if many things are executing together, when you execute your tests with flit make, the compilations will be done in parallel, but the tests will be executed sequentially so that they do not interfere with timing measurements. This behavior can be overridden.

Instead of having the number of loops for your tests be automatically determined or the number of times to repeat the timing be set to the default of three, you can specify exactly how many times you would like to run them. This is controlled with the following options:

  • -l LOOPS, --timing-loops LOOPS: specify how many times to loop over running tests instead of automatically determining how many loops will make it take at least 0.2 seconds.
  • -r REPEATS, --timing-repeats REPEATS: specify how many times to repeat the timing. The final time is the smallest average time returned from the looping.
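When driving the test executable from a script, the options covered in this document can be assembled into an argument list. The sketch below is one way to do that, using only the options named above. The function devrun_command and its parameter names are hypothetical and are not part of FLiT.

```python
PRECISIONS = ("float", "double", "long double", "all")


def devrun_command(tests=(), precision="all", verbose=False, timing=True,
                   loops=None, repeats=None, executable="./devrun"):
    """Build an argument list for a test executable such as ./devrun."""
    if precision not in PRECISIONS:
        raise ValueError(f"unknown precision: {precision!r}")
    command = [executable]
    if not timing:
        command.append("--no-timing")
    if verbose:
        command.append("--verbose")
    if precision != "all":  # "all" is the default, so it can be omitted
        command += ["--precision", precision]
    if loops is not None:
        command += ["--timing-loops", str(loops)]
    if repeats is not None:
        command += ["--timing-repeats", str(repeats)]
    command += list(tests)  # test names go last, as in the examples above
    return command


print(devrun_command(["TrianglePHeron"], precision="float", loops=100, repeats=5))
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting problems with the two-word value long double.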

Here is some pseudo-code of how this works:

if timing:
    avg_times = []                      # one average per repeat
    for _ in range(repeats):
        if loops_are_specified:
            start = time()
            for _ in range(loops):
                run_test()
            end = time()
            avg_times.append((end - start) / loops)
        else:
            loops = 1
            while True:
                start = time()
                for _ in range(loops):
                    run_test()
                end = time()
                total_time = end - start
                if total_time >= 0.2:   # seconds
                    break
                loops *= 10             # too fast to measure, so loop more
            avg_times.append(total_time / loops)
    min_avg_time = min(avg_times)       # report the lowest average
else:
    run_test()
    min_avg_time = 0.0                  # seconds
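The timing procedure described in this section can also be written as runnable Python. This is a sketch that follows the prose description (loop until the total time reaches the threshold, then keep the lowest average over the repeats); it is not FLiT's actual implementation. The names time_test, time_loops, and min_total are hypothetical.

```python
import time


def time_loops(run_test, loops):
    """Return the total wall-clock seconds taken by `loops` calls of run_test."""
    start = time.perf_counter()
    for _ in range(loops):
        run_test()
    return time.perf_counter() - start


def time_test(run_test, loops=None, repeats=3, min_total=0.2):
    """Return the lowest average runtime in seconds over `repeats` rounds."""
    avg_times = []
    for _ in range(repeats):
        if loops is not None:
            count = loops  # caller fixed the loop count
            total = time_loops(run_test, count)
        else:
            count = 1      # grow by powers of ten until the run is long enough
            total = time_loops(run_test, count)
            while total < min_total:
                count *= 10
                total = time_loops(run_test, count)
        avg_times.append(total / count)
    return min(avg_times)


# Fixed loop count: 100 loops, repeated 3 times.
print(time_test(lambda: sum(range(1000)), loops=100, repeats=3))
```

Passing loops corresponds to the --timing-loops option and repeats to --timing-repeats; leaving loops unset gives the automatic behavior.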
