Debugging Strategies

robertjharrison edited this page Nov 13, 2014 · 2 revisions

Tips to find low-level bugs

The basic programming model is software data flow: each sub-task (a node on the execution graph) is guaranteed to complete, and all dependencies (edges on the execution graph) are managed by Futures passed as arguments and return values. Thus, debugging mostly revolves around ensuring that

  • data structures are being maintained in a consistent state,
  • any task once started is guaranteed to complete independent of any other activity (it will not be started until its arguments are ready), and that
  • all dependencies will eventually be satisfied.
If you think about the execution as a DAG (directed acyclic graph), these conditions are fairly easy to satisfy as long as the pertinent data structures are in a consistent state. If you know how to write a sequential algorithm in this model, the parallel execution should be automatically correct.
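As a rough illustration of the graph view, here is a minimal sketch that uses only the C++ standard library (std::future and std::async), not the MADNESS Future or task API. The function names square, add, and sum_of_squares are hypothetical. One difference from the model described above: a std::async task starts right away and waits inside get(), whereas the text says a task is not started until its arguments are ready.

```cpp
#include <cassert>
#include <future>
#include <utility>

// Leaf task: a pure function of its argument, so it needs nothing
// from outside the execution graph.
int square(int x) { return x * x; }

// Interior task: its two futures are the incoming edges of the graph.
// With std::future the task waits in get() until each dependency is
// satisfied (an assumption of this sketch, not how MADNESS schedules).
int add(std::future<int> a, std::future<int> b) {
    assert(a.valid() && b.valid());  // both edges must be connected
    return a.get() + b.get();
}

// Builds and runs the graph
//   square(p) --\
//                +--> add
//   square(q) --/
int sum_of_squares(int p, int q) {
    std::future<int> a = std::async(std::launch::async, square, p);
    std::future<int> b = std::async(std::launch::async, square, q);
    // Futures are move-only, so ownership of each edge passes to add.
    std::future<int> c =
        std::async(std::launch::async, add, std::move(a), std::move(b));
    return c.get();
}
```

Because every node is a pure function of its inputs, calling sum_of_squares(3, 4) gives the same answer as the sequential expression 3*3 + 4*4, whatever order the two leaf tasks happen to run in.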

If the computation depends on information outside of the execution graph (i.e., the methods invoked are not pure functions), then more factors must be considered. An example is local computation driven off a (local or global) data structure that someone else might modify. This is the usual shared-memory problem. I am still seeking the right set of abstractions; iterators may let us squeeze it into an acceptable and avoidable corner.
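To make the shared-memory concern concrete, the following sketch shows several tasks modifying one structure that lives outside the execution graph. It uses only the C++ standard library. SharedLog and fill_concurrently are hypothetical names, and the mutex is just one conventional way to keep the structure consistent, not the abstraction the text is looking for.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <mutex>
#include <vector>

// Hypothetical shared structure that tasks reach outside the graph.
struct SharedLog {
    std::mutex mtx;
    std::vector<int> values;

    // Every access takes the lock, so the vector is never seen
    // in a half-updated state.
    void append(int v) {
        std::lock_guard<std::mutex> lock(mtx);
        values.push_back(v);
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mtx);
        return values.size();
    }
};

// Launches n tasks that all modify the same structure, waits for
// every one of them, and reports how many entries were recorded.
std::size_t fill_concurrently(SharedLog& log, int n) {
    assert(n >= 0);
    std::vector<std::future<void>> pending;
    for (int i = 0; i < n; ++i) {
        pending.push_back(
            std::async(std::launch::async, [&log, i] { log.append(i); }));
    }
    for (auto& f : pending) f.get();  // all dependencies satisfied
    return log.size();
}
```

Without the lock, concurrent push_back calls would be a data race and the final size would be unpredictable. With it, the count is well defined, but the order of the entries still depends on scheduling, which is exactly the kind of information a pure-function task never relies on.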

Command line options

  • "-dx": on machines (e.g., Linux clusters) with both gdb and X-windows, this launches an xterm for each process with gdb attached to it (see "Using debuggers" below). You need to have compiled with the -DHAVE_XTERM_DEBUG flag for this to work.
  • "-rio": redirects standard output and standard error from each process to its own file named "log.<processID>". This stops the output from multiple processes interleaving and also avoids losing terminal output on error termination when buffers in MPI, etc., are not flushed.
  • "-dam": prints a message whenever an active message is sent or received. Be aware that this generates a lot of output.

Using debuggers

  • "catch throw": very useful inside GDB for stopping at the point where a C++ exception is thrown.
  • For debuggers that cannot catch C++ exceptions, configure MADNESS with "MADNESS_ASSERTIONS ABORT" on the doconfig command line. This will cause MADNESS ASSERTIONS to print a message and then call a routine with address 0x0 to force a SEGV.
  • GDB has some useful commands for threads:
    • "thread threadno": switches to thread threadno (numbered from 1)
    • "info threads": prints the current location of each thread
    • "thread apply [threadno] [all] args": applies a command to the specified threads

Translating addresses and symbols

GNU binutils under Linux includes several very useful tools:

  • "addr2line -e executable hexaddress": translates an address inside the executable into a source file and line number. Particularly useful if you have the address of a function to be invoked by an active message handler.
  • "c++filt": demangles C++ symbol names. C++ symbols are given complex names in the actual executable file to support templates, type safety, and orthogonality of namespaces. This tool translates them back into something more familiar.
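If you would rather demangle names from inside a program than pipe them through c++filt, GCC and Clang expose the same facility through abi::__cxa_demangle in <cxxabi.h>. The wrapper below is a minimal sketch under that assumption. The function name demangle is hypothetical, and the header is not available with every compiler.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of an Itanium-ABI symbol name, or the
// input unchanged if the name cannot be demangled.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc; the caller must release it with free.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;  // not a valid mangled name
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

For example, demangle("_ZN3foo3barEi") yields "foo::bar(int)", the same text c++filt prints for that symbol.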