Skip to content
Justus Calvin edited this page May 24, 2014 · 1 revision

Table of Contents

M-A-D-N-E-S-S Task Profiler

MADNESS includes a task profiler, which is an event-based profiler. It records submit, start, and stop times for all tasks run by the program, which is used to create a task trace files. Task profiles can then be generated with these trace files.

Enabling Task Profiler

To enable the task profiler, configure MADNESS with the --enable-task-profiler option.

Generating Task Trace Files

When the task profiler is enabled, the task manager will automatically collect timing information for all tasks. The data is written to a text file before the program exits. The output file is specified with the MAD_TASKPROFILER_NAME environment variable. For example:

export MAD_TASKPROFILER_NAME=<file name>
or for C shell
setenv MAD_TASKPROFILER_NAME <file name>
If this environment variable is not specified, the timing data is still collected but no task profile data will be output. You will also see the following warning at the beginning of your program.
!!! WARNING: MAD_TASKPROFILER_NAME not set.
!!! WARNING: There will be no task profile output.
The output file specified will be appended with MPI rank and the number of threads in your application. For example, if your file name is “world_profile” and you are using 8 threads on 2 nodes, the following files will be generated:
world_profile_0x8
world_profile_1x8
one on each node.

Generate task profiles

You can easily generate a flat profile with the taskprofile.pl script (located in src/bin):

$ perl taskprofile.pl world_profile_*
Which will parse all files that match world_profile_* and combine the data. A sample profile from src/lib/world/world is given below.
Wall time:         5.122230 (s)
Total task time:   1.583669 (s)

                                Time (s)
name   counts   total    average      %      latency
val1d_func(int)
     100000   1.4894500  0.0000149    29    0.0000174
val_func()
     100000   0.0634100  0.0000006     1    0.0000123
null_func()
     100000   0.0158480  0.0000002     0    0.0000205
pounder(madness::WorldContainer<int, Mary, madness::Hash<int> > const&, int)
          7   0.0142600  0.0020371     0    0.0000500
TTT::kate(madness::World*, std::string const&, double)
          2   0.0002130  0.0001065     0    0.0000790
TTT::mary()
          1   0.0001820  0.0001820     0    0.0000540
TTT::carl()
          1   0.0001450  0.0001450     0    0.0000610
UNKNOWN
          1   0.0000430  0.0000430     0    0.0000420
TTT::fred()
          1   0.0000260  0.0000260     0    0.0000590
madness::tr1::detail::memfunc_traits<double (Mary::*)(std::string const&, int, double)>::result_type madness::WorldContainerImpl<int, Mary, madness::Hash<int> >::itemfun<double (Mary::*)(std::string const&, int, double), std::string, int, double>(int const&, double (Mary::*)(std::string const&, int, double), std::string const&, int const&, double const&)
          1   0.0000200  0.0000200     0    0.0000100
madness::tr1::detail::memfunc_traits<std::string (Mary::*)(int, int)>::result_type madness::WorldContainerImpl<int, Mary, madness::Hash<int> >::itemfun<std::string (Mary::*)(int, int), int, int>(int const&, std::string (Mary::*)(int, int), int const&, int const&)
          1   0.0000200  0.0000200     0    0.0000400
TTT::jody(double, double, double)
          1   0.0000050  0.0000050     0    0.0000580
TTT::hugh(std::vector<madness::Future<int>, std::allocator<madness::Future<int> > >&)
          1   0.0000050  0.0000050     0    0.0000170
Foo::get3c(int, char, short) const
          1   0.0000040  0.0000040     0    0.0000140
TTT::dave(madness::World*)
          1   0.0000040  0.0000040     0    0.0000150
TTT::sara(double, std::complex<double> const&)
          1   0.0000040  0.0000040     0    0.0000690
Foo::get0f()
          1   0.0000030  0.0000030     0    0.0000170
dumb(int, int, int, int, int, int, int)
          1   0.0000030  0.0000030     0    0.0000080
Foo::get0()
          1   0.0000030  0.0000030     0    0.0000140
TTT::bert(int)
          1   0.0000030  0.0000030     0    0.0000380
Foo::get1c(int) const
          1   0.0000020  0.0000020     0    0.0000130
Foo::get4c(int, char, short, long) const
          1   0.0000020  0.0000020     0    0.0000160
Foo::get4(int, char, short, long)
          1   0.0000020  0.0000020     0    0.0000130
Foo::get5c(int, char, short, long, short) const
          1   0.0000020  0.0000020     0    0.0000130
Foo::get3(int, char, short)
          1   0.0000020  0.0000020     0    0.0000130
Foo::get1(int)
          1   0.0000020  0.0000020     0    0.0000220
Foo::get0c() const
          1   0.0000020  0.0000020     0    0.0000160
Foo::get2(int, char)
          1   0.0000020  0.0000020     0    0.0000130
Foo::get2c(int, char) const
          1   0.0000010  0.0000010     0    0.0000140
Foo::get5(int, char, short, long, short)
          1   0.0000010  0.0000010     0    0.0000130
0x0
          7   0.0000000  0.0000000     0    0.0000757