Skip to content
bartoszbosak edited this page Oct 12, 2016 · 1 revision

The C API of MUSCLE is a subset of the C++ API, the C API is again a subset of the Java API. Because the core of MUSCLE is written in Java, any C++ executable will communicate with Java code, but this is hidden in the API. In both cases, the include path is `$MUSCLE_HOME/include`.

C API

The header file for the C API is `muscle2/cmuscle.h`. The functions there are limited, but they should be sufficient for most needs:

muscle_error_t MUSCLE_Init(int* argc, char*** argv);
void MUSCLE_Finalize(void);

const char* MUSCLE_Kernel_Name();
const char* MUSCLE_Get_Property(const char* name);
int MUSCLE_Will_Stop();

muscle_error_t MUSCLE_Send(const char *exit_name, void *array, size_t size, muscle_datatype_t type);
void* MUSCLE_Receive(const char *entrance_name, void *array, size_t *size, muscle_datatype_t type);

int MUSCLE_Barrier_Init(char **barrier, int *len, int num_mpi_ranks);
int MUSCLE_Barrier(const char *barrier);
void MUSCLE_Barrier_Destroy(char *barrier);
The `MUSCLE_Init` function must be called before any of the other MUSCLE functions is used. It initializes MUSCLE given the current arguments of main. After all MUSCLE calls have been made, MUSCLE_Finalize should be called. Usually:
#include <muscle2/cmuscle.h>

int main(int argc, char *argv[])
{
    MUSCLE_Init(&argc, &argv);
    ...
    MUSCLE_Finalize();
    return 0;
}
After this, any code can be written. The `MUSCLE_Kernel_Name` function gives the current kernel name. This must NOT be freed after use. Similarly, MUSCLE_Get_Property gives a property as a string, which must NOT be freed afterwards.

The send and receive functions can transport several datatypes, which are enumerated in `muscle2/muscle_types.h`. For reference, it takes the following datatypes

`muscle_datatype_t` C/C++ data type Java data type maximum size
`MUSCLE_DOUBLE` `double *` `double[]` 134e6 values
`MUSCLE_FLOAT` `float *` `float[]` 268e6 values
`MUSCLE_INT32` `int *` `int[]` 268e6 values
`MUSCLE_INT64` `long *` `long[]` 134e6 values
`MUSCLE_STRING` `const char *` `String` 64e3 characters
`MUSCLE_RAW` `unsigned char *` `byte[]` 268e6 values
`MUSCLE_COMPLEX` `muscle::ComplexData *` any other object 1 GiB

The type `MUSCLE_COMPLEX` is only available for C++ code. `MUSCLE_STRING` will send data up to the first nul-terminator or given length, whichever is first.

Receiving a message can be done in two ways: either the memory is initialized beforehand and the number of elements in the array is given as the third argument, or a 0-pointer is passed, in which case MUSCLE will allocate the memory. In both cases, the memory must be freed by the user. For example:

size_t sz = 100;
double *data = (double *)malloc(sz*sizeof(double));
if (data) {
    for (int i = 0; !MUSCLE_Will_Stop(); i++) {
        MUSCLE_Receive("exitName", data, &sz, MUSCLE_DOUBLE);
        // sz will be set to the actual size of the received message
        // do something with the data
    }
    free(data);
}
or
size_t sz = 0;
for (int i = 0; !MUSCLE_Will_Stop(); i++) {
    double *data = (double *)MUSCLE_Receive("exitName", (void *)0, &sz, MUSCLE_DOUBLE);
    // do something with the data
    // data will have length sz;
    free(data);
}
The first example will only work if the upper bound of received message size is known in advance. The second example is safer since MUSCLE will allocate the necessary data for you based on the message size. However, it also means that new memory is allocated for each message.

The `MUSCLE_Send` command can only be used one way, and again starts with the conduit entrance name, then the data, the number of elements in the array and finally the type of data.

double data[100];
size_t sz = 100;
for (int i = 0; !MUSCLE_Will_Stop(); i++) {
    // do something with the data...
    // send the data
    MUSCLE_Send("entranceName", data, sz, MUSCLE_DOUBLE);
}
// data is on the stack; it will be freed automatically

MUSCLE and MPI

Depending on the MPI implementation, MUSCLE_Init should be called before MPI_Init. This may generate a warning about forking, but as long as the submodel produces correct results this can be ignored. When using MPI, MUSCLE calls only do their operations in rank 0, calls from other ranks are ignored. This means that the data should be gathered with MPI before a `MUSCLE_Send` and broadcasted after a `MUSCLE_Receive`. The functions `MUSCLE_Kernel_Name`, `MUSCLE_Get_Property`, and `MUSCLE_Will_Stop` cannot give a meaningful result in other ranks than rank 0 so calling them from other ranks results in undefined behavior (and should be prevented). Their result can then be propagated with MPI in your code, if needed.

The `MUSCLE_Barrier` set of functions ease the integration of MUSCLE with MPI. Most MPI functions (including barrier, broadcast and gather) use a polling mechanism when they wait for communication to happen. This will use all the available CPU power but will somewhat reduce the latency of the operation. However, with MUSCLE, often other submodels than the MPI submodel should do some computing, and while the MPI operation waits this will slow down other submodels immensely. Therefore, MUSCLE has its own barrier operation, which has a higher latency than MPI_Barrier, but will not use any CPU resources. Since only rank 0 of the process ever receives data from MUSCLE, and a receive must wait for another submodel to send the message, that is a good point for calling a barrier. If multiple receives follow each other, barrier only needs to be called after the last one. Note that only ranks other than 0 are stopped by the barrier, rank 0 will not stop.

The MUSCLE_Barrier API has three parts: `MUSCLE_Barrier_Init(char **barrier, size_t *len, int num_mpi_procs)`, `MUSCLE_Barrier(const char *barrier)` and `MUSCLE_Barrier_Destroy(char *barrier)`. In Init the barrier is created. It needs the number of MPI processes since MUSCLE itself does not do any MPI calls and has no other way to find out. The barrier data structure should then be broadcasted with MPI by the user so that each process uses the same barrier. Each time MUSCLE_Barrier is called, it waits until all ranks have called it. `MUSCLE_Barrier_Init` returns -1 in rank 0 if it fails, and 0 otherwise. `MUSCLE_Barrier` returns -1 in any rank that it fails in, but that does not guarantee that it failed in other ranks. In Destroy the resources of the barrier are cleaned up.

A typical piece of code could look as follows:

int mpi_size;
MPI_Comm_size(&mpi_size, MPI_COMM_WORLD);

char *barrier;
size_t barrier_len;
MUSCLE_Barrier_Init(&barrier, &barrier_len, mpi_size);
MPI_Bcast(barrier, barrier_len, MPI_CHAR, 0, MPI_COMM_WORLD);

while (!EC) {
    // Only receives in rank 0, it is ignored for other ranks
    MUSCLE_Receive(data, ...);
    MUSCLE_Barrier(barrier);
    MPI_Bcast(data, ...)

    // do something

    // Only sends in rank 0, it is ignored in other ranks
    MUSCLE_Send(data, ...)
}
MUSCLE_Barrier_Destroy(barrier);
This paradigm is used in `src/cpp/examples/simplempi/sender.c`.

C++ API

The C++ API resembles the C API but it is more extensive. Relevant header files are `muscle2/cppmuscle.hpp` and `muscle2/complex_data.hpp`. All functions included `cppmuscle.hpp` are static, so the API can only be used for one submodel per executable. The following functions are available:

muscle_error_t muscle::env::init(int* argc, char ***argv);
void muscle::env::finalize(void);

bool muscle::env::will_stop(void);

void muscle::env::send(std::string entrance_name, const void *data, size_t count, muscle_datatype_t type);
void muscle::env::sendDoubleVector(std::string entrance_name, const std::vector<double>& data);

void* muscle::env::receive(std::string exit_name, void *data, size_t &count, muscle_datatype_t type);
std::vector<double> muscle::env::receiveDoubleVector(std::string exit_name);

void muscle::env::free_data(void *ptr, muscle_datatype_t type);

std::string muscle::env::get_tmp_path(void);

std::string muscle::cxa::get_property(std::string name);
std::string muscle::cxa::get_properties(void);
std::string muscle::cxa::kernel_name(void);

int muscle::util::barrier_init(char **barrier, size_t *len, int num_mpi_procs);
int muscle::util::barrier(const char *barrier);
void muscle::util::barrier_destroy(char *barrier);
`init`, `finalize`, `will_stop`, `send`, `receive`, `get_property`, `kernel_name`, `barrier_init`, `barrier`, and `barrier_destroy` behave exactly as their C counterparts. Since scientific data is often vectors of doubles, there are two new convenience functions to send or receive only double vectors. Vectors are safer than arrays memory-wise and they do range-checking. If other vector methods are of interest to you, send us mailto:j.borgdorff@uva.nl. Except for vectors, the data that is received must be freed by MUSCLE by calling `free_data` with the received pointer and the datatype of the received pointer.

Complex data

In addition to the datatypes that are available in the C API, the C++ API can also receive a `MUSCLE_COMPLEX` type. This will receive a `muscle::ComplexData` object, defined in the `muscle2/complex_data.hpp` header file. This `ComplexData` is used when the sender of a message sends different data than an array. A Java sender might send a 2-D array of doubles, for instance. In the enum `muscle_complex_t` all possible types are listed. For the moment only arrays and matrices are supported. Working with `ComplexData` goes as follows, for example with a two-dimensional double array

#include <vector>
#include <stdexcept>
#include <muscle2/cppmuscle.hpp>
#include <muscle2/complex_data.hpp>

using namespace std;
using namespace muscle;

double processMessage()
{
    size_t sz;
    ComplexData *cdata = (ComplexData *) muscle::env::receive("matrixIn", (void *)0, &sz, MUSCLE_COMPLEX);
    if (cdata->getType() != COMPLEX_DOUBLE_MATRIX_2D) {
        throw new runtime_error("Expecting double 2D matrix; received something else.");
    }
    vector<int> dims = cdata->getDimensions();

    // get the matrix as a vector
    double *data = (double *) cdata->getData();

    // Do something with the data
    double result = 0;
    for (int x = 0; x < dims[0]; x++) {
        for (int y = 0; y < dims[1]; y++) {
           // the matrix is indexed according to the different dimensions
           result += data[cdata->fidx(x,y)];
        }
    }
    muscle::env::free_data(cdata, MUSCLE_COMPLEX);
    return result;
}
The indices of the 2-D array are retrieved using the `index` or `fidx` function, where the latter is faster but does not do range or dimensionality checking. To send a `ComplexData` object it must first be constructed. The easiest way is to let `ComplexData` do the memory allocation and destruction, and getting the allocated memory from `ComplexData` to perform actual actions with.
vector<int> dims(2);
// Set dimensions x and y
dims[0] = 10; dims[1] = 14;
ComplexData cdata(COMPLEX_DOUBLE_MATRIX_2D, &dims);
double *data = (double *)cdata.getData();

// Do something with the data
double result = 0;
for (int x = 0; x < dims[0]; x++) {
    for (int y = 0; y < dims[1]; y++) {
       // the matrix is indexed according to the different dimensions, using fidx
       result += data[cdata.fidx(x,y)];
    }
}
muscle::env::send("matrixOut", &cdata, cdata.length(), MUSCLE_COMPLEX);

Logger

MUSCLE provides logging facilities for C++. The functions are all static.

void muscle::logger::severe(char *message, ...);
void muscle::logger::warning(char *message, ...);
void muscle::logger::info(char *message, ...);
void muscle::logger::config(char *message, ...);
void muscle::logger::fine(char *message, ...);
void muscle::logger::finer(char *message, ...);
void muscle::logger::finest(char *message, ...);
Each takes a message, and optionally more arguments provided in printf syntax. For instance:
muscle::logger::info("the temporary path is %s", muscle::env::get_tmp_path().c_str());
This will print the message to stdout and prepend the current kernel name and the log level. The log is written to `TMP_DIR/[instance]/[instance].native.log`.
Clone this wiki locally