COSTA v1.0
This is the very first release of COSTA, bringing the following features:
- scalapack wrappers: for redistribute (pxgemr2d) and transpose (pxtran(u)).
- different layouts support: added representation for block-cyclic and arbitrary matrix layouts.
- multiple layouts: can transform multiple layouts at once, i.e. in the same communication round.
- comm-optimal: can minimize the communication volume by relabeling the processes of the target layout.
- scaling & transpose: in addition to redistributing the matrix, can scale the initial and final layouts and transpose them.
- highly optimized: optimized for distributed and multithreaded settings.
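
As a rough illustration of two of the features above, the sketch below shows how a ScaLAPACK-style block-cyclic layout maps a global matrix element to its owning process, and gives a serial reference for a scaled transpose. It is not COSTA's API: the function names and parameters (`block_cyclic_owner`, `scale_transpose_reference`, `alpha`, `beta`) are hypothetical, and the `alpha`/`beta` form assumes the usual pxtran-style convention `B := alpha * A^T + beta * B`.

```python
def block_cyclic_owner(i, j, mb, nb, nprow, npcol, rsrc=0, csrc=0):
    """Return the (process_row, process_col) that owns global element (i, j).

    Indices are 0-based. mb x nb is the block size, nprow x npcol the
    process grid, and (rsrc, csrc) the process holding the first block.
    Hypothetical helper following the standard block-cyclic convention.
    """
    prow = (i // mb + rsrc) % nprow
    pcol = (j // nb + csrc) % npcol
    return prow, pcol


def scale_transpose_reference(a, b, alpha, beta):
    """Serial reference for b := alpha * a^T + beta * b.

    a and b are lists of lists; a has shape (n, m) and b has shape (m, n).
    Assumed semantics for illustration only; a distributed implementation
    would apply the same formula to the locally owned blocks.
    """
    rows, cols = len(b), len(b[0])
    return [
        [alpha * a[j][i] + beta * b[i][j] for j in range(cols)]
        for i in range(rows)
    ]


if __name__ == "__main__":
    # Element (5, 2) with 2x2 blocks on a 2x2 process grid.
    print(block_cyclic_owner(5, 2, mb=2, nb=2, nprow=2, npcol=2))
    # Scaled transpose of a small 2x2 matrix.
    print(scale_transpose_reference([[1, 2], [3, 4]], [[1, 1], [1, 1]], 2, 3))
```

A redistribution between two block-cyclic layouts amounts to evaluating this owner mapping for both the initial and the final layout and sending each element (in practice, each overlapping block) from the old owner to the new one.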