Merge pull request #1123 from DARMA-tasking/875-phase-manager-component
875 Implement phase manager component
lifflander authored Oct 23, 2020
2 parents 386023a + 9ee2b0b commit 01ccf10
Showing 50 changed files with 875 additions and 1,283 deletions.
1 change: 1 addition & 0 deletions cmake/define_build_types.cmake
@@ -53,6 +53,7 @@ set(
CatEnum::lb | \
CatEnum::vrt_coll | \
CatEnum::group | \
CatEnum::phase | \
CatEnum::broadcast \
"
)
7 changes: 4 additions & 3 deletions docs/md/lb-manager.md
@@ -3,9 +3,10 @@

The LB manager component `vt::vrt::collection::balance::LBManager`, accessed via
`vt::theLBManager()`, manages and coordinates instances of load balancers. It
counts collections as they call `nextPhase` to ensure they are all ready before
load balancing begins. It reads the command-line arguments or LB specification
file to determine which load balancer to run.
will potentially start load balancing after a "phase" is completed; refer to
\ref phase for details about how to delineate phases in an application. The LB
manager reads command-line arguments or an LB specification file to determine
which load balancer to run at a given phase.

To enable load balancing, the cmake flag \code{.cmake} -Dvt_lb_enabled=1
\endcode should be passed during building. This also enables automatic
17 changes: 17 additions & 0 deletions docs/md/phase.md
@@ -0,0 +1,17 @@
\page phase Phase Manager
\brief Manage phases of time

The phase manager component `vt::phase::PhaseManager`, accessed via
`vt::thePhase()`, delineates collective intervals of time across all nodes.
Load balancing and other components use phase boundaries as the point at which
to perform operations such as work redistribution, output of statistical data,
or flushing of trace data.

The main user interface is a call to `thePhase()->nextPhaseCollective()`, which
starts the next phase after performing a reduction. Thus, any work that belongs
in the preceding phase should be synchronized by the user before this call
(e.g., by calling `vt::runInEpochCollective`).

System components and applications can register hooks with the phase manager to
be notified when a phase starts, when it ends, and after migrations have
completed.
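
The hook lifecycle described in this page can be sketched in plain C++ without the runtime. This is a single-process illustration under stated assumptions, not vt's implementation: `ToyPhaseManager`, `registerHook`, `unregisterHook`, `nextPhase`, and `demoHookOrder` are hypothetical names, there is no reduction or migration, and the firing order (End, then EndPostMigration, then Start of the following phase) is an assumption inferred from the hook descriptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hook points named after the PhaseHook enum in this commit.
enum struct PhaseHook : std::int8_t { Start, End, EndPostMigration };

// Hypothetical single-process stand-in; NOT vt::phase::PhaseManager.
class ToyPhaseManager {
public:
  using Action = std::function<void()>;

  // Register an action for a hook point and return a local ID for it.
  std::size_t registerHook(PhaseHook type, Action action) {
    std::size_t const id = next_id_++;
    hooks_[type].emplace(id, std::move(action));
    return id;
  }

  // Remove a previously registered action; unknown IDs are ignored.
  void unregisterHook(PhaseHook type, std::size_t id) {
    hooks_[type].erase(id);
  }

  // End the current phase and start the next one (assumed ordering).
  void nextPhase() {
    runHooks(PhaseHook::End);
    // A real runtime would synchronize and possibly migrate work here.
    runHooks(PhaseHook::EndPostMigration);
    ++cur_phase_;
    runHooks(PhaseHook::Start);
  }

  std::uint64_t currentPhase() const { return cur_phase_; }

private:
  void runHooks(PhaseHook type) {
    // Iterate over a copy so an action may unregister hooks safely.
    auto const actions = hooks_[type];
    for (auto const& entry : actions) {
      entry.second();
    }
  }

  std::map<PhaseHook, std::map<std::size_t, Action>> hooks_;
  std::size_t next_id_ = 0;
  std::uint64_t cur_phase_ = 0;
};

// Registers one action per hook point, advances one phase, returns the log.
std::vector<std::string> demoHookOrder() {
  ToyPhaseManager mgr;
  std::vector<std::string> log;
  mgr.registerHook(PhaseHook::Start, [&log] { log.push_back("start"); });
  mgr.registerHook(PhaseHook::End, [&log] { log.push_back("end"); });
  mgr.registerHook(
    PhaseHook::EndPostMigration, [&log] { log.push_back("post-migration"); }
  );
  mgr.nextPhase();
  assert(mgr.currentPhase() == 1);
  return log;
}
```

Calling `demoHookOrder()` records the order in which the three actions fire across one phase boundary. In the real component the boundary is collective, so the application must finish the preceding phase's work on every node before advancing.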
1 change: 1 addition & 0 deletions docs/md/vt.md
@@ -53,6 +53,7 @@ management.
| \subpage param | `vt::theParam()` | \copybrief param | @m_class{m-label m-danger} **Experimental** |
| \subpage pipe | `vt::theCB()` | \copybrief pipe | @m_class{m-label m-success} **Core** |
| \subpage node-stats | `vt::theNodeStats()` | \copybrief node-stats | @m_class{m-label m-warning} **Optional** |
| \subpage phase | `vt::thePhase()` | \copybrief phase | @m_class{m-label m-success} **Core** |
| \subpage pool | `vt::thePool()` | \copybrief pool | @m_class{m-label m-success} **Core** |
| \subpage rdma | `vt::theRDMA()` | \copybrief rdma | @m_class{m-label m-danger} **Experimental** |
| \subpage rdmahandle | `vt::theHandleRDMA()` | \copybrief rdmahandle | @m_class{m-label m-warning} **Optional** |
19 changes: 2 additions & 17 deletions examples/collection/lb_iter.cc
@@ -50,8 +50,6 @@ static int32_t num_iter = 8;
struct IterCol : vt::Collection<IterCol, vt::Index1D> {
IterCol() = default;

using EmptyMsg = vt::CollectionMessage<IterCol>;

struct IterMsg : vt::CollectionMessage<IterCol> {
IterMsg() = default;
explicit IterMsg(int64_t const in_work_amt, int64_t const in_iter)
@@ -64,14 +62,6 @@ struct IterCol : vt::Collection<IterCol, vt::Index1D> {

void iterWork(IterMsg* msg);

void runLB(EmptyMsg* msg) {
auto const idx = getIndex();
auto proxy = getCollectionProxy();
proxy[idx].LB<EmptyMsg,&IterCol::doneLB>();
}

void doneLB(EmptyMsg* msg) { }

template <typename SerializerT>
void serialize(SerializerT& s) {
vt::Collection<IterCol, vt::Index1D>::serialize(s);
@@ -135,20 +125,15 @@ int main(int argc, char** argv) {
auto cur_time = vt::timing::Timing::getCurrentTime();

vt::runInEpochCollective([=]{
if (this_node == 0)
proxy.broadcast<IterCol::IterMsg,&IterCol::iterWork>(10, i);
proxy.broadcastCollective<IterCol::IterMsg,&IterCol::iterWork>(10, i);
});

auto total_time = vt::timing::Timing::getCurrentTime() - cur_time;
if (this_node == 0) {
fmt::print("iteration: iter={},time={}\n", i, total_time);
}

vt::runInEpochCollective([=]{
if (this_node == 0)
proxy.broadcast<IterCol::EmptyMsg,&IterCol::runLB>();
});

vt::thePhase()->nextPhaseCollective();
}

vt::finalize();
1 change: 1 addition & 0 deletions src/CMakeLists.txt
@@ -49,6 +49,7 @@ set(
termination/interval
termination/graph
messaging/envelope messaging/message
phase
pool/static_sized pool/header
rdma/channel rdma/collection rdma/group rdma/state
rdmahandle
1 change: 1 addition & 0 deletions src/vt/configs/arguments/app_config.h
@@ -195,6 +195,7 @@ struct AppConfig {
bool vt_debug_group = false;
bool vt_debug_broadcast = false;
bool vt_debug_objgroup = false;
bool vt_debug_phase = false;

bool vt_debug_print_flush = false;

3 changes: 3 additions & 0 deletions src/vt/configs/arguments/args.cc
@@ -245,6 +245,7 @@ void ArgConfig::addDebugPrintArgs(CLI::App& app) {
auto bbp = "Enable debug_group = \"" debug_pp(group) "\"";
auto cbp = "Enable debug_broadcast = \"" debug_pp(broadcast) "\"";
auto dbp = "Enable debug_objgroup = \"" debug_pp(objgroup) "\"";
auto dcp = "Enable debug_phase = \"" debug_pp(phase) "\"";

auto r = app.add_flag("--vt_debug_all", config_.vt_debug_all, rp);
auto r1 = app.add_flag("--vt_debug_verbose", config_.vt_debug_verbose, rq);
@@ -279,6 +280,7 @@ void ArgConfig::addDebugPrintArgs(CLI::App& app) {
auto bb = app.add_flag("--vt_debug_group", config_.vt_debug_group, bbp);
auto cb = app.add_flag("--vt_debug_broadcast", config_.vt_debug_broadcast, cbp);
auto db = app.add_flag("--vt_debug_objgroup", config_.vt_debug_objgroup, dbp);
auto dc = app.add_flag("--vt_debug_phase", config_.vt_debug_phase, dcp);
auto debugGroup = "Debug Print Configuration (must be compile-time enabled)";
r->group(debugGroup);
r1->group(debugGroup);
@@ -313,6 +315,7 @@ void ArgConfig::addDebugPrintArgs(CLI::App& app) {
bb->group(debugGroup);
cb->group(debugGroup);
db->group(debugGroup);
dc->group(debugGroup);

auto dbq = "Always flush VT runtime prints";
auto eb = app.add_flag("--vt_debug_print_flush", config_.vt_debug_print_flush, dbq);
4 changes: 3 additions & 1 deletion src/vt/configs/debug/debug_config.h
@@ -80,7 +80,8 @@ enum CatEnum : uint64_t {
group = 1ull<<27,
broadcast = 1ull<<28,
objgroup = 1ull<<29,
gossiplb = 1ull<<30
gossiplb = 1ull<<30,
phase = 1ull<<31,
};

enum CtxEnum : uint64_t {
@@ -129,6 +130,7 @@ vt_option_category_pretty_print(lb, "lb")
vt_option_category_pretty_print(location, "location")
vt_option_category_pretty_print(objgroup, "objgroup")
vt_option_category_pretty_print(param, "parameterization")
vt_option_category_pretty_print(phase, "phase")
vt_option_category_pretty_print(pipe, "pipe")
vt_option_category_pretty_print(pool, "pool")
vt_option_category_pretty_print(reduce, "reduce")
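
The `CatEnum` values in this file are single-bit flags in a 64-bit mask, which is why the new `phase` category takes the next free bit, `1ull<<31`. A minimal sketch of how such a mask can be combined and tested follows. It assumes only the bit positions shown in the diff; `isCategoryEnabled` and `example_mask` are hypothetical names, not part of vt.

```cpp
#include <cassert>
#include <cstdint>

// Subset of the categories; bit positions copied from the diff.
enum CatEnum : std::uint64_t {
  group     = 1ull << 27,
  broadcast = 1ull << 28,
  objgroup  = 1ull << 29,
  gossiplb  = 1ull << 30,
  phase     = 1ull << 31
};

// Hypothetical helper: true when the bit for `cat` is set in `mask`.
constexpr bool isCategoryEnabled(std::uint64_t mask, CatEnum cat) {
  return (mask & static_cast<std::uint64_t>(cat)) != 0;
}

// Categories are combined with bitwise OR, as in define_build_types.cmake.
constexpr std::uint64_t example_mask = CatEnum::group | CatEnum::phase;

static_assert(isCategoryEnabled(example_mask, CatEnum::phase), "phase is set");
static_assert(!isCategoryEnabled(example_mask, CatEnum::broadcast), "broadcast is not set");
```

Writing the literal as `1ull` keeps the shift in 64-bit unsigned arithmetic, so bit 31 and above are not subject to the limits of a 32-bit `int`.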
64 changes: 64 additions & 0 deletions src/vt/phase/phase_hook_enum.h
@@ -0,0 +1,64 @@
/*
//@HEADER
// *****************************************************************************
//
// phase_hook_enum.h
// DARMA Toolkit v. 1.0.0
// DARMA/vt => Virtual Transport
//
// Copyright 2019 National Technology & Engineering Solutions of Sandia, LLC
// (NTESS). Under the terms of Contract DE-NA0003525 with NTESS, the U.S.
// Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
//
// * Redistributions of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// * Redistributions in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// * Neither the name of the copyright holder nor the names of its
// contributors may be used to endorse or promote products derived from this
// software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
// POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact darma@sandia.gov
//
// *****************************************************************************
//@HEADER
*/

#if !defined INCLUDED_VT_PHASE_PHASE_HOOK_ENUM_H
#define INCLUDED_VT_PHASE_PHASE_HOOK_ENUM_H

namespace vt { namespace phase {

/**
* \enum PhaseHook
*
* \brief Points in phase execution at which triggered actions can be hooked
* into the \c PhaseManager
*/
enum struct PhaseHook : int8_t {
Start, /**< Before a phase starts */
End, /**< After a phase ends */
EndPostMigration /**< After a phase ends and all migrations have completed */
};

}} /* end namespace vt::phase */

#endif /*INCLUDED_VT_PHASE_PHASE_HOOK_ENUM_H*/
@@ -2,7 +2,7 @@
//@HEADER
// *****************************************************************************
//
// lbable.h
// phase_hook_id.h
// DARMA Toolkit v. 1.0.0
// DARMA/vt => Virtual Transport
//
@@ -42,50 +42,67 @@
//@HEADER
*/

#if !defined INCLUDED_VT_VRT_COLLECTION_BALANCE_PROXY_LBABLE_H
#define INCLUDED_VT_VRT_COLLECTION_BALANCE_PROXY_LBABLE_H
#if !defined INCLUDED_VT_PHASE_PHASE_HOOK_ID_H
#define INCLUDED_VT_PHASE_PHASE_HOOK_ID_H

#include "vt/config.h"
#include "vt/messaging/message/smart_ptr.h"
#include "vt/phase/phase_hook_enum.h"

#include <functional>
namespace vt { namespace phase {

namespace vt { namespace vrt { namespace collection {
// forward-decl for friendship
struct PhaseManager;

template <typename ColT, typename IndexT, typename BaseProxyT>
struct LBable : BaseProxyT {
using FinishedLBType = std::function<void()>;
/**
* \struct PhaseHookID
*
* \brief Identifies a registered phase hook so that it can later be unregistered.
*/
struct PhaseHookID {

LBable() = default;
LBable(
typename BaseProxyT::ProxyType const& in_proxy,
typename BaseProxyT::ElementProxyType const& in_elm
);
private:
/**
* \internal
* \brief Used by the system to create a new phase hook ID
*
* \param[in] in_type the type of hook
* \param[in] in_id the registered ID
* \param[in] in_is_collective whether the hook is collective
*/
PhaseHookID(PhaseHook in_type, std::size_t in_id, bool in_is_collective)
: type_(in_type),
id_(in_id),
is_collective_(in_is_collective)
{ }

template <typename SerializerT>
void serialize(SerializerT& s);
friend struct PhaseManager;

template <
typename MsgT, ActiveColMemberTypedFnType<MsgT,ColT> f, typename... Args
>
void LBsync(Args&&... args) const;
template <typename MsgT, ActiveColMemberTypedFnType<MsgT,ColT> f>
void LBsync(MsgT* msg, PhaseType p = no_lb_phase) const;
template <typename MsgT, ActiveColMemberTypedFnType<MsgT,ColT> f>
void LBsync(MsgSharedPtr<MsgT> msg, PhaseType p = no_lb_phase) const;
void LBsync(FinishedLBType cont, PhaseType p = no_lb_phase) const;
public:
/**
* \brief Get the type of hook
*
* \return the type of hook
*/
PhaseHook getType() const { return type_; }

template <
typename MsgT, ActiveColMemberTypedFnType<MsgT,ColT> f, typename... Args
>
void LB(Args&&... args) const;
template <typename MsgT, ActiveColMemberTypedFnType<MsgT,ColT> f>
void LB(MsgT* msg, PhaseType p = no_lb_phase) const;
template <typename MsgT, ActiveColMemberTypedFnType<MsgT,ColT> f>
void LB(MsgSharedPtr<MsgT> msg, PhaseType p = no_lb_phase) const;
void LB(FinishedLBType cont, PhaseType p = no_lb_phase) const;
/**
* \brief Get the ID of the registered hook
*
* \return the registered hook ID
*/
std::size_t getID() const { return id_; }

/**
* \brief Get whether the hook is collective or not
*
* \return whether it is collective
*/
bool getIsCollective() const { return is_collective_; }

private:
PhaseHook type_;
std::size_t id_ = 0;
bool is_collective_ = false;
};

}}} /* end namespace vt::vrt::collection */
}} /* end namespace vt::phase */

#endif /*INCLUDED_VT_VRT_COLLECTION_BALANCE_PROXY_LBABLE_H*/
#endif /*INCLUDED_VT_PHASE_PHASE_HOOK_ID_H*/
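
`PhaseHookID` keeps its constructor private and befriends `PhaseManager`, so only the manager can create IDs while any caller can read them. The sketch below reproduces that pattern in isolation. `ToyHookID`, `ToyManager`, `registerCollective`, and `registerNonCollective` are hypothetical names, and the use of separate counters for the two kinds of hook is an assumption made for illustration, not something the diff confirms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

enum struct PhaseHook : std::int8_t { Start, End, EndPostMigration };

struct ToyManager; // forward declaration for friendship

// Read-only handle with a private constructor, modeled on PhaseHookID.
struct ToyHookID {
  PhaseHook getType() const { return type_; }
  std::size_t getID() const { return id_; }
  bool getIsCollective() const { return is_collective_; }

private:
  ToyHookID(PhaseHook in_type, std::size_t in_id, bool in_is_collective)
    : type_(in_type), id_(in_id), is_collective_(in_is_collective)
  { }

  friend struct ToyManager; // the only type allowed to create handles

  PhaseHook type_;
  std::size_t id_ = 0;
  bool is_collective_ = false;
};

// Hypothetical manager that hands out IDs (separate counters are assumed).
struct ToyManager {
  ToyHookID registerCollective(PhaseHook type) {
    return ToyHookID(type, next_collective_id_++, true);
  }
  ToyHookID registerNonCollective(PhaseHook type) {
    return ToyHookID(type, next_local_id_++, false);
  }

private:
  std::size_t next_collective_id_ = 0;
  std::size_t next_local_id_ = 0;
};

// Code outside the manager cannot construct a handle, but can copy one.
static_assert(
  !std::is_constructible<ToyHookID, PhaseHook, std::size_t, bool>::value,
  "handles cannot be created outside the manager"
);
static_assert(
  std::is_copy_constructible<ToyHookID>::value, "handles are copyable"
);
```

Because the handle is the only way to name a registration, a caller cannot unregister a hook it never registered unless it was explicitly given the handle.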