Record comms input and output tensor information (#1014)
Summary:
Pull Request resolved: #1014

Copy the input and output tensor information from the NCCL metadata into the trace output. This allows easier analysis of kernel memory access patterns in downstream tools.

Reviewed By: sraikund16

Differential Revision: D65785010

fbshipit-source-id: 862fc3df3ea50a3a9d8f2f693261ef043c3b35a7
sanrise authored and facebook-github-bot committed Nov 15, 2024
1 parent 338140f commit d9673d2
Showing 1 changed file with 20 additions and 0 deletions.
20 changes: 20 additions & 0 deletions libkineto/src/output_json.cpp
@@ -38,6 +38,8 @@ static constexpr const char* kOutSplit = "Out split size";
static constexpr const char* kProcessGroupName = "Process Group Name";
static constexpr const char* kProcessGroupDesc = "Process Group Description";
static constexpr const char* kGroupRanks = "Process Group Ranks";
static constexpr const char* kInTensorsStart = "Input Tensors start";
static constexpr const char* kOutTensorsStart = "Output Tensors start";
static constexpr const char* kRank = "Rank";
static constexpr const char* kP2pSrc = "Src Rank";
static constexpr const char* kP2pDst = "Dst Rank";
@@ -419,6 +421,24 @@ void ChromeTraceLogger::handleActivity(const libkineto::ITraceActivity& op) {
            kDtype,
            dtype));
      }
      const auto& input_tensor_starts =
          collectiveRecord->getMetadataValue(kInTensorsStart);
      const auto output_tensor_starts =
          collectiveRecord->getMetadataValue(kOutTensorsStart);
      if (!input_tensor_starts.empty()) {
        if (!arg_values.empty()) {
          arg_values.append(",");
        }
        arg_values.append(
            fmt::format(" \"{}\": {}", kInTensorsStart, input_tensor_starts));
      }
      if (!output_tensor_starts.empty()) {
        if (!arg_values.empty()) {
          arg_values.append(",");
        }
        arg_values.append(
            fmt::format(" \"{}\": {}", kOutTensorsStart, output_tensor_starts));
      }
      // In/out split size are valid for all_to_all
      const auto& inSplitSize = collectiveRecord->getMetadataValue(kInSplit);
      const auto& outSplitSize = collectiveRecord->getMetadataValue(kOutSplit);
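
The append pattern in the hunk above can be illustrated in isolation. The following is a minimal standalone sketch, not the libkineto implementation: `appendArg` and `buildTensorArgs` are hypothetical helpers introduced here for illustration, and plain `std::string` concatenation stands in for `fmt::format`. As in the diff, the value is inserted unquoted, so the sketch assumes the metadata string is already valid JSON text.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of libkineto): appends ` "key": value` to a
// comma-separated JSON argument fragment. An empty value means the metadata
// entry is absent, so nothing is emitted. The value is inserted verbatim and
// unquoted, so it is assumed to already be valid JSON text.
void appendArg(
    std::string& argValues,
    const std::string& key,
    const std::string& value) {
  assert(!key.empty());
  if (value.empty()) {
    return;
  }
  if (!argValues.empty()) {
    argValues.append(","); // separate from the previous entry
  }
  argValues.append(" \"" + key + "\": " + value);
}

// Hypothetical wrapper: builds the fragment for both tensor-start entries,
// using the same key strings the commit adds as constants.
std::string buildTensorArgs(
    const std::string& inStarts,
    const std::string& outStarts) {
  std::string argValues;
  appendArg(argValues, "Input Tensors start", inStarts);
  appendArg(argValues, "Output Tensors start", outStarts);
  return argValues;
}
```

Factoring the separator check into one helper is a design choice of this sketch only; the commit itself repeats the check inline for each key, matching the style of the surrounding code.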