This repository has been archived by the owner on Apr 28, 2023. It is now read-only.

do not demote from shared before promoting to register
When a reference group promoted to registers covered exactly the same
accesses as another group already promoted to shared memory, the
shared-memory group was demoted to save shared memory space.  However,
this had adverse effects: the global->shared copy is performed at the
beginning of the block, while the global->register copy is performed
deeper in the tree, which leaves no room to hide the latency of the
loads.  Keep the group promoted to shared memory and perform a
shared->register copy instead.
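
The decision described above can be modelled in a few lines.  This is
only a sketch under stated assumptions, not the project's code: plain
std::set stands in for the isl sets used by the real implementation,
and the names Instance, InstanceSet, subtractIsEmpty,
RegisterCopySource, sourceBefore and sourceAfter are all hypothetical.
It also assumes that, whenever the shared-memory group is kept, the
register copy reads from shared memory.

```cpp
#include <algorithm>
#include <set>
#include <utility>

// Hypothetical stand-in for a set of statement instances; the real code
// uses isl sets.  Each instance is modelled as an (i, j) pair.
using Instance = std::pair<int, int>;
using InstanceSet = std::set<Instance>;

// Hypothetical model of `a.subtract(b).is_empty()`: true when every
// instance in `a` also belongs to `b`.
bool subtractIsEmpty(const InstanceSet& a, const InstanceSet& b) {
  // std::includes checks that the sorted range `a` is contained in `b`.
  return std::includes(b.begin(), b.end(), a.begin(), a.end());
}

// Where the register copy reads its data from.
enum class RegisterCopySource { Global, Shared };

// Behaviour before this commit (as modelled here): if the register scope
// covers all instances of the shared-memory group, that group is demoted
// and the register copy falls back to global memory.
RegisterCopySource sourceBefore(
    const InstanceSet& sharedInstances,
    const InstanceSet& registerInstances) {
  return subtractIsEmpty(sharedInstances, registerInstances)
      ? RegisterCopySource::Global
      : RegisterCopySource::Shared;
}

// Behaviour after this commit (as modelled here): the shared-memory group
// is never demoted, so the register copy always reads from shared memory.
RegisterCopySource sourceAfter(
    const InstanceSet& /*sharedInstances*/,
    const InstanceSet& /*registerInstances*/) {
  return RegisterCopySource::Shared;
}
```

In this model the two functions disagree only when the register scope
covers every instance of the shared-memory group, which is the case
the commit targets.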

An alternative solution would be to decrease the promotion scope depth
for register promotion.  This would require ensuring that loops whose
indices appear in subscripts of "register" arrays are fully unrolled,
so that the elements of those arrays are effectively mapped to
registers.  Since unrolling is expensive in compilation time and is
exposed to the autotuner, we would prefer to also expose the register
promotion depth in the future.
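
The unrolling requirement can be illustrated with a small host-side
sketch.  It is not taken from the project and is not what the code
generator emits; the names dot and dotUnrolled are hypothetical, and
C++17 is assumed for the fold expression.  The point is only that,
once the loop over a small local array is fully unrolled, every
subscript is a compile-time constant, which is what lets a compiler
keep the elements in registers rather than in addressable memory.
Whether it actually does so remains the compiler's decision.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical helper: the index pack I... plays the role of a fully
// unrolled loop, so `regs` is only ever indexed by compile-time constants.
template <std::size_t N, std::size_t... I>
float dotUnrolled(
    const std::array<float, N>& a,
    const std::array<float, N>& b,
    std::index_sequence<I...>) {
  static_assert(N > 0, "sketch assumes a non-empty array");
  static_assert(sizeof...(I) == N, "one index per element");
  // Local "register" array, filled element by element with constant indices.
  float regs[N] = {a[I]...};
  // C++17 fold expression: expands to regs[0] * b[0] + regs[1] * b[1] + ...
  return (0.0f + ... + (regs[I] * b[I]));
}

// Hypothetical entry point: generates the indices 0..N-1 at compile time.
template <std::size_t N>
float dot(const std::array<float, N>& a, const std::array<float, N>& b) {
  return dotUnrolled(a, b, std::make_index_sequence<N>{});
}
```

A runtime loop `for (i = 0; i < n; ++i) regs[i] ...` over the same
array would instead need an addressable array unless the compiler
unrolls it, which is the compilation-time cost mentioned above.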
ftynse committed Mar 26, 2018
1 parent 814d33f commit 44a7708
Showing 1 changed file with 0 additions and 27 deletions.
27 changes: 0 additions & 27 deletions src/core/polyhedral/memory_promotion_heuristic.cc
@@ -626,21 +626,6 @@ void promoteToRegistersBelowThreads(
       }
     }
 
-    // Compute the set of active points without constraints introduced by
-    // thread mapping.
-    auto mappingTree = band;
-    while (mappingTree &&
-           !mappingTree->elemAs<ScheduleTreeElemMappingFilter>()) {
-      mappingTree = mappingTree->ancestor(scop.scheduleRoot(), 1);
-    }
-    CHECK(mappingTree);
-    auto mappingElem = mappingTree->elemAs<ScheduleTreeElemMappingFilter>();
-    auto pointsNoThreadMapping = points.gist(mappingElem->filter_);
-    for (size_t j = 0; j < mapping::ThreadId::kMaxDim; ++j) {
-      pointsNoThreadMapping = projectOutNamedParam(
-          pointsNoThreadMapping, mapping::ThreadId::makeId(j));
-    }
-
     auto groupMap = TensorReferenceGroup::accessedBySubtree(band, scop);
     for (auto& tensorGroups : groupMap) {
       auto tensorId = tensorGroups.first;
@@ -665,18 +650,6 @@ void promoteToRegistersBelowThreads(
           continue;
         }
 
-        // If a group of references was promoted into shared memory, but it
-        // could be also promoted to registers while covering exactly the
-        // same statement instances accessing it, demote it from shared
-        // memory first.
-        auto outerScopePromotions = scop.activePromotions(points, tensorId);
-        if (outerScopePromotions.size() == 1 &&
-            outerScopePromotions[0]
-                .first.subtract(pointsNoThreadMapping)
-                .is_empty()) {
-          scop.demoteGroup(outerScopePromotions[0].second.groupId);
-        }
-
         scop.promoteGroup(
             Scop::PromotedDecl::Kind::Register,
             tensorId,