Fin Perf Eval handles (reboot) #1548

Merged
merged 115 commits into from
Jun 22, 2022
a30ec12
function for retrieving solutions
cderb Dec 2, 2021
6a2e3cc
redefine tunable test arount GetSolution
cderb Dec 2, 2021
f4d819e
getallconfigs function
cderb Dec 7, 2021
ea8f6cf
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_com…
cderb Jan 6, 2022
7b778a8
remove inline operators for perf Configs
JehandadKhan Jan 6, 2022
a796d83
fix compilation error, allow for calls to GetAllSolvers
cderb Jan 7, 2022
5f8dee1
rename function
cderb Jan 22, 2022
b61ef31
legacy solver boolean, getallsolutions mod, performance parameter getter
cderb Feb 3, 2022
f9d855e
debug prints for perf cfg string function, added db_update flag to ex…
cderb Feb 16, 2022
1bdd262
Add base class for searchable solvers
averinevg Feb 17, 2022
935b7cc
Fix formatting
averinevg Feb 17, 2022
f7f0a92
Merge branch 'develop' into searchable_solver_base
averinevg Feb 17, 2022
62ca5bd
Remove redundant 'virtual'
averinevg Feb 18, 2022
68a85f5
Remove default arguments on override methods
averinevg Feb 18, 2022
c5f7d14
Fix formatting
averinevg Feb 18, 2022
ea39f5e
Fix formatting
averinevg Feb 18, 2022
e6ab90b
Fix formatting
averinevg Feb 18, 2022
654bf37
Merge branch 'develop' into searchable_solver_base
averinevg Feb 22, 2022
8be64c2
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_com…
cderb Feb 24, 2022
35a79fe
remove comments
cderb Feb 25, 2022
a81e034
spacing
cderb Feb 25, 2022
d3e2345
formatting
cderb Feb 25, 2022
cbccc05
update call to SolverDbId
cderb Mar 1, 2022
7db661d
add forward declaration of PerformanceConfigConvOclBwdWrw2 template v…
cderb Mar 1, 2022
c8ef874
forward declaration of structs
cderb Mar 3, 2022
4586def
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_com…
cderb Mar 3, 2022
df156ae
revert movement of operator override
cderb Mar 4, 2022
de09e34
ocl tidy
cderb Mar 4, 2022
6942f72
clang-format
cderb Mar 4, 2022
3844958
Squashed 'fin/' changes from 53d2563fe..a30a51bc6
cderb Mar 8, 2022
c8a4d82
fin changes for perf eval
cderb Mar 8, 2022
885cbae
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_com…
cderb Mar 10, 2022
f3aa555
tidy
cderb Mar 11, 2022
48ca257
tidy
cderb Mar 14, 2022
2b4f9bc
tidy
cderb Mar 15, 2022
27fd794
format
cderb Mar 15, 2022
d02f581
revert IsTunable definition, hide definition of get perf config for n…
cderb Mar 15, 2022
b79456e
Merge branch 'develop' into searchable_solver_base
averinevg Mar 17, 2022
d4054e0
Merge branch 'develop' into cderb/fin_perf_compile
cderb Mar 22, 2022
60e33bb
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_com…
cderb Mar 22, 2022
0790521
Merge branch 'develop' into searchable_solver_base
averinevg Mar 29, 2022
917a96d
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_com…
cderb Apr 6, 2022
62893b1
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_com…
cderb Apr 7, 2022
b700918
spell fix
cderb Apr 8, 2022
fa91f56
Merge branch 'develop' into searchable_solver_base
averinevg Apr 13, 2022
985e31e
Merge branch 'develop' into searchable_solver_base
averinevg Apr 20, 2022
626f182
ConvAsm1x1U and ConvBiasActivAsm1x1U are inherited from SearchableSol…
averinevg Apr 20, 2022
9ce45e6
fix formatting
averinevg Apr 20, 2022
c36f85a
Fix performance config for ConvBiasActivAsm1x1U
averinevg Apr 20, 2022
588a98e
Fix warnings
averinevg Apr 20, 2022
ca60846
Merge branch 'develop' into searchable_solver_base
averinevg Apr 22, 2022
f900d17
Remove disableConfigOverrideFromEnv
averinevg Apr 22, 2022
c965113
Fix formatting
averinevg Apr 22, 2022
0a5e028
Rename base class for tunable solvers
averinevg Apr 24, 2022
5047b6b
Rename solver methods
averinevg Apr 24, 2022
5a24aa2
Fix formatting
averinevg Apr 24, 2022
8101962
ConvOclBwdWrW2NonTunable: add using to suppress warning
averinevg Apr 24, 2022
aa65bea
Fix FindSolution and GenericSearch
averinevg Apr 24, 2022
a6a8ddd
Fix formatting
averinevg Apr 25, 2022
ff6d73c
Fix AnySolver
averinevg Apr 25, 2022
614fec1
Fix FindSolution
averinevg Apr 25, 2022
201f5f1
Fix build errors
averinevg Apr 25, 2022
506b678
Fix formatting
averinevg Apr 25, 2022
c817a3a
Load config from env: keep code blocks to hide inner objects
averinevg Apr 26, 2022
1c84bf3
Fix formatting
averinevg Apr 26, 2022
12fe42b
Merge branch 'develop' into cderb/fin_perf_compile
cderb May 2, 2022
f3e37b3
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_com…
cderb May 2, 2022
0af9a40
merge and conflict resolution
cderb May 2, 2022
f4dcb6a
Merge branch 'develop' into searchable_solver_base
averinevg May 5, 2022
6c759b8
Remove CTS suffix from function names
averinevg May 6, 2022
a1b58f9
Use ConvTunableSolver as a base class for ConvOclBwdWrW2
averinevg May 6, 2022
de6b5e7
Remove redundant comments
averinevg May 6, 2022
c1e7a53
Mark tunable solvers as final
averinevg May 6, 2022
19d1321
Merge branch 'develop' into searchable_solver_base
averinevg May 11, 2022
0ee2510
Revert code move
averinevg May 11, 2022
d00883a
Revert code move
averinevg May 11, 2022
04ee25b
Merge remote-tracking branch 'origin/searchable_solver_base' into cde…
cderb May 16, 2022
0354cf4
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_sol…
cderb May 16, 2022
88bc334
mergefix
cderb May 16, 2022
034bbb7
format
cderb May 16, 2022
eae32e3
Squashed 'fin/' changes from a30a51bc6..722feea66
cderb May 16, 2022
eb9c1c2
Merge commit 'eae32e3e810a478c3f5205fb0a00df322f2e4a24' into cderb/fi…
cderb May 16, 2022
85c1469
Merge branch 'cderb/fin_perf_solver_mergres' of https://github.com/RO…
cderb May 16, 2022
98cca15
format
cderb May 16, 2022
c56cf5e
comment unknown function
cderb May 17, 2022
0a88708
revert fin changes
cderb May 18, 2022
5dc2bc3
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_sol…
cderb May 18, 2022
52795e0
name change
cderb May 18, 2022
618d8e2
Merge branch 'develop' into cderb/fin_perf_solver_mergres2
cderb May 18, 2022
fc970d5
assert for make analyze
cderb May 18, 2022
b9cdafa
fix
cderb May 18, 2022
5c831b2
move assert
cderb May 18, 2022
9249f38
try ignore warning
cderb May 18, 2022
478d6cb
try to fix clang with conditional
cderb May 19, 2022
9caa69b
format
cderb May 19, 2022
efa967a
specify type
cderb May 19, 2022
c28e5a9
try fix for clang
cderb May 19, 2022
17f4cef
format
cderb May 19, 2022
2fe6369
revert igemm clang fix
cderb May 19, 2022
0e9ca20
move declarations, re-add inlines
cderb May 20, 2022
3cb8a4f
format
cderb May 20, 2022
239b866
fix
cderb May 20, 2022
587b4cd
function rename
cderb May 23, 2022
e1a2a19
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_sol…
cderb May 26, 2022
e324cb2
Revert "move declarations, re-add inlines"
cderb May 31, 2022
9fb6719
cleanup
cderb May 31, 2022
d994195
Update src/include/miopen/generic_search.hpp
cderb Jun 1, 2022
cbdf4b3
fix clang format
junliume Jun 1, 2022
9563a5e
change to error throw, remove if clause
cderb Jun 8, 2022
2868d28
static analyzer ignores
cderb Jun 8, 2022
80630de
Update src/include/miopen/any_solver.hpp
cderb Jun 8, 2022
8daecf0
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_sol…
cderb Jun 8, 2022
59d1ec0
Merge branch 'cderb/fin_perf_solver_mergres2' of https://github.com/R…
cderb Jun 8, 2022
a5499d9
replace nolint with throws
cderb Jun 9, 2022
5fcf352
Merge remote-tracking branch 'origin/develop' into cderb/fin_perf_sol…
cderb Jun 22, 2022
2 changes: 1 addition & 1 deletion src/binary_cache.cpp
@@ -169,7 +169,7 @@ std::string LoadBinary(const TargetProperties& target,
auto record = db.FindRecord(cfg);
if(record)
{
MIOPEN_LOG_I2("Sucessfully loaded binary for: " << verbose_name << "; args: " << args);
MIOPEN_LOG_I2("Successfully loaded binary for: " << verbose_name << "; args: " << args);
return record.get();
}
else
123 changes: 112 additions & 11 deletions src/include/miopen/any_solver.hpp
@@ -31,6 +31,8 @@
#include <miopen/find_solution.hpp>
#include <miopen/mlo_internal.hpp>

#include <miopen/generic_search.hpp>

#include <cassert>
#include <memory>
#include <typeinfo>
@@ -61,6 +63,11 @@ struct AnySolver
assert(ptr_value != nullptr);
return ptr_value->TestSysDbRecord(record);
};
std::vector<ConvSolution> GetAllSolutions(const ConvolutionContext& ctx) const
{
assert(ptr_value != nullptr);
return ptr_value->GetAllSolutions(ctx);
};
bool IsDynamic() const
{
assert(ptr_value != nullptr);
@@ -84,6 +91,11 @@ struct AnySolver
assert(ptr_value != nullptr);
return ptr_value->FindSolution(ctx, db, invoke_ctx);
};
std::string GetPerfCfgParams(const ConvolutionContext& ctx, Db& db) const
{
assert(ptr_value != nullptr);
return ptr_value->GetPerfCfgParams(ctx, db);
};
std::string GetSolverDbId() const
{
assert(ptr_value != nullptr);
@@ -108,18 +120,20 @@ struct AnySolver
using ptr = std::shared_ptr<const AnySolver_base>;

virtual ~AnySolver_base(){};
virtual bool IsApplicable(const ConvolutionContext& ctx) const = 0;
virtual bool IsTunable() const = 0;
virtual bool TestSysDbRecord(const DbRecord& record) const = 0;
virtual bool IsDynamic() const = 0;
virtual float GetWti(const ConvolutionContext& ctx) const = 0;
virtual const std::type_info& Type() const = 0;
virtual std::string GetSolverDbId() const = 0;
virtual bool IsApplicable(const ConvolutionContext& ctx) const = 0;
virtual bool IsTunable() const = 0;
virtual bool TestSysDbRecord(const DbRecord& record) const = 0;
virtual std::vector<ConvSolution> GetAllSolutions(const ConvolutionContext& ctx) const = 0;
virtual bool IsDynamic() const = 0;
virtual float GetWti(const ConvolutionContext& ctx) const = 0;
virtual const std::type_info& Type() const = 0;
virtual std::string GetSolverDbId() const = 0;
virtual ConvSolution FindSolution(const ConvolutionContext& ctx,
Db& db,
const miopen::AnyInvokeParams& invoke_ctx) const = 0;
virtual size_t GetWorkspaceSize(const ConvolutionContext& ctx) const = 0;
virtual bool MayNeedWorkspace() const = 0;
const miopen::AnyInvokeParams& invoke_ctx) const = 0;
virtual std::string GetPerfCfgParams(const ConvolutionContext& ctx, Db& db) const = 0;
virtual size_t GetWorkspaceSize(const ConvolutionContext& ctx) const = 0;
virtual bool MayNeedWorkspace() const = 0;
};

// templated derived class
@@ -140,6 +154,22 @@ struct AnySolver
static constexpr bool Is = type::value;
};

struct LegacySolver
{
template <typename U>
static constexpr auto Test(U*) ->
typename std::is_same<ConvSolution,
decltype(std::declval<U>().GetSolution(
std::declval<const ConvolutionContext&>(),
std::declval<const LegacyPerformanceConfig&>()))>::type;

template <typename U>
static constexpr std::false_type Test(...);

using type = decltype(Test<T>(nullptr));
static constexpr bool Is = type::value;
};

bool TestSysDbRecord(const DbRecord& record, std::true_type) const
{
using PerformanceConfig = decltype(
@@ -149,7 +179,7 @@ struct AnySolver
}
bool TestSysDbRecord(const DbRecord& record, std::false_type) const
{
(void)(record);
std::ignore = record;
return false;
}

@@ -158,6 +188,43 @@ struct AnySolver
return TestSysDbRecord(record, std::integral_constant<bool, TunableSolver::Is>());
}

// tunable legacy solver
std::vector<ConvSolution>
GetAllSolutions(const ConvolutionContext&, std::true_type, std::true_type) const
{
MIOPEN_THROW("No solutions returned for Legacy Solvers.");
}

// tunable solver, not legacy
std::vector<ConvSolution>
GetAllSolutions(const ConvolutionContext& ctx, std::true_type, std::false_type) const
{
return miopen::solver::GetAllSolutions(value, ctx);
}

// non tunable solver
std::vector<ConvSolution>
GetAllSolutions(const ConvolutionContext& ctx, std::false_type, std::true_type) const
{
std::vector<ConvSolution> solutions;
solutions.push_back(value.GetSolution(ctx));
return solutions;
}
std::vector<ConvSolution>
GetAllSolutions(const ConvolutionContext& ctx, std::false_type, std::false_type) const
{
std::vector<ConvSolution> solutions;
solutions.push_back(value.GetSolution(ctx));
return solutions;
}

std::vector<ConvSolution> GetAllSolutions(const ConvolutionContext& ctx) const override
{
return GetAllSolutions(ctx,
std::integral_constant<bool, TunableSolver::Is>(),
std::integral_constant<bool, LegacySolver::Is>());
}

AnySolver_tmpl(T obj) : value(std::move(obj)){};
bool IsApplicable(const ConvolutionContext& ctx) const override
{
@@ -166,12 +233,46 @@ struct AnySolver
bool IsTunable() const override { return TunableSolver::Is; }
bool IsDynamic() const override { return value.IsDynamic(); }
float GetWti(const ConvolutionContext& ctx) const override { return value.GetWti(ctx); }

ConvSolution FindSolution(const ConvolutionContext& ctx,
Db& db,
const miopen::AnyInvokeParams& invoke_ctx) const override
{
return miopen::solver::FindSolution(value, ctx, db, invoke_ctx);
};

std::string GetPerfCfgParams(const ConvolutionContext& ctx, Db& db, std::true_type) const
Contributor

The body of this function looks like a modified copy of very similar code from FindSolutionImpl (src/include/miopen/find_solution.hpp). Can we generalize this?

Contributor Author

@cderb, Jun 7, 2022

The objective of this function is to return the parameter string from the performance config. I see the portion that is common here, and most likely it served as the template for this addition. The two are similar in that they:

  1. use the same type declaration
  2. load from the db
  3. check validity of the config loaded
  4. do a function x with the loaded config
  5. log progress

The main difference is (4). The way to minimize this code would be to create an additional function that takes a callback in place of (4).
Disregarding log statements and spacing, there are a total of 4 shared lines, so I'm not sure it's worth complicating with a callback.
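
For readers weighing that trade-off, the callback variant could look roughly like the sketch below. It is not MIOpen code: `ToyConfig`, `ToyDb`, `ToySolver`, `WithLoadedConfig`, and `GetParamsString` are hypothetical stand-ins invented for illustration, and the logging uses `std::clog` rather than MIOpen's macros. The shared steps (declare, load, validate, log) live in one helper, and step (4) is passed in by the caller.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <optional>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for illustration only; none of these are MIOpen types.
struct ToyConfig
{
    int tile = 0;
    std::string Serialize() const { return std::to_string(tile); }
};

struct ToyDb
{
    std::map<std::string, ToyConfig> records;

    // Copies the stored record into `out`; returns false when the id is unknown.
    bool Load(const std::string& id, ToyConfig& out) const
    {
        const auto it = records.find(id);
        if(it == records.end())
            return false;
        out = it->second;
        return true;
    }
};

struct ToySolver
{
    std::string SolverDbId() const { return "ToySolver"; }
    bool IsValidPerformanceConfig(const ToyConfig& c) const { return c.tile > 0; }
};

// Steps 1-3 and 5 are written once here; step 4 is the caller's `on_valid`.
// Returns std::nullopt when nothing usable was loaded.
template <class Solver, class Db, class Callback>
auto WithLoadedConfig(const Solver& s, const Db& db, Callback on_valid)
    -> std::optional<std::invoke_result_t<Callback, const ToyConfig&>>
{
    ToyConfig config{};
    if(!db.Load(s.SolverDbId(), config))
    {
        std::clog << "Perf Db: Failed Loading: " << s.SolverDbId() << '\n';
        return std::nullopt;
    }
    if(!s.IsValidPerformanceConfig(config))
    {
        std::clog << "Perf Db: Invalid Config: " << s.SolverDbId() << '\n';
        return std::nullopt;
    }
    return on_valid(config);
}

// One possible caller: the serialized parameters, or "" on any failure.
inline std::string GetParamsString(const ToySolver& s, const ToyDb& db)
{
    const auto result =
        WithLoadedConfig(s, db, [](const ToyConfig& c) { return c.Serialize(); });
    return result.value_or("");
}
```

A second caller could pass a lambda that builds a solution from the config instead of serializing it. Whether that saving justifies the extra template is exactly the judgment call raised in the comment.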

Contributor

@cderb

> The objective of this function is to return the parameter string from the performance config

Excuse me, but AFAICS this function is not used, so it seems like it does not have an objective expressed in the code ;)

Which is not surprising, because from the architectural point of view, getting tuning config from the database is not a Solver's job.

I think that this method should be removed from AnySolver.

Contributor Author

If we remove this function, I'll need a way to retrieve the serialized performance parameters. Could be the winner from the solver or from a solution. I'm searching for another method, but nothing yet.

Contributor

Can we add this function to a PR that will actually use it? This will allow us to expedite the review of this one.

Contributor

I see. If you know SolverId, then it is possible to construct an instance of the Solver and then follow the design pattern of:
https://github.com/ROCmSoftwarePlatform/MIOpen/blob/72d89b40ee97666efaac0a0979a48680bd02f1da/src/include/miopen/find_solution.hpp#L123-L125
I can do this for you in a future PR where you will implement Fin changes. Or you can ask @DrizztDoUrden to assist.

But let's avoid adding unused code in this PR. Is it acceptable for you?

Contributor Author

I think the key issue is that calling Serialize on a PerformanceConfig requires access to the fully instantiated type, which is hidden behind AnySolver. There needs to be some way of extracting the performance parameter string so that Fin can record the pdb entry data.
A function producing the performance parameter string is necessary for the current implementation of performance tuning in Fin to be complete.
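
The constraint described here can be shown with a small type-erasure sketch. It is a simplified illustration, not MIOpen's `AnySolver`: `ConfigA`, `SolverA`, `ErasedSolver`, and `DefaultConfigString` are hypothetical names. The point is that only the templated inner class knows the concrete config type, so any string-producing function has to be exposed through the virtual interface.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical types for illustration; not MIOpen's actual classes.
struct ConfigA
{
    int x = 1;
    int y = 2;
    void Serialize(std::ostream& os) const { os << x << ',' << y; }
};

struct SolverA
{
    ConfigA GetDefaultPerformanceConfig() const { return {}; }
};

class ErasedSolver
{
    // Callers see only this interface; the config type never appears in it.
    struct Base
    {
        virtual ~Base() = default;
        virtual std::string DefaultConfigString() const = 0;
    };

    template <class T>
    struct Model final : Base
    {
        explicit Model(T v) : value(std::move(v)) {}

        std::string DefaultConfigString() const override
        {
            // The concrete config type is known only inside this template.
            using PerformanceConfig = decltype(value.GetDefaultPerformanceConfig());
            const PerformanceConfig config = value.GetDefaultPerformanceConfig();
            std::ostringstream ss;
            config.Serialize(ss);
            return ss.str();
        }

        T value;
    };

    std::shared_ptr<const Base> ptr;

public:
    template <class T>
    explicit ErasedSolver(T v) : ptr(std::make_shared<const Model<T>>(std::move(v)))
    {
    }

    std::string DefaultConfigString() const { return ptr->DefaultConfigString(); }
};
```

Code holding an `ErasedSolver` can obtain the string without ever naming `ConfigA`. The sketch serializes a default config only; where a config loaded from a database should come from is the architectural question the reviewers are debating.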

Contributor

Please send me proposed Fin changes (where you call GetPerfCfgParams) and I will offer a substitute. This is necessary, because

> from the architectural point of view, getting tuning config from the database is not a Solver's job.

Contributor Author

I'll email the code section to your amd address. Do you still have access?

Contributor

@cderb No, unfortunately. But you can send me mail to DXC (please check your amd inbox).

{
using PerformanceConfig = decltype(value.GetDefaultPerformanceConfig(ctx));
PerformanceConfig config{};
if(db.Load(ctx, value.SolverDbId(), config))
{
MIOPEN_LOG_I2("Perf Db: Record Loaded: " << value.SolverDbId());
if(value.IsValidPerformanceConfig(ctx, config))
{
std::ostringstream ss;
config.Serialize(ss);
return ss.str();
}
MIOPEN_LOG_I2("Perf Db: Invalid Config: " << value.SolverDbId());
}
MIOPEN_LOG_I2("Perf Db: Failed Loading: " << value.SolverDbId());
return "";
}
std::string
GetPerfCfgParams(const ConvolutionContext& ctx, const Db& db, std::false_type) const
{
MIOPEN_LOG_I2("Perf Db: No Config: " << value.SolverDbId());
std::ignore = ctx;
std::ignore = db;
return "";
}

std::string GetPerfCfgParams(const ConvolutionContext& ctx, Db& db) const override
{
return GetPerfCfgParams(ctx, db, std::integral_constant<bool, TunableSolver::Is>());
}

size_t GetWorkspaceSize(const ConvolutionContext& ctx) const override
{
return value.GetWorkspaceSize(ctx);
1 change: 1 addition & 0 deletions src/include/miopen/execution_context.hpp
@@ -67,6 +67,7 @@ struct ExecutionContext
{
// Operation modes & environment
bool do_search = false;
bool db_update = false;
Contributor

This is only initialized but not updated anywhere. Can we make this change when it's really needed (thus following the YAGNI principle)?

Contributor Author

It's used in fin, at conv_fin.hpp:559

Contributor

Yes, but the problem is that this PR does NOT include changes in conv_fin.hpp.

Contributor Author

This could be resolved by changes in #1588

bool save_srch_req = false;
bool use_asm_kernels = false;
bool use_hip_kernels = true;
3 changes: 2 additions & 1 deletion src/include/miopen/find_solution.hpp
@@ -63,7 +63,8 @@ auto FindSolutionImpl(
}
else
{
if((context.do_search || enforce.IsSearch(context)) && enforce.IsDbUpdate(context))
if((context.do_search || enforce.IsSearch(context)) &&
(context.db_update || enforce.IsDbUpdate(context)))
{
MIOPEN_LOG_W("Perf Db: load skipped: " << s.SolverDbId() << ", enforce: " << enforce);
}
49 changes: 39 additions & 10 deletions src/include/miopen/generic_search.hpp
@@ -296,6 +296,43 @@ using RunAndMeasure_t =
std::declval<ConvSolution>(),
std::declval<float&>()));

template <class Solver, class Context>
auto GetAllConfigs(const Solver s, const Context& context)
-> ComputedContainer<decltype(s.GetDefaultPerformanceConfig(context)), Context>
{
using PerformanceConfig = decltype(s.GetDefaultPerformanceConfig(context));

ComputedContainer<PerformanceConfig, Context> primary(context);
const int primary_size = std::distance(primary.begin(), primary.end());
ComputedContainer<PerformanceConfig, Context> spare(context, true);
const int spare_size = std::distance(spare.begin(), spare.end());
const bool useSpare = (primary_size == 0);

ComputedContainer<PerformanceConfig, Context> all_configs = useSpare ? spare : primary;
const int n_runs_total = useSpare ? spare_size : primary_size;
MIOPEN_LOG_W(s.SolverDbId() << ": Searching the best solution among " << n_runs_total
<< (useSpare ? " (spare)" : "") << "...");

return all_configs;
}

template <class Solver, class Context>
std::vector<ConvSolution> GetAllSolutions(const Solver s, const Context& context_)
{
auto context = context_;
context.is_for_generic_search = true;

auto all_configs = GetAllConfigs(s, context);

std::vector<ConvSolution> solutions;
for(const auto& current_config : all_configs)
{
ConvSolution current_solution = s.GetSolution(context, current_config);
solutions.push_back(current_solution);
}
return solutions;
}

template <class Solver, class Context>
auto GenericSearch(const Solver s, const Context& context_, const AnyInvokeParams& invoke_ctx_)
-> decltype(s.GetDefaultPerformanceConfig(context_))
@@ -320,16 +357,8 @@ auto GenericSearch(const Solver s, const Context& context_, const AnyInvokeParam
auto& profile_h = context.GetStream();
AutoEnableProfiling enableProfiling{profile_h};

const ComputedContainer<PerformanceConfig, Context> main(context);
const int main_size = std::distance(main.begin(), main.end());
const ComputedContainer<PerformanceConfig, Context> spare(context, true);
const int spare_size = std::distance(spare.begin(), spare.end());
const bool useSpare = (main_size == 0);

const ComputedContainer<PerformanceConfig, Context> all_configs = useSpare ? spare : main;
const int n_runs_total = useSpare ? spare_size : main_size;
MIOPEN_LOG_W(s.SolverDbId() << ": Searching the best solution among " << n_runs_total
<< (useSpare ? " (spare)" : "") << "...");
auto all_configs = GetAllConfigs(s, context);
const int n_runs_total = std::distance(all_configs.begin(), all_configs.end());

bool is_passed = false; // left false only if all iterations failed.
float best_time = std::numeric_limits<float>::max();
7 changes: 7 additions & 0 deletions src/include/miopen/solver.hpp
@@ -2436,6 +2436,12 @@ struct ConvOclBwdWrW2 : ConvTunableSolver<PerformanceConfigConvOclBwdWrw2<N_BATC
#pragma clang diagnostic ignored "-Wweak-template-vtables"
#endif

extern template struct PerformanceConfigConvOclBwdWrw2<1>;
extern template struct PerformanceConfigConvOclBwdWrw2<2>;
extern template struct PerformanceConfigConvOclBwdWrw2<4>;
extern template struct PerformanceConfigConvOclBwdWrw2<8>;
extern template struct PerformanceConfigConvOclBwdWrw2<16>;

extern template struct ConvOclBwdWrW2<1>;
extern template struct ConvOclBwdWrW2<2>;
extern template struct ConvOclBwdWrW2<4>;
@@ -2463,6 +2469,7 @@ struct ConvOclBwdWrW2NonTunable final : ConvOclBwdWrW2<1>
private:
// This function dervied from ConvOclBwdWrW2 is declared private
// so that this solver is not marked searchable/tunable.
using ConvOclBwdWrW2<1>::GetDefaultPerformanceConfig;
using ConvOclBwdWrW2<1>::GetSolution;
};

3 changes: 1 addition & 2 deletions src/solver/conv_asm_1x1u.cpp
@@ -225,8 +225,7 @@ PerformanceConfigConvAsm1x1U::PerformanceConfigConvAsm1x1U(int read_size_,
{
}

inline bool
PerformanceConfigConvAsm1x1U::operator==(const PerformanceConfigConvAsm1x1U& other) const
bool PerformanceConfigConvAsm1x1U::operator==(const PerformanceConfigConvAsm1x1U& other) const
{
// clang-format off
return read_size == other.read_size
2 changes: 1 addition & 1 deletion src/solver/conv_asm_1x1u_bias_activ.cpp
@@ -53,7 +53,7 @@ bool PerformanceConfigConvBiasActivAsm1x1U::IsValid(const ConvolutionContext& co
return PerformanceConfigConvAsm1x1U::IsValid(config);
}

inline bool PerformanceConfigConvBiasActivAsm1x1U::operator==(
bool PerformanceConfigConvBiasActivAsm1x1U::operator==(
const PerformanceConfigConvBiasActivAsm1x1U& other) const
{
// clang-format off
3 changes: 1 addition & 2 deletions src/solver/conv_asm_1x1u_stride2.cpp
@@ -287,8 +287,7 @@ PerformanceConfigConvAsm1x1UV2::PerformanceConfigConvAsm1x1UV2(int chunk_size_,
{
}

inline bool
PerformanceConfigConvAsm1x1UV2::operator==(const PerformanceConfigConvAsm1x1UV2& other) const
bool PerformanceConfigConvAsm1x1UV2::operator==(const PerformanceConfigConvAsm1x1UV2& other) const
{
// clang-format off
return chunk_size == other.chunk_size
3 changes: 1 addition & 2 deletions src/solver/conv_asm_3x3u.cpp
@@ -70,8 +70,7 @@ PerformanceConfigConvAsm3x3U::PerformanceConfigConvAsm3x3U(int lwc, int fpw, int
{
}

inline bool
PerformanceConfigConvAsm3x3U::operator==(const PerformanceConfigConvAsm3x3U& other) const
bool PerformanceConfigConvAsm3x3U::operator==(const PerformanceConfigConvAsm3x3U& other) const
{
return PerfFieldRules().Compare(*this, other);
}
4 changes: 2 additions & 2 deletions src/solver/conv_asm_dir_BwdWrW1x1.cpp
@@ -274,8 +274,8 @@ PerformanceConfigConvAsmBwdWrW1x1::PerformanceConfigConvAsmBwdWrW1x1(int chunk_s
{
}

inline bool
PerformanceConfigConvAsmBwdWrW1x1::operator==(const PerformanceConfigConvAsmBwdWrW1x1& other) const
bool PerformanceConfigConvAsmBwdWrW1x1::operator==(
const PerformanceConfigConvAsmBwdWrW1x1& other) const
{
// clang-format off
return chunk_size == other.chunk_size
4 changes: 2 additions & 2 deletions src/solver/conv_asm_dir_BwdWrW3x3.cpp
@@ -111,8 +111,8 @@ PerformanceConfigAsmDirect3x3WrW::PerformanceConfigAsmDirect3x3WrW(
{
}

inline bool
PerformanceConfigAsmDirect3x3WrW::operator==(const PerformanceConfigAsmDirect3x3WrW& other) const
bool PerformanceConfigAsmDirect3x3WrW::operator==(
const PerformanceConfigAsmDirect3x3WrW& other) const
{
// clang-format off
return limit_wave_cnt == other.limit_wave_cnt
2 changes: 2 additions & 0 deletions src/solver/conv_hip_implicit_gemm_bwd_v4r1.cpp
@@ -217,6 +217,8 @@ PerformanceImplicitGemmBwdDataV4R1::CalculateGemmABlockCopyPerformanceParameters

// decide threadwise copy lengths
const auto a_data_per_thread_copy_gemmm = SrcDataPerRead_GemmM;
if(a_data_per_thread_copy_gemmm == 0)
MIOPEN_THROW("DIV/0 with a_data_per_thread_copy_gemmm");
const auto a_data_per_thread_copy_gemmk =
a_data_per_thread_copy / a_data_per_thread_copy_gemmm;

2 changes: 2 additions & 0 deletions src/solver/conv_hip_implicit_gemm_fwd_v4r4.cpp
@@ -315,6 +315,8 @@ PerformanceImplicitGemmV4R4Fwd::CalculateGemmBBlockCopyPerformanceParameters(
// GemmBBlockCopyDstDataPerWrite_GemmN also bounded by size of threadwise copy
DstDataPerWrite_GemmN = gcd(DstDataPerWrite_GemmN, b_data_per_thread_copy_gemmn);

if(b_data_per_thread_copy_gemmk == 0)
MIOPEN_THROW("DIV/0 with b_data_per_thread_copy_gemmk");
// calculate blockwise copy thread cluster lengths
ClusterLengths_GemmK = GemmKPerBlock / b_data_per_thread_copy_gemmk;
ClusterLengths_GemmN = GemmNPerBlock / b_data_per_thread_copy_gemmn;
8 changes: 7 additions & 1 deletion src/solver/conv_ocl_dir2D_bwdWrW_2.cpp
@@ -156,7 +156,7 @@ ConvSolution ConvOclBwdWrW2NonTunable::GetSolution(const ConvolutionContext& par
}

template <int N_BATCH_LOOPS>
inline bool PerformanceConfigConvOclBwdWrw2<N_BATCH_LOOPS>::operator==(
bool PerformanceConfigConvOclBwdWrw2<N_BATCH_LOOPS>::operator==(
const PerformanceConfigConvOclBwdWrw2<N_BATCH_LOOPS>& other) const
{
// clang-format off
@@ -745,6 +745,12 @@ ConvOclBwdWrW2<N_BATCH_LOOPS>::Search(const ConvolutionContext& context,
/// We need to instantiate required classes implicitly.
/// The reason is that we do not define the whole template class
/// in the header, only declaring it there.
template struct PerformanceConfigConvOclBwdWrw2<1>;
template struct PerformanceConfigConvOclBwdWrw2<2>;
template struct PerformanceConfigConvOclBwdWrw2<4>;
template struct PerformanceConfigConvOclBwdWrw2<8>;
template struct PerformanceConfigConvOclBwdWrw2<16>;

template struct ConvOclBwdWrW2<1>;
template struct ConvOclBwdWrW2<2>;
template struct ConvOclBwdWrW2<4>;
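
The pairing of `extern template` lines in solver.hpp with `template struct` lines in this file is what C++ calls explicit instantiation (a declaration in the header, a definition in the source file). A minimal single-file sketch of the same mechanism follows; `Scaled` and `Apply` are hypothetical names, and the two halves that would normally sit in separate files are marked with comments.

```cpp
#include <cassert>

// --- what would normally live in the header ---
template <int N>
struct Scaled
{
    int Apply(int v) const; // declared only; the body is not in the "header"
};

// Explicit instantiation declarations: users of the header must not
// instantiate these themselves, because a definition exists elsewhere.
extern template struct Scaled<1>;
extern template struct Scaled<2>;

// --- what would normally live in the .cpp file ---
template <int N>
int Scaled<N>::Apply(int v) const
{
    return v * N;
}

// Explicit instantiation definitions: emit code for exactly these values of N.
template struct Scaled<1>;
template struct Scaled<2>;
```

With the member bodies kept out of the header, any `N` not listed in the source file fails at link time, so the list of instantiations has to match the set of specializations actually used.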