Adding Mi300 to 2.x #231

Merged
merged 31 commits into from
Jan 24, 2024
ce0ce01
Adding sbios, vbios, and partitioning info to specs. Also reorganizin…
coleramos425 Jan 10, 2024
d0dd2bc
Moving the join_prof func into parent class and adding output_headers…
coleramos425 Jan 10, 2024
c0b6d3e
Optimizations to run_prof() utility
coleramos425 Jan 11, 2024
388f0dd
Separate aggregation of TCC and TCC2 counters
coleramos425 Jan 11, 2024
116cd65
Fix bug in profiling for Mi100
coleramos425 Jan 11, 2024
b674209
Add options for --specs and --specs-correction
coleramos425 Jan 15, 2024
b26cc21
Implement specs correction function
coleramos425 Jan 15, 2024
8baa38f
Fix bug in coalescing TCC per channel counters
coleramos425 Jan 15, 2024
1a68376
Modify schema to reflect new l2 cache per channel
coleramos425 Jan 15, 2024
8420aef
Update config files for new l2 cache per channel structure and genera…
coleramos425 Jan 15, 2024
a1d2646
Add new plotille module
coleramos425 Jan 15, 2024
fccaa3f
Update rocprof config files for counter updates
coleramos425 Jan 15, 2024
ce3a199
Add new submodule for mem chart
coleramos425 Jan 15, 2024
f12b67a
Enable specs correction, detection of cli_style property, and hbm sta…
coleramos425 Jan 16, 2024
132784e
Enable Standalone GUI. Note L2 per channel graphics haven't been port…
coleramos425 Jan 17, 2024
64f1560
Re-enable TCP_TCC_READ_REQ_LATENCY_sum in mi200
coleramos425 Jan 17, 2024
90f4d6f
Decode specs using utf-8 for backwards compatibility in rocprof
coleramos425 Jan 17, 2024
0799fc1
Improved communication between SoC and profiler classes
coleramos425 Jan 19, 2024
ec374cf
Add config files for Mi300_A1 and modify run_prof utility to use XCC …
coleramos425 Jan 19, 2024
7335dc3
Build out gfx942 SoC class and improve SoC detection
coleramos425 Jan 19, 2024
4092b2a
Update headers based on latest rocprofv2 output. Overwrite old headers.
coleramos425 Jan 19, 2024
a702735
Fix small typo
coleramos425 Jan 19, 2024
e01c397
Fixing typo in KernelName header
coleramos425 Jan 19, 2024
3a69282
Improved --list-metric pretty print
coleramos425 Jan 19, 2024
f360d17
Adding support for Mi300X-A0
coleramos425 Jan 22, 2024
5920490
Add implementation for gfx940 arch
coleramos425 Jan 22, 2024
cd7198d
Catch ROCm-6.0.0 headers for replacement and standardization
coleramos425 Jan 22, 2024
4e256c8
Remove old omniperf_analyze implementation
coleramos425 Jan 22, 2024
3b3965f
Add license delimiter to new file
coleramos425 Jan 23, 2024
bd4060c
Responding to Karls review
coleramos425 Jan 24, 2024
ab1acb2
Patch specs enhancement for Mi300
coleramos425 Jan 24, 2024
Fix bug in profiling for Mi100
Signed-off-by: colramos-amd <colramos@amd.com>
coleramos425 committed Jan 24, 2024
commit 116cd65ae871ffef5c4e6691f6b5d71c9f9c0e63
6 changes: 3 additions & 3 deletions src/omniperf_profile/profiler_base.py
@@ -334,9 +334,9 @@ def run_profiling(self, version:str, prog:str):
         logging.debug(output)
         logging.info("\nCurrent input file: %s" % fname)
 
-        options = self.get_profiler_options(fname)
-        options += self._soc.get_profiler_options()
-        print("options are ", options)
+        # Fetch any SoC/profiler specific profiling options
+        options = self._soc.get_profiler_options()
+        options += self.get_profiler_options(fname)
 
         if self.__profiler == "rocprofv1" or self.__profiler == "rocprofv2":
             run_prof(
5 changes: 3 additions & 2 deletions src/omniperf_soc/soc_base.py
@@ -31,6 +31,7 @@
 import re
 import numpy as np
 from utils.utils import demarcate
+from pathlib import Path
 
 class OmniSoC_Base():
     def __init__(self,args):
@@ -57,8 +58,8 @@ def set_perfmon_config(self, config: dict):
         self.__perfmon_config = config
     def set_soc_param(self, param: dict):
         self.__soc_params = param
-    def get_perfmon_dir(self):
-        return self.__perfmon_dir
+    def get_workload_perfmon_dir(self):
+        return str(Path(self.__perfmon_dir).parent.absolute())
     def get_soc_param(self):
         return self.__soc_params
     def set_soc(self, soc: str):
2 changes: 1 addition & 1 deletion src/omniperf_soc/soc_gfx908.py
@@ -66,7 +66,7 @@ def __init__(self,args):
     @demarcate
     def get_profiler_options(self):
         # Mi100 requires a custom xml config
-        return ["-m", self.get_perfmon_dir() + "/" + "metrics.xml"]
+        return ["-m", self.get_workload_perfmon_dir() + "/" + "metrics.xml"]
 
     #-----------------------
     # Required child methods
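
Reviewer note: the renamed getter no longer returns the stored perfmon directory itself but its absolute parent, and the Mi100 option list is built from that parent. The sketch below reproduces only the path arithmetic outside the class. The function names and the example directory layout (a `perfmon` folder inside a workload directory) are assumptions made for illustration and are not taken from the repository.

```python
from pathlib import Path


def workload_perfmon_dir(perfmon_dir: str) -> str:
    """Return the absolute parent of perfmon_dir, mirroring the new getter's expression."""
    # Path.absolute() prepends the current working directory to relative paths;
    # it does not resolve symlinks.
    return str(Path(perfmon_dir).parent.absolute())


def mi100_profiler_options(perfmon_dir: str) -> list:
    """Build the '-m <metrics.xml>' pair the same way the gfx908 hunk does."""
    return ["-m", workload_perfmon_dir(perfmon_dir) + "/" + "metrics.xml"]


if __name__ == "__main__":
    # Hypothetical layout: <workload>/perfmon holds the per-run input files.
    print(mi100_profiler_options("/tmp/workloads/demo/perfmon"))
```

With this layout, `metrics.xml` is looked up next to the `perfmon` folder rather than inside it, which appears to be the behavior change behind the Mi100 fix.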
2 changes: 1 addition & 1 deletion src/utils/utils.py
@@ -192,7 +192,7 @@ def run_prof(fname, profiler_options):
 
     # profile the app
     success, output = capture_subprocess_output(
-        [ rocprof_cmd, "-i", fname ] + options
+        [ rocprof_cmd ] + options
     )
 
     if not success:
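
Reviewer note: taken together, the hunks change how the final command line is assembled. SoC-specific options now come first, the profiler's own options follow, and `run_prof` no longer prepends `-i <fname>` itself. The sketch below shows that ordering with plain lists. It assumes that the profiler's `get_profiler_options(fname)` now supplies the `-i` pair, which this commit does not show; `build_rocprof_command` and the example paths are hypothetical.

```python
def build_rocprof_command(rocprof_cmd, soc_options, profiler_options):
    """Concatenate the command the way the updated run_prof() call does."""
    # SoC options first (e.g. the Mi100 '-m metrics.xml' pair), then profiler options.
    options = list(soc_options)
    options += list(profiler_options)
    return [rocprof_cmd] + options


if __name__ == "__main__":
    cmd = build_rocprof_command(
        "rocprof",
        ["-m", "/tmp/workloads/demo/metrics.xml"],  # assumed SoC options
        ["-i", "/tmp/workloads/demo/perfmon/pmc_perf_0.txt"],  # assumed profiler options
    )
    print(" ".join(cmd))
```

If the profiler options did not include the input file, the command would run without `-i`, so the two changes in `profiler_base.py` and `utils.py` only work as a pair.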