quickTunerPreproc.py, preprocessor script for quick tuner scripts #1575

ethansaurusrex · 2024-07-18T03:23:25Z

This script takes the *.debug files generated by tuningRunner.py as input and concatenates them into a single TSV file for use in the next step of the quick tuning pipeline: generation.
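For readers who want a picture of the concatenation step, here is a minimal, dependency-free sketch. It is not the code in this PR (which uses pandas); the function name `concat_debug_files`, the `PerfConfig` column in the usage note, and the assumption that every *.debug file begins with the same header row are all illustrative.

```python
import csv
import glob
import os


def concat_debug_files(input_dir, output_path):
    """Concatenate every *.debug TSV in input_dir into one TSV file.

    Returns (number of input files found, number of data rows written).
    Assumption: each non-empty file starts with the same header row.
    """
    paths = sorted(glob.glob(os.path.join(input_dir, "*.debug")))
    header = None
    rows = []
    for path in paths:
        with open(path, newline="") as f:
            reader = csv.reader(f, delimiter="\t")
            file_header = next(reader, None)
            if file_header is None:
                continue  # empty file: nothing to add
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"header mismatch in {path}")
            rows.extend(reader)
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        if header is not None:
            writer.writerow(header)  # header is written once, not per file
        writer.writerows(rows)
    return len(paths), len(rows)
```

Called as `concat_debug_files("results/", "combined.tsv")` (hypothetical paths), it would write one header followed by the data rows of every input file in sorted file-name order.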


pcf000 · 2024-07-19T20:07:58Z

mlir/utils/performance/quickTunerPreproc.py

+    def __init__(self, pargs):
+        self.input_dir = pargs.input_dir
+
+    @staticmethod


Why staticmethod? Not a criticism, I'm just less familiar with why that's done in python and I'm curious.

This was intended to be imported and called from the quickTunerGen.py set of classes, but I could not justify each of those classes holding its own instance of the preprocessor class.
So it was either a standalone function or a class with a static method, which would allow me to do something like:

qtPreprocessor.process( /* args */ )

In my experience, the @staticmethod decorator is what you use when you want to package together functions that have similar functionality and can share data but do not share state.
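A small sketch of the pattern described above may help. The class names `QtPreprocessor` and `Generator`, the `scale` parameter, and the body of `process` are hypothetical stand-ins, not the PR's actual code; only the `@staticmethod` mechanics are the point.

```python
class QtPreprocessor:
    """Hypothetical stand-in for the PR's preprocessor class."""

    @staticmethod
    def process(rows, scale=1.0):
        # No `self` parameter: the method reads only its arguments, so no
        # instance (and no per-instance state) is needed to call it.
        return [value * scale for value in rows]


class Generator:
    """Hypothetical caller, standing in for a quickTunerGen.py class."""

    def run(self, rows):
        # Called on the class itself; Generator holds no preprocessor instance.
        return QtPreprocessor.process(rows, scale=2.0)


result = Generator().run([1.0, 2.5])
```

A module-level function would behave the same way; the static method mainly buys a namespace, so related helpers stay grouped under one class name.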

pcf000 · 2024-07-19T20:08:41Z

mlir/utils/performance/quickTunerPreproc.py

+            df = pd.read_csv(file, sep='\t')
+            if normalize:
+                scaler = MinMaxScaler()
+                df['TFlops'] = scaler.fit_transform(df[['TFlops']])


Why one set of brackets on the left but two on the right??

I think that was a mess up on my end. Fortunately (or unfortunately) the syntax is forgiving.

Also, this will actually be changed to add a NormalizedTFlops column instead of overwriting the current one.

    if normalize:
        scaler = MinMaxScaler()
        df['NormalizedTFlops'] = scaler.fit_transform(df[['TFlops']])
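On the bracket question: `df[['TFlops']]` selects a one-column DataFrame (2-D), while `df['TFlops']` selects a Series (1-D), and scikit-learn's `fit_transform` expects 2-D input. As for what the scaler computes, the sketch below reproduces min-max scaling to the default [0, 1] range in plain Python, adding a new column rather than overwriting the original. The function name and the list-of-dicts row format are assumptions for illustration, and mapping a constant column to 0.0 is a choice made for this sketch.

```python
def add_normalized_column(rows, src="TFlops", dst="NormalizedTFlops"):
    """Return copies of rows with dst = (x - min) / (max - min) of src.

    rows is a list of dicts (e.g. from csv.DictReader); the input dicts
    are not modified and the src column is left untouched.
    """
    if not rows:
        return []
    values = [float(row[src]) for row in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    out = []
    for row, value in zip(rows, values):
        new_row = dict(row)
        # A constant column has span == 0; map it to 0.0 to avoid dividing by zero.
        new_row[dst] = (value - lo) / span if span else 0.0
        out.append(new_row)
    return out


normalized = add_normalized_column(
    [{"TFlops": "10"}, {"TFlops": "20"}, {"TFlops": "30"}]
)
```

The smallest value maps to 0.0 and the largest to 1.0, so the normalized column is only comparable within the set of rows it was computed from.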


krzysz00

Python nitpicking

krzysz00 · 2024-07-23T18:29:22Z

mlir/utils/performance/quickTunerPreproc.py

+
+to:
+
+    import uuid


Could this instead be a patch to rocmlir_worker?

Not in its current form; it cuts off the usual functionality of passing results to stdout, and we don't have code to read results from the files and send them on. I do have an open issue about collecting all the results into the database, and this may become part of it.

krzysz00 · 2024-07-23T18:30:02Z

mlir/utils/performance/quickTunerPreproc.py

+import glob
+from sklearn.preprocessing import MinMaxScaler
+
+class qtPreprocessor(object):


Nit: class name casing, and I'm pretty sure Python 3 doesn't require the (object) anymore
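Both halves of that nit can be checked quickly. In Python 3 every class inherits from `object` implicitly, and PEP 8 recommends CapWords for class names. The names `WithObject` and `QtPreprocessor` below are illustrative only; the latter is one possible spelling, not a name the PR settled on.

```python
class WithObject(object):
    """Python 2-era spelling with an explicit base class."""


class QtPreprocessor:
    """Python 3 spelling; CapWords name, implicit `object` base."""


# Both classes end up with exactly the same base.
same_bases = WithObject.__bases__ == QtPreprocessor.__bases__ == (object,)
```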


krzysz00 · 2024-07-23T18:31:27Z

mlir/utils/performance/quickTunerPreproc.py

+
+        dfs = []
+        ct = 0
+        for file in tsv_files:


You can enumerate() instead of maintaining a ct

I will replace this with a len() call, since the count is not needed within the loop, only for generating stats.
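For comparison, here are the three options discussed in this thread side by side. The file names are made up, and the `ct += 1` increment is presumed, since the quoted diff only shows the counter being initialized.

```python
tsv_files = ["a.debug", "b.debug", "c.debug"]  # hypothetical file names

# Manual counter, roughly as in the diff under review.
ct = 0
for file in tsv_files:
    ct += 1

# enumerate(): useful when the index is needed inside the loop body.
labels = [f"{i}: {name}" for i, name in enumerate(tsv_files, start=1)]

# len(): enough when only the total is needed, e.g. for summary stats.
total = len(tsv_files)
```

Note that `len()` counts the files found, whereas a counter inside the loop can count only the files that were actually processed, if some are skipped.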


krzysz00

Broadly approved

krzysz00 · 2024-07-31T23:32:21Z

mlir/utils/performance/quickTunerPreproc.py

+import glob
+from sklearn.preprocessing import MinMaxScaler
+
+class qtPreprocessor():


Standard complaining re style on class names

Adding quickTunerPreproc.py, 1/3 quick tuner perf config scripts.

0489788


ethansaurusrex requested review from jerryyin and sjw36 as code owners July 18, 2024 03:23

ethansaurusrex requested review from djramic and pcf000 July 18, 2024 15:02

ethansaurusrex added 2 commits July 18, 2024 19:15

Column 'TFlops' -> 'NormalizedTFlops'

6a5a43a

Revert "Adding quickTunerPreproc.py, 1/3 quick tuner perf config scripts." This reverts commit 0489788.

e2d92bf

pcf000 reviewed Jul 19, 2024


Added new column 'NormalizedTFlops' if --normalize is used (true by default)

1cd8bda

krzysz00 reviewed Jul 23, 2024


ethansaurusrex added 2 commits July 25, 2024 14:04

Corrected issues: using 'object' explicitly for parent class and using counter for processing loop.

5995293

Fixed poor printing method

d44114d

krzysz00 approved these changes Jul 31, 2024


ethansaurusrex and others added 2 commits August 8, 2024 15:11

Added convolution columns

c7b9486

Added option to exclude splitK from the tuning data list

30999f4

djramic force-pushed the qtPreprocessor branch from 1eac2a7 to 30999f4 Compare October 9, 2024 13:37
