
enhancing MLPerf result import and adding new automation and reproducibility deliverables (#1178)
arjunsuresh authored Mar 19, 2024
2 parents fa385bf + c2bd4ae commit 1735c22
Showing 4 changed files with 80 additions and 48 deletions.
@@ -43,10 +43,15 @@ Use the following CM command if you want to analyze private MLPerf results under
cm run script "get git repo _repo.https://github.com/mlcommons/submissions_inference_v4.0" --env.CM_GIT_CHECKOUT=main --extra_cache_tags=mlperf-inference-results,version-4.0-private
```

- Convert raw MLPerf results into CM experiment entries (it can take 5..15 minutes to run submission checker with raw MLPerf results before converting them to the fast CM format):
+ Convert all raw MLPerf results into CM experiment entries (it can take 5..15 minutes to run submission checker with raw MLPerf results before converting them to the fast CM format):
```bash
cm run script "import mlperf inference to-experiment"
```
or for a specific submitter:
```bash
cm run script "import mlperf inference to-experiment" --submitter=CTuning
```
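If you script these imports, the CM Python API can be used instead of the CLI. Below is a minimal sketch with `cmind.access()`; the comma-separated tag string and passing `submitter` as a top-level key are assumptions based on the CLI form above and this script's `input_mapping` (shown in the next file):

```python
import cmind

# Hypothetical Python equivalent of:
#   cm run script "import mlperf inference to-experiment" --submitter=CTuning
r = cmind.access({'action': 'run',
                  'automation': 'script',
                  'tags': 'import,mlperf,inference,to-experiment',
                  'submitter': 'CTuning'})

# CM convention: a non-zero 'return' indicates an error
if r['return'] > 0:
    print(r['error'])
```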


If you already generated `summary.csv` in your current directory, you can skip submission checker as follows:
```bash
…
```
@@ -21,6 +21,7 @@ tags:

input_mapping:
  target_repo: CM_IMPORT_MLPERF_INFERENCE_TARGET_REPO
  submitter: CM_MLPERF_SUBMITTER

# Dependencies on other CM scripts
deps:
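The new `submitter` entry above is what turns the `--submitter` flag into the `CM_MLPERF_SUBMITTER` environment variable before the script's `preprocess()` hook runs. A simplified sketch of this `input_mapping` behavior (an illustration only, not the actual CM implementation):

```python
# Simplified illustration of CM's input_mapping: script inputs listed in the
# mapping are copied into the environment before preprocess() is called.
input_mapping = {
    'target_repo': 'CM_IMPORT_MLPERF_INFERENCE_TARGET_REPO',
    'submitter': 'CM_MLPERF_SUBMITTER',
}

def apply_input_mapping(inputs, env):
    for key, env_var in input_mapping.items():
        if key in inputs:
            env[env_var] = inputs[key]
    return env

env = apply_input_mapping({'submitter': 'CTuning'}, {})
assert env['CM_MLPERF_SUBMITTER'] == 'CTuning'
```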
@@ -78,12 +78,21 @@ def preprocess(i):

    print ('* Running submission checker ...')

    # Pass an optional submitter filter (--submitter, mapped to
    # CM_MLPERF_SUBMITTER via input_mapping) down to the submission checker
    xenv = {}

    submitter = env.get('CM_MLPERF_SUBMITTER', '')
    if submitter != '':
        xenv['CM_MLPERF_SUBMITTER'] = submitter

    ii = {'action':'run',
          'automation':'script',
          'tags':'run,mlperf,inference,submission,checker',
          'extra_args':' --skip-extra-files-in-root-check',
          'submission_dir':path}

    # Forward the extra environment only when a filter was actually given
    if len(xenv)>0:
        ii['env'] = xenv

    if version!='':
        print (' Version detected from cache tags: {}'.format(version))
        ii['version']=version
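With this change, the submitter filter reaches the submission-checker script through `ii['env']`. A hedged sketch of how a downstream wrapper might translate the variable into a `--submitter` argument for MLPerf's official `submission_checker.py` (the names and flag handling below are assumptions, not the actual CM code):

```python
import os

# Hypothetical downstream consumption of CM_MLPERF_SUBMITTER: append a
# --submitter filter to the arguments passed to submission_checker.py.
submitter = os.environ.get('CM_MLPERF_SUBMITTER', '')
extra_args = ' --skip-extra-files-in-root-check'
if submitter != '':
    extra_args += ' --submitter ' + submitter
```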
111 changes: 64 additions & 47 deletions docs/taskforce.md
@@ -33,56 +33,73 @@ where CM is used:

You can participate in our discussions via our [public Discord server](https://discord.gg/JjWNWXKxwT).

## Status

### 2023

Following our [regular public weekly conf-calls in 2023](https://docs.google.com/document/d/1zMNK1m_LhWm6jimZK6YE05hu4VH9usdbKJ3nBy-ZPAw),
[Discord discussions](https://discord.gg/JjWNWXKxwT) and very useful feedback from the research community
and from Google, AMD, Neural Magic, OctoML, Nvidia, Qualcomm, Dell, HPE, Red Hat,
Intel, TTA, One Stop Systems and other organizations, we developed a new version
of the [Collective Mind automation framework (CM)](https://github.com/mlcommons/ck): a lightweight,
non-intrusive and technology-agnostic workflow automation framework that provides a common, simple
and human-readable interface to manage, run, reproduce and customize MLPerf benchmarks
across continuously changing models, datasets, software and hardware from different vendors:
[github.com/mlcommons/ck/tree/master/docs/mlperf](https://github.com/mlcommons/ck/tree/master/docs/mlperf).

#### Outcome

* We released CM version 1.5.3 and successfully validated it with the community during the MLPerf inference v3.1 submission round,
enabling the first mass submission of 12K+ performance and power results across 120+ system configurations.

* We continued extending the [Modular C++ Inference Library for MLPerf (MIL)](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/app-mlperf-inference-cpp)
and validated it in the v3.1 round. We also developed a Python prototype of the reference network implementation
with a CM interface for the BERT model and added it to the [main inference repository](https://github.com/mlcommons/inference/tree/master/language/bert#loadgen-over-the-network).

* We successfully introduced CM automation at the [ACM/IEEE International Symposium on Microarchitecture (MICRO'23)](https://cTuning.org/ae/micro2023.html)
to reproduce results from accepted research papers: [GitHub](https://github.com/ctuning/cm-reproduce-research-projects/tree/main/script).

* We successfully introduced CM automation at the [Student Cluster Competition at SuperComputing'23](https://github.com/mlcommons/ck/blob/master/docs/tutorials/scc23-mlperf-inference-bert.md)
to run MLPerf inference benchmarks across diverse hardware. This resulted in several contributions improving MLPerf inference benchmarks
(such as adding an AMD GPU backend) and CM workflows.

* We gave an invited keynote about CM at the 1st ACM Conference on Reproducibility and Replicability:
[Slides](https://doi.org/10.5281/zenodo.8105339), [ACM YouTube channel](https://www.youtube.com/watch?v=_1f9i_Bzjmg).


### 2024

We have been asked to focus on the following tasks:

1) Continue extending the CM interface and workflows to reproduce as many MLPerf inference v3.1 submissions as possible: see the [current coverage](https://github.com/mlcommons/ck/issues/1052).
2) Help all MLPerf inference v4.0 submitters automate their submissions and provide a common CM interface to rerun them in a unified way.
3) Apply the standard [ACM artifact review and badging methodology](https://www.acm.org/publications/policies/artifact-review-and-badging-current)
with [cTuning extensions to ML and systems conferences](https://cTuning.org/ae)
to the MLPerf inference v4.0 submission to make it easier to rerun experiments and reproduce results (see the [ACM Tech Talk](https://www.youtube.com/watch?v=7zpeIVwICa4)
and [ACM REP keynote](https://doi.org/10.5281/zenodo.8105339) to learn more about our vision and related experience).
4) Add CM support for the new automotive benchmark.
5) Extend the [CM GUI](https://cknowledge.org/cm-gui/?tags=generic,app,mlperf,inference) to help users generate CM commands to customize and run MLPerf inference benchmarks.
6) Extend the [Modular C++ Inference Library for MLPerf](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/app-mlperf-inference-cpp) to support more models and backends.
## Deliverables (2024)

* Collaborate with chip vendors and MLPerf inference submitters to add
their implementations to CM and automate their submissions.

* Develop a more universal Python and C++ wrapper for the MLPerf loadgen
with CM automation to support different models, datasets, software
and hardware: [Python prototype](https://github.com/mlcommons/ck/blob/master/docs/tutorials/scc23-mlperf-inference-bert.md);
[C++ prototype](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/app-mlperf-inference-mlcommons-cpp).

* Introduce a [CM automation badge](https://github.com/mlcommons/ck/tree/master/cm-mlops/script/app-mlperf-inference-mlcommons-cpp)
for the MLPerf inference v4.1 submission,
similar to the ACM/IEEE/NeurIPS reproducibility badges, to make it easier for
all submitters to re-run and reproduce each other’s results before the
publication date.

* Collaborate with system vendors and cloud providers to quickly benchmark
their platforms using the best available MLPerf inference implementation.

* Collaborate with other MLPerf working groups to modularize their
benchmarks using [CM automation recipes](https://access.cknowledge.org/playground/?action=scripts).

* Use MLCommons CM to modularize and automate the upcoming [automotive benchmark](https://mlcommons.org/working-groups/benchmarks/automotive/).

* Use [MLCommons Croissant](https://mlcommons.org/working-groups/data/croissant/)
to unify [MLPerf datasets](https://access.cknowledge.org/playground/?action=scripts).



## Completed deliverables

* Developed [reusable and technology-agnostic automation recipes and workflows](https://access.cknowledge.org/playground/?action=scripts)
with a common and human-friendly interface (MLCommons Collective Mind aka CM) to modularize
MLPerf inference benchmarks and run them in a unified and automated way
across diverse models, datasets, software and hardware from different
vendors.

* Added [GitHub actions](https://github.com/mlcommons/inference/tree/master/.github/workflows)
to test MLPerf inference benchmarks using CM.

* Encoded MLPerf inference rules and best practices in the [CM automation
recipes and workflows for MLPerf](https://github.com/mlcommons/ck/tree/master/cm-mlops/script),
reducing the burden on submitters, who previously had to go through numerous README files
and track all the latest changes to reproduce results.

* Automated [MLPerf inference submissions](https://access.cknowledge.org/playground/?action=howtorun)
and made it easier to re-run and reproduce results
(see [submitters orientation](https://doi.org/10.5281/zenodo.10605079)
and [CM-MLPerf documentation](https://github.com/mlcommons/ck/tree/master/docs/mlperf)).

* Started developing an open-source platform to automatically compose
high-performance and cost-effective AI applications and systems using
MLPerf and CM (see our [presentation at MLPerf-Bench at HPCA’24](https://doi.org/10.5281/zenodo.10786893)).

* Supported AI, ML and Systems conferences to automate artifact evaluation
and reproducibility initiatives (see CM at [ACM/IEEE MICRO’23](https://ctuning.org/ae/micro2023.html)
and [SCC’23/SuperComputing’23](https://github.com/mlcommons/ck/blob/master/docs/tutorials/scc23-mlperf-inference-bert.md)).



## Resources

* [GitHub project](https://github.com/mlcommons/ck)
* [Getting Started Guide](https://github.com/mlcommons/ck/blob/master/docs/getting-started.md)
* [CM-MLPerf commands](https://github.com/mlcommons/ck/tree/master/docs/mlperf)
* [CM-MLPerf GUI](https://access.cknowledge.org/playground/?action=howtorun)
* [Invited talk at MLPerf-Bench @ HPCA'24 about Automatically Composing High-Performance and Cost-effective AI Systems with MLCommons' CM and MLPerf](https://doi.org/10.5281/zenodo.10786893)
* [Invited keynote about CM framework at ACM REP'23](https://doi.org/10.5281/zenodo.8105339)
* [ACM artifact review and badging methodology](https://www.acm.org/publications/policies/artifact-review-and-badging-current)
* [Artifact Evaluation at ML and systems conferences](https://cTuning.org/ae)
