Docs update for r-gat (#1969)
* Fixes #1648, restrict loadgen uncommitted error message to within the loadgen directory

* Update test-rnnt.yml (#1688)

Stopping the github action for rnnt

* Added docs init

Added github action for website publish

Update benchmark documentation

Update publish.yaml

Update publish.yaml

Update benchmark documentation

Improved the submission documentation

Fix taskname

Removed unused images

* Fix benchmark URLs

* Fix links

* Add _full variation to run commands

* Added script flow diagram

* Added docker setup command for CM, extra run options

* Added support for docker options in the docs

* Added --quiet to the CM run_cmds in docs

* Fix the test query count for cm commands

* Support ctuning-cpp implementation

* Added commands for mobilenet models

* Docs cleanup

* Docs cleanup

* Added separate files for dataset and models in the docs

* Remove redundant tab in the docs

* Fixes some WIP models in the docs

* Use the official docs page for CM installation

* Fix the deadlink in docs

* Fix indentation issue in docs

* Added dockerinfo for nvidia implementation

* Added run options for gptj

* Added execution environment tabs

* Cleanup of the docs

* Cleanup of the docs

* Reordered the sections of the docs page

* Removed an unnecessary heading in the docs

* Fixes the commands for datacenter

* Fix the build --sdist for loadgen

* Fixes #1761, llama2 and mixtral runtime error on CPU systems

* Added mixtral to the benchmark list, improved benchmark docs

* Update docs for MLPerf inference v4.1

* Update docs for MLPerf inference v4.1

* Fix typo

* Gave direct link to implementation readmes

* Added tables detailing implementations

* Update vision README.md, split the frameworks into separate rows

* Update README.md

* pointed links to specific frameworks

* pointed links to specific frameworks

* Update Submission_Guidelines.md

* Update Submission_Guidelines.md

* Update Submission_Guidelines.md

* api support llama2

* Added request module and reduced max token len

* Fix for llama2 api server

* Update SUT_API offline to work for OpenAI

* Update SUT_API.py

* Minor fixes

* Fix json import in SUT_API.py

* Fix llama2 token length

* Added model name verification with server

* clean temp files

* support num_workers in LLAMA2 SUTs

* Remove batching from Offline SUT_API.py

* Update SUT_API.py

* Minor fixes for llama2 API

* Fix for llama2 API

* removed table of contents

* enabled llama2-nvidia + vllm-NM : WIP

* enabled dlrm for intel

* lower cased implementation

* added raw data input

* corrected data download commands

* renamed filename

* changes for bert and vllm

* documentation to work on custom repo and branch

* benchmark index page update

* enabled sdxl for nvidia and intel

* updated vllm server run cmd

* benchmark page information addition

* fix indentation issue

* Added submission categories

* update submission page - generate submission with or w/o using CM for benchmarking

* Updated kits dataset documentation

* Updated model parameters

* updated information

* updated non cm based benchmark

* added info about hf password

* added links to model and access tokens

* Updated reference results structure tree

* submission docs cleanup

* Some cleanups for benchmark info

* Some cleanups for benchmark info

* Some cleanups for benchmark info

* added generic stubs deepsparse

* Some cleanups for benchmark info

* Some cleanups for benchmark info

* Some cleanups for benchmark info

* Some cleanups for benchmark info (FID and CLIP data added)

* typo fix for bert deepsparse framework

* added min system requirements for models

* fixed code version

* changes for displaying reference and intel implementation tip

* added reference to installation page

* updated neural magic documentation

* Added links to the install page, redirect benchmarks page

* added tips about batch size and dataset for nvidia llama2

* fix conditions logic

* modified tips and additional run cmds

* sentence corrections

* Minor fix for the documentation

* fixed bug in deepsparse generic model stubs + styling

* added more information to stubs

* Added SCC24 readme, support reproducibility in the docs

* Made clear the custom CM repo URL format

* Support conditional implementation, setup and run tips

* Support rocm for sdxl

* Fix _short tag support

* Fix install URL

* Expose bfloat16 and float16 options for sdxl

* Expose download model to host option for sdxl

* IndySCC24 documentation added

* Improve the SCC24 docs

* Improve the support of short variation

* Improved the indyscc24 documentation

* Updated scc run commands

* removed test_query_count option for scc

* Remove scc24 in the main docs

* Remove scc24 in the main docs

* Fix docs: indentation issue on the submission page

* generalised code for skipping test query count

* Fixes for SCC24 docs

* Fix scenario text in main.py

* Fix links for scc24

* Fix links for scc24

* Improve the general docs

* Fix links for scc24

* Use float16 in scc24 doc

* Improve scc24 docs

* Improve scc24 docs

* Use float16 in scc24 doc

* fixed command bug

* Fix typo in docs

* Fix typo in docs

* Remove unnecessary indentation in docs

* initial commit for tip - native run CUDA

* Updated tip

* added docker_cm_repo_branch to more run option - docker

* Update docs for IndySCC24

* Support custom repo branch and owner for final report generation

* enabled amd implementation for llama2

* updates for amd - docs

* Fix scenarios in docs page

* formatted the files to pass the gh action

* scenarios -> fixed_scenarios in docs

* [Automated Commit] Format Codebase

* Update indyscc24-bert.md

* Update scc24.md

* updated tip for reference implementation (#1912)

* [Automated Commit] Format Codebase

* fix for run suffix (#1913)

* [Automated Commit] Format Codebase

* Update for adding submission flow diagram

* Added submission flow diagram

* Update scc24.md

* changes in submission documentation (#1946)

* update results category (#1947)

* changes for adding rgat to docs (#1965)

* Update index.md | Added R-GAT details (WIP)

* Update index.md

* Create system_requirements.yml

* Update system_requirements.yml

* Update system_requirements.yml

* Update system_requirements.yml

---------

Co-authored-by: anandhu-eng <anandhukicks@gmail.com>
Co-authored-by: ANANDHU S <71482562+anandhu-eng@users.noreply.github.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: arjunsuresh <arjunsuresh@users.noreply.github.com>
Co-authored-by: Pablo Gonzalez <pablo.gonzalez@factored.ai>
Co-authored-by: Mitchelle Rasquinha <80070689+mrasquinha-g@users.noreply.github.com>
Co-authored-by: Miro <mirhodak@amd.com>
8 people authored Dec 18, 2024
1 parent b3e1e8e commit 8397bec
Showing 7 changed files with 211 additions and 81 deletions.
39 changes: 39 additions & 0 deletions docs/benchmarks/graph/get-rgat-data.md
@@ -0,0 +1,39 @@
---
hide:
- toc
---

# Graph Neural Network using R-GAT

## Dataset

The benchmark implementation run command automatically downloads the validation and calibration datasets and performs the necessary preprocessing. If you want to download only the datasets, you can use the commands below.

=== "Full Dataset"
R-GAT validation run uses the IGBH dataset consisting of 547,306,935 nodes and 5,812,005,639 edges.

### Get Full Dataset
```
cm run script --tags=get,dataset,igbh,_full -j
```

=== "Debug Dataset"
R-GAT debug run uses the IGBH debug (tiny) dataset.

### Get Debug Dataset
```
cm run script --tags=get,dataset,igbh,_debug -j
```

## Model
The benchmark implementation run command automatically downloads the required model and performs the necessary conversions. If you want to download only the official model, you can use the command below.

Get the Official MLPerf R-GAT Model

=== "PyTorch"

### PyTorch
```
cm run script --tags=get,ml-model,rgat -j
```
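If the dataset and model have already been downloaded with the commands above, a later benchmark run can reuse them instead of downloading again. The sketch below is only illustrative: the two `--env` options are the ones documented for R-GAT in this change, while the other tags and the local paths are placeholder assumptions.

```bash
# Illustrative only: reuse already-downloaded IGBH data and the R-GAT checkpoint.
# The two --env options are documented for R-GAT; the paths are placeholders.
cm run script --tags=run-mlperf,inference --model=rgat --implementation=reference \
   --env.CM_DATASET_IGBH_PATH=$HOME/datasets/igbh \
   --env.CM_ML_MODEL_RGAT_CHECKPOINT_PATH=$HOME/models/rgat/RGAT.pt \
   --quiet
```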

13 changes: 13 additions & 0 deletions docs/benchmarks/graph/rgat.md
@@ -0,0 +1,13 @@
---
hide:
- toc
---


# Graph Neural Network using R-GAT


=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

{{ mlperf_inference_implementation_readme (4, "rgat", "reference", devices = ["CPU", "CUDA"]) }}
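On the published site this macro expands into tabbed `cm run` commands for each device and scenario. A minimal test invocation of the kind it generates might look like the following sketch; the variation tags, scenario, device, and query count here are illustrative placeholders rather than the exact generated command.

```bash
# Illustrative sketch of a reference R-GAT test run driven through CM;
# scenario, device and test_query_count values are placeholders.
cm run script --tags=run-mlperf,inference,_find-performance,_full \
   --model=rgat --implementation=reference --framework=pytorch \
   --category=datacenter --scenario=Offline --execution_mode=test \
   --device=cpu --test_query_count=10 --quiet
```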
19 changes: 15 additions & 4 deletions docs/index.md
@@ -1,7 +1,7 @@
# MLPerf Inference Benchmarks

## Overview
The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of MLPerf inference v4.0 round are listed below, categorized by tasks. Under each model you can find its details like the dataset used, reference accuracy, server latency constraints etc.
The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of MLPerf inference v5.0 round are listed below, categorized by tasks. Under each model you can find its details like the dataset used, reference accuracy, server latency constraints etc.

---

@@ -80,7 +80,7 @@ The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of MLPerf infe
- **Server Scenario Latency Constraint**: 130ms
- **Equal Issue mode**: False
- **High accuracy variant**: yes
- **Submission Category**: Datacenter, Edge
- **Submission Category**: Edge

#### [LLAMA2-70B](benchmarks/language/llama2-70b.md)
- **Dataset**: OpenORCA (GPT-4 split, max_seq_len=1024)
@@ -157,11 +157,22 @@ The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of MLPerf infe
- **High accuracy variant**: Yes
- **Submission Category**: Datacenter

## Graph Neural Networks
### [R-GAT](benchmarks/graph/rgat.md)
- **Dataset**: Illinois Graph Benchmark Heterogeneous validation dataset
- **Dataset Size**: 788,379
- **QSL Size**: 788,379
- **Number of Parameters**:
- **Reference Model Accuracy**: ACC = ?
- **Server Scenario Latency Constraint**: N/A
- **Equal Issue mode**: True
- **High accuracy variant**: No
- **Submission Category**: Datacenter
---

## Submission Categories
- **Datacenter Category**: All the current inference benchmarks are applicable to the datacenter category.
- **Edge Category**: All benchmarks except DLRMv2, LLAMA2-70B, and Mixtral-8x7B are applicable to the edge category.
- **Datacenter Category**: All benchmarks except bert are applicable to the datacenter category for inference v5.0.
- **Edge Category**: All benchmarks except DLRMv2, LLAMA2-70B, Mixtral-8x7B and R-GAT are applicable to the edge category for v5.0.

## High Accuracy Variants
- **Benchmarks**: `bert`, `llama2-70b`, `gpt-j`, `dlrm_v2`, and `3d-unet` have a normal accuracy variant as well as a high accuracy variant.
160 changes: 85 additions & 75 deletions docs/submission/index.md
@@ -13,13 +13,15 @@ hide:

Click [here](https://youtu.be/eI1Hoecc3ho) to view the recording of the workshop: Streamlining your MLPerf Inference results using CM.

=== "CM based benchmark"
Click [here](https://docs.google.com/presentation/d/1cmbpZUpVr78EIrhzyMBnnWnjJrD-mZ2vmSb-yETkTA8/edit?usp=sharing) to view the proposal slides for Common Automation for MLPerf Inference Submission Generation through CM.

=== "CM based results"
If you have followed the `cm run` commands under the individual model pages in the [benchmarks](../index.md) directory, all the valid results will get aggregated to the `cm cache` folder. The following command can be used to browse the structure of the inference results folder generated by CM.
### Get results folder structure
```bash
cm find cache --tags=get,mlperf,inference,results,dir | xargs tree
```
=== "Non CM based benchmark"
=== "Non CM based results"
If you have not followed the `cm run` commands under the individual model pages in the [benchmarks](../index.md) directory, please make sure that the result directory is structured in the following way.
```
└── System description ID(SUT Name)
@@ -35,18 +37,20 @@ Click [here](https://youtu.be/eI1Hoecc3ho) to view the recording of the workshop
| ├── mlperf_log_detail.txt
| ├── mlperf_log_accuracy.json
| └── accuracy.txt
└── Compliance_Test_ID
├── Performance
| └── run_x/#1 run for all scenarios
| ├── mlperf_log_summary.txt
| └── mlperf_log_detail.txt
├── Accuracy
| ├── baseline_accuracy.txt
| ├── compliance_accuracy.txt
| ├── mlperf_log_accuracy.json
| └── accuracy.txt
├── verify_performance.txt
└── verify_accuracy.txt #for TEST01 only
|── Compliance_Test_ID
| ├── Performance
| | └── run_x/#1 run for all scenarios
| | ├── mlperf_log_summary.txt
| | └── mlperf_log_detail.txt
| ├── Accuracy
| | ├── baseline_accuracy.txt
| | ├── compliance_accuracy.txt
| | ├── mlperf_log_accuracy.json
| | └── accuracy.txt
| ├── verify_performance.txt
| └── verify_accuracy.txt #for TEST01 only
|── user.conf
└── measurements.json
```

<details>
@@ -67,67 +71,69 @@ Once all the results across all the models are ready you can use the following c

## Generate actual submission tree

=== "Closed Edge"
### Closed Edge Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

=== "Closed Datacenter"
### Closed Datacenter Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```
=== "Open Edge"
### Open Edge Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```
=== "Open Datacenter"
### Closed Datacenter Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```
=== "Docker run"
### Docker run
=== "Closed"
### Closed Submission
```bash
cm docker script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

=== "Open"
### Open Submission
```bash
cm docker script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

=== "Native run"
### Native run
=== "Closed"
### Closed Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

=== "Open"
### Open Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

* Use `--hw_name="My system name"` to give a meaningful system name. Examples can be seen [here](https://github.com/mlcommons/inference_results_v3.0/tree/main/open/cTuning/systems)

@@ -137,6 +143,10 @@ Once all the results across all the models are ready you can use the following c

* Use the `--results_dir` option to specify the results folder for non-CM-based benchmarks

* Use the `--category` option to specify the category for which the submission is generated (datacenter/edge). By default, the category is taken from the `system_meta.json` file located in the SUT root directory.

* Use `--submission_base_dir` to specify the directory where the outputs of the preprocess-submission script and the final submission are written. There is no need to provide `--submission_dir` along with this. For `docker run`, use `--submission_base_dir` instead of `--submission_dir`. A combined example is shown below.
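For illustration, several of these options can be combined in a single invocation; the system name and local paths below are placeholders.

```bash
# Hypothetical combined invocation; hw_name, results_dir and submission_base_dir values are placeholders.
cm run script --tags=generate,inference,submission \
   --clean \
   --preprocess_submission=yes \
   --run-checker \
   --submitter=MLCommons \
   --tar=yes \
   --env.CM_TAR_OUTFILE=submission.tar.gz \
   --division=open \
   --category=edge \
   --hw_name="My system name" \
   --results_dir=$HOME/mlperf_results \
   --submission_base_dir=$HOME/mlperf_submission \
   --quiet
```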

The above command should generate `submission.tar.gz` if there are no submission-checker issues, and you can upload it to the [MLCommons Submission UI](https://submissions-ui.mlcommons.org/submission).
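Before uploading, the archive contents can be spot-checked with a standard tar listing, for example:

```bash
# List the first entries of the generated archive as a quick sanity check.
tar -tzf submission.tar.gz | head
```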

## Aggregate Results in GitHub
50 changes: 50 additions & 0 deletions docs/system_requirements.yml
@@ -0,0 +1,50 @@
# All memory requirements in GB
resnet:
reference:
fp32:
system_memory: 8
accelerator_memory: 4
disk_storage: 25
nvidia:
int8:
system_memory: 8
accelerator_memory: 4
disk_storage: 100
intel:
int8:
system_memory: 8
accelerator_memory: 0
disk_storage: 50
qualcomm:
int8:
system_memory: 8
accelerator_memory: 8
disk_storage: 50
retinanet:
reference:
fp32:
system_memory: 8
accelerator_memory: 8
disk_storage: 200
nvidia:
int8:
system_memory: 8
accelerator_memory: 8
disk_storage: 200
intel:
int8:
system_memory: 8
accelerator_memory: 0
disk_storage: 200
qualcomm:
int8:
system_memory: 8
accelerator_memory: 8
disk_storage: 200
rgat:
reference:
fp32:
system_memory: 768
accelerator_memory: 8
disk_storage: 2300
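As a rough illustration of how these numbers can be used, the sketch below checks a Linux host against the `rgat` reference fp32 entry above; the thresholds are copied from the YAML, while the checking commands themselves are an assumption about how one might consume this file.

```bash
# Rough pre-flight check against the rgat reference fp32 requirements above
# (768 GB system memory, 2300 GB disk). Illustrative only.
req_mem_gb=768
req_disk_gb=2300
have_mem_gb=$(free -g | awk '/^Mem:/{print $2}')
have_disk_gb=$(df -BG --output=avail . | tail -1 | tr -dc '0-9')
[ "$have_mem_gb" -ge "$req_mem_gb" ] || echo "Warning: only ${have_mem_gb} GB system memory detected (need ${req_mem_gb} GB)"
[ "$have_disk_gb" -ge "$req_disk_gb" ] || echo "Warning: only ${have_disk_gb} GB free disk detected (need ${req_disk_gb} GB)"
```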

9 changes: 7 additions & 2 deletions main.py
Expand Up @@ -239,7 +239,8 @@ def mlperf_inference_implementation_readme(

common_info = get_common_info(
spaces + 16,
implementation
implementation,
model.lower()
)

if (
@@ -488,15 +489,19 @@ def get_venv_command(spaces):

# contains run command information which is common to both docker and
# native runs
def get_common_info(spaces, implementation):
def get_common_info(spaces, implementation, model):
info = ""
pre_space = ""
for i in range(1, spaces):
pre_space = pre_space + " "
pre_space += " "
# pre_space = " "
info += f"\n{pre_space}!!! tip\n\n"
info += f"{pre_space} - Number of threads could be adjusted using `--threads=#`, where `#` is the desired number of threads. This option works only if the implementation in use supports threading.\n\n"
info += f"{pre_space} - Batch size could be adjusted using `--batch_size=#`, where `#` is the desired batch size. This option works only if the implementation in use is supporting the given batch size.\n\n"
if model == "rgat":
info += f"{pre_space} - Add `--env.CM_DATASET_IGBH_PATH=<Path to IGBH dataset>` if you have already downloaded the dataset. The path will be automatically mounted when using docker run.\n\n"
info += f"{pre_space} - Add `--env.CM_ML_MODEL_RGAT_CHECKPOINT_PATH=<Path to R-GAT model checkpoint>` if you have already downloaded the model. The path will be automatically mounted when using docker run.\n\n"
if implementation.lower() == "reference":
info += f"{pre_space} - Add `--adr.mlperf-implementation.tags=_branch.master,_repo.<CUSTOM_INFERENCE_REPO_LINK>` if you are modifying the official MLPerf Inference implementation in a custom fork.\n\n"
info += f"{pre_space} - Add `--adr.inference-src.tags=_repo.<CUSTOM_INFERENCE_REPO_LINK>` if you are modifying the model config accuracy script in the submission checker within a custom fork.\n\n"
2 changes: 2 additions & 0 deletions mkdocs.yml
Expand Up @@ -42,6 +42,8 @@ nav:
- MIXTRAL-8x7B: benchmarks/language/mixtral-8x7b.md
- Recommendation:
- DLRM-v2: benchmarks/recommendation/dlrm-v2.md
- Graph Neural Networks:
- R-GAT: benchmarks/graph/rgat.md
- Install CM:
- install/index.md
- Submission:

