### 1. Inference Dataset

* Download URL: `https://huggingface.co/datasets/Stevross/mmlu/tree/main`
  1. Download data.tar from the repository
  2. Extract the .tar archive back into a directory
  3. Place the extracted data directory under config.data_dir/config.mmlu_dir (see the sketch below)

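The steps above can be scripted. The sketch below is one possible way to do so with `huggingface_hub` and `tarfile`; the local paths standing in for `config.data_dir`/`config.mmlu_dir` are placeholders, not values taken from this repository.

```python
# Hypothetical sketch of steps 1-3; adjust the paths to your own config.
import tarfile
from pathlib import Path

from huggingface_hub import hf_hub_download  # pip install huggingface_hub

data_dir = Path("/path/to/data_dir")   # stands in for config.data_dir
mmlu_dir = data_dir / "mmlu"           # stands in for config.mmlu_dir

# 1. Download data.tar from the dataset repository.
tar_path = hf_hub_download(
    repo_id="Stevross/mmlu",
    filename="data.tar",
    repo_type="dataset",
)

# 2. Extract the archive; it unpacks into a data/ directory.
mmlu_dir.mkdir(parents=True, exist_ok=True)
with tarfile.open(tar_path) as tar:
    tar.extractall(path=mmlu_dir)

# 3. The extracted data/ directory now sits under config.data_dir/config.mmlu_dir.
print(sorted(p.name for p in mmlu_dir.iterdir()))
```
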
### 2. Model and Weights

* Model implementation
  * pytorch: transformers.LlamaForCausalLM
* Weight loading
  * pytorch: LlamaForCausalLM.from_pretrained(config.data_dir/config.weight_dir)
* Obtaining the weights (a loading sketch follows this list)
  1. Fill out the request form to obtain the Llama 2 model weights from Meta AI and accept the license agreement
  2. Download the llama2-7b weights (note: not the chat variant)
  3. Convert the weights to the Hugging Face format with the convert.py script provided by Hugging Face, and save them under config.data_dir/config.weight_dir

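As a concrete illustration of the loading path named above, here is a minimal sketch. It assumes the converted weights already sit in a local directory playing the role of `config.data_dir/config.weight_dir`; the directory name and dtype choice are placeholders.

```python
# Hypothetical sketch of loading the converted Llama2-7B weights.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

weight_dir = "/path/to/data_dir/llama2_7b_hf"  # stands in for config.data_dir/config.weight_dir

tokenizer = LlamaTokenizer.from_pretrained(weight_dir)
model = LlamaForCausalLM.from_pretrained(
    weight_dir,
    torch_dtype=torch.float16,  # or torch.float32, matching the precision column in section 4
)
model.eval().cuda()
```
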
### 3. Hardware/Software Configuration and Run Information

#### 3.1 Nvidia A100

- ##### Hardware environment
  - Machine / accelerator model: NVIDIA_A100-SXM4-40GB
  - Inter-node network type and bandwidth: InfiniBand, 200 Gb/s

- ##### Software environment
  - OS version: Ubuntu 20.04
  - OS kernel version: 5.4.0-113-generic
  - Accelerator driver version: 470.129.06
  - Docker version: 20.10.16
  - Training framework version: pytorch-2.1.0a0+4136153
  - Dependency versions:
    - cuda: 12.1

- Inference toolkit
  - Inductor (torch._dynamo), pytorch-2.1.0a0+4136153 (usage sketch below)

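The Inductor entry above refers to PyTorch's `torch.compile` backend. The snippet below is only a generic usage sketch under that assumption, not the benchmark's own wrapper code; the weight path is a placeholder.

```python
# Hypothetical sketch: compile a Llama2-7B model with the Inductor backend.
import torch
from transformers import LlamaForCausalLM

weight_dir = "/path/to/data_dir/llama2_7b_hf"  # stands in for config.data_dir/config.weight_dir
model = LlamaForCausalLM.from_pretrained(weight_dir, torch_dtype=torch.float16).eval().cuda()

# torch.compile with backend="inductor" is the torch._dynamo/Inductor entry point.
compiled_model = torch.compile(model, backend="inductor")

with torch.no_grad():
    # Dummy input; the real benchmark feeds tokenized MMLU prompts.
    input_ids = torch.randint(0, 32000, (1, 512), device="cuda")
    logits = compiled_model(input_ids=input_ids).logits
```
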
- ##### Optimization strategy

  - None

- ##### Parallel strategy

  - None

### 4. Run Results (Llama2_7b_MMLU)

* Metric list (a sketch of how the throughput and MFU metrics are derived follows the table)

| Metric name | Metric key | Notes |
| ------------------ | ----------------- | ------------------------------------------------------------ |
| Data precision | precision | fp32 or fp16 |
| Device memory usage | mem | commonly called "GPU memory", in GiB |
| End-to-end time | e2e_time | total time, including Perf initialization etc. |
| Overall validation throughput | p_val_whole | number of validated sequences divided by total validation time |
| Validation compute throughput | p_val_core | excludes IO time |
| Overall inference throughput | p_infer_whole | number of inferred sequences divided by total inference time |
| **Inference compute throughput** | **\*p_infer_core** | excludes IO time |
| **Accelerator utilization** | **\*MFU** | model FLOPs utilization |
| Inference result | acc (inference/validation) | MMLU answer accuracy |

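For reference, the throughput and MFU entries can be derived from raw counts roughly as sketched below. The per-token FLOP estimate (2 × parameter count) and the A100 fp16 peak of 312 TFLOPS are common approximations rather than values taken from this repository, and all numeric inputs here are placeholders, not measurements from this run.

```python
# Hypothetical sketch of deriving p_val_whole and MFU from raw measurements.
num_val_sequences = 14042      # placeholder: number of validated sequences
val_seconds = 1.63             # placeholder: total validation wall-clock time
tokens_per_second = 9.0e3      # placeholder: measured token throughput

p_val_whole = num_val_sequences / val_seconds      # sequences per second

params = 7e9                   # Llama2-7B parameter count
flops_per_token = 2 * params   # rough forward-pass FLOPs per token
peak_flops_fp16 = 312e12       # A100 fp16 tensor-core peak, FLOPS

mfu = tokens_per_second * flops_per_token / peak_flops_fp16
print(f"p_val_whole = {p_val_whole:.1f} seq/s, MFU = {mfu:.1%}")
```
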
* Metric values

| Inference tool | precision | e2e_time | p_val_whole | p_val_core | p_infer_whole | \*p_infer_core | \*MFU | acc | mem |
| -------------- | --------- | -------- | ----------- | ---------- | ------------- | -------------- | ----- | ----------- | --------- |
| inductor | fp16 | 2558 | 8596.9 | 8630.3 | 9230.8 | 10052.2 | 45.1% | 45.8%/45.8% | 28.0/40.0 |
| inductor | fp32 | 4143 | 5455.3 | 5469.4 | 5675.7 | 5951.8 | 53.4% | 45.8%/45.8% | 35.0/40.0 |

# Package-level re-exports of the inference toolkit's public helpers.
from .dataloader import build_dataloader
from .model import create_model
from .export import export_model
from .evaluator import evaluator
from .forward import model_forward, engine_forward