update doc for modifying checkpoint and performance principles #72

upvenly · 2023-05-17T01:14:01Z

No description provided.

shh2000 · 2023-05-17T01:50:50Z

docs/dev/specifications/standard-case-spec.md

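-  9. Test 1x1 and 2x8
-  10. Add the case documentation and the model documentation
-  11. Submit the PR in line with the PR submission guidelines
+  2. Train from scratch, save the ckpt (optional), and verify that the original repository reaches the target accuracy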


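Should we state explicitly that, for models with a backbone, only the non-backbone part needs to be trained from scratch?
# I'm not sure whether we have a standard for large models. Is the whole training run a finetune on a small, widely used dataset?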

For models with a backbone, the backbone can use pretrained weights.
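
To make the rule above concrete, here is a minimal sketch of separating a checkpoint into the part that may come from pretrained weights (the backbone) and the part that is trained from scratch. The helper name `split_pretrained_weights`, the `backbone.` key prefix, and the sample keys are all hypothetical; plain lists stand in for tensors so the sketch runs without any framework.

```python
# Hypothetical helper (not part of the repository): keep only the backbone
# entries of a flat checkpoint mapping and report which keys were left out.
def split_pretrained_weights(state_dict, backbone_prefix="backbone."):
    """Return (backbone_weights, skipped_keys) for a flat name -> weight mapping."""
    backbone = {
        name: value
        for name, value in state_dict.items()
        if name.startswith(backbone_prefix)
    }
    skipped = sorted(
        name for name in state_dict if not name.startswith(backbone_prefix)
    )
    return backbone, skipped


# Assumed example checkpoint; the key names are illustrative only.
checkpoint = {
    "backbone.conv1.weight": [0.1, 0.2],
    "backbone.bn1.bias": [0.0],
    "head.fc.weight": [0.5],
}

backbone_weights, trained_from_scratch = split_pretrained_weights(checkpoint)
print(sorted(backbone_weights))   # keys that may be loaded from pretrained weights
print(trained_from_scratch)       # keys that keep their fresh initialization
```

With PyTorch, the filtered mapping could then be passed to `load_state_dict(..., strict=False)` so that the non-backbone parameters keep their initial values. Whether a given case names its backbone parameters this way is an assumption to check per model.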

yuzhou03 · 2023-05-18T10:02:34Z

docs/dev/specifications/case-adatpion-spec.md

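 ## 1. Code and configuration directory structure for vendor-adapted cases

 Vendors adapt the standard case to their hardware according to these principles:
 1. Use the standard case implementation by default and leave the upper-level interfaces unchanged. Ideally, only the low-level operators differ, and the difference is invisible to users.
 2. Reasonable optimizations that improve model performance, such as adjusting the batch size (bs), are accepted. Low-level optimizations are recommended; optimizations at the torch interface layer are not accepted for now. Details can be discussed case by case.
 3. For operators in the standard case that the vendor does not support, a reasonable substitute may be used; details can be discussed.

-The standard case implementation lives under training/benchmarks/&lt;model&gt;/&lt;framework&gt;/. Vendors can adapt it to their own chips by extending the interfaces of the model implementation. The code goes under training/&lt;vendor&gt;/ and mainly consists of the following parts (taking Nvidia, glm, and pytorch as an example):
+The standard case implementation lives under training/benchmarks/&lt;model&gt;/&lt;framework&gt;/. Vendors can adapt it to their own chips by extending the interfaces of the model implementation. The code modified by the vendor goes under training/<vendor>/glm-pytorch and mainly consists of the following parts (taking kunlunxin, glm, and pytorch as an example):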


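One more small Markdown formatting issue: `<` and `>` need to be escaped. Option 1: escape them with a backslash (`\<` and `\>`), which is how the homepage README handles it. Option 2: use the HTML entities `&lt;` and `&gt;`. Option 1 is recommended because it is simpler.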
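
The two escaping options from this comment can be sketched as a small helper. The function name `escape_angle_brackets` and its `style` parameter are hypothetical, and the backslash form assumes the renderer follows the usual Markdown backslash-escape rule.

```python
# Hypothetical helper illustrating both ways to escape angle brackets
# so that placeholders such as <vendor> survive Markdown rendering.
def escape_angle_brackets(text, style="backslash"):
    """Escape '<' and '>' using backslashes or HTML entities."""
    if style == "backslash":
        return text.replace("<", "\\<").replace(">", "\\>")
    if style == "entity":
        return text.replace("<", "&lt;").replace(">", "&gt;")
    raise ValueError(f"unknown style: {style!r}")


path = "training/<vendor>/glm-pytorch"
print(escape_angle_brackets(path))                  # backslash form
print(escape_angle_brackets(path, style="entity"))  # HTML entity form
```

Both forms render as literal angle brackets; the backslash form keeps the source closer to the text a reader sees.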

upvenly added 2 commits May 16, 2023 18:26

update doc

617d098

update doc

e98aa69

upvenly changed the title ~~update doc~~ update doc for modifying checkpoint and performance principles May 17, 2023

upvenly requested review from shh2000, Ox7c000000 and yuzhou03 May 17, 2023 01:16

shh2000 reviewed May 17, 2023

yuzhou03 reviewed May 17, 2023

docs/dev/specifications/standard-case-spec.md (outdated)

yuzhou03 reviewed May 17, 2023

docs/dev/specifications/standard-case-spec.md (outdated)

upvenly added 2 commits May 17, 2023 18:46

update doc

2ecb67d

update

f911822

yuzhou03 reviewed May 18, 2023

update

355c4af

yuzhou03 approved these changes May 18, 2023

yuzhou03 assigned shh2000, upvenly and Ox7c000000 May 19, 2023

shh2000 approved these changes May 19, 2023

yuzhou03 unassigned shh2000 May 24, 2023

upvenly merged commit 09fe46b into FlagOpen:main May 30, 2023

upvenly deleted the wwl/update_doc1 branch August 9, 2023 03:02
