add Iluvatar retinanet case. (#173)
* add iluvatar retinanet case

* update README

* update iluvatar retinanet config and README

---------

Co-authored-by: uuup <55571217+upvenly@users.noreply.github.com>
forestlee95 and upvenly authored Aug 1, 2023
1 parent d5c3a3a commit edd64f1
Showing 8 changed files with 54 additions and 3 deletions.
2 changes: 1 addition & 1 deletion training/benchmarks/retinanet/README.md
@@ -61,6 +61,6 @@ torchvision.models.resnet.__dict__['model_urls'][
 | ---------- | ------- |
 | Nvidia GPU ||
 | Kunlunxin XPU | N/A |
-| Iluvatar | N/A |
+| Iluvatar | |


4 changes: 3 additions & 1 deletion training/benchmarks/retinanet/pytorch/config/_base.py
@@ -67,5 +67,7 @@
 sync_bn: bool = False
 gradient_accumulation_steps: int = 1
 
+cudnn_benchmark: bool = True
+cudnn_deterministic: bool = False
 
-pretrained_path = "resnet50-0676ba61.pth"
+pretrained_path = "resnet50-0676ba61.pth"
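
The two new `cudnn_*` fields look like they are meant to drive PyTorch's cuDNN switches, although this diff does not show where they are consumed. A minimal sketch, assuming they are copied onto `torch.backends.cudnn`; the helper `apply_cudnn_flags` and the stand-in objects are hypothetical and exist only so the snippet runs without PyTorch installed:

```python
from types import SimpleNamespace


def apply_cudnn_flags(config, cudnn_backend):
    """Copy the two config fields onto a cuDNN backend object.

    With real PyTorch, pass `torch.backends.cudnn` as `cudnn_backend`;
    it exposes settable `benchmark` and `deterministic` attributes.
    """
    cudnn_backend.benchmark = config.cudnn_benchmark
    cudnn_backend.deterministic = config.cudnn_deterministic
    return cudnn_backend


# Values taken from the _base.py hunk above.
config = SimpleNamespace(cudnn_benchmark=True, cudnn_deterministic=False)

# Hypothetical stand-in for torch.backends.cudnn.
fake_cudnn = SimpleNamespace(benchmark=False, deterministic=True)
apply_cudnn_flags(config, fake_cudnn)
```

With these defaults, autotuning of convolution algorithms is enabled and deterministic algorithm selection is not enforced, which generally favors throughput over bit-exact reproducibility.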
@@ -4,5 +4,6 @@
 'do_train', 'fp16', 'distributed', 'warmup', 'dist_backend', 'num_workers',
 'device',
 'cudnn_benchmark',
-'cudnn_deterministic'
+'cudnn_deterministic',
+'local_rank'
 ]
35 changes: 35 additions & 0 deletions training/iluvatar/retinanet-pytorch/README.md
@@ -0,0 +1,35 @@
### Model backbone weights download
[Model backbone weights download](https://download.pytorch.org/models/resnet50-0676ba61.pth)

This path is set in FlagPerf/training/benchmarks/retinanet/pytorch/model/\_\_init__.py:

```python
torchvision.models.resnet.__dict__['model_urls'][
'resnet50'] = 'https://download.pytorch.org/models/resnet50-0676ba61.pth'
```
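By default, this case automatically downloads the backbone weights from the same official path (0676ba61). To specify the weights manually, download the file to a path that is mounted inside the container and change the URL here to "file://"+download_path.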
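
The manual override described above can be sketched as follows. This is only an illustration: `model_urls` is a plain dictionary standing in for `torchvision.models.resnet.model_urls` so that the snippet runs without torchvision, and both the helper `local_weights_url` and the mount path `/mnt/weights` are hypothetical.

```python
import os

# Stand-in for torchvision.models.resnet.model_urls (assumption: the real
# dictionary is patched the same way in model/__init__.py).
model_urls = {
    "resnet50": "https://download.pytorch.org/models/resnet50-0676ba61.pth",
}


def local_weights_url(download_path: str) -> str:
    """Build the "file://" + download_path URL for a locally stored file."""
    # Use an absolute path so the URL does not depend on the working directory.
    return "file://" + os.path.abspath(download_path)


# Hypothetical location of the weights inside the container.
model_urls["resnet50"] = local_weights_url("/mnt/weights/resnet50-0676ba61.pth")
print(model_urls["resnet50"])
```

The file name is kept identical to the official one so that the hash suffix (0676ba61) still identifies which weights are in use.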

### Test dataset download

[Test dataset download](https://cocodataset.org/)

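### Iluvatar BI-V100 GPU configuration and run information reference
#### Environment configuration
- ##### Hardware environment
- Machine and accelerator model: Iluvatar BI-V100 32GB

- ##### Software environment
- OS version: Ubuntu 20.04
- OS kernel version: 4.15.0-156-generic x86_64
- Accelerator driver version: 3.1.0
- Docker version: 20.10.8
- Training framework version: torch-1.13.1+corex.3.1.0
- Dependency software versions: none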


### Run results
| Training resources | Config file | Run time (s) | Target accuracy | Converged accuracy | Steps | Performance (samples/s) |
| -------- | --------------- | ----------- | -------- | -------- | ------- | ---------------- |
| 1 node, 8 cards | config_BI-V100x1x8 | | 0.35 | 0.348 | | |

Source of the target accuracy: [torchvision.models — Torchvision 0.8.1 documentation (pytorch.org)](https://pytorch.org/vision/0.8/models.html?highlight=faster#torchvision.models.detection.retinanet_resnet50_fpn)
@@ -0,0 +1,4 @@
vendor: str = "iluvatar"
train_batch_size = 16
eval_batch_size = 16
lr = 0.04
@@ -0,0 +1,6 @@
# =================================================
# Export variables
# =================================================


export OMP_NUM_THREADS=1
3 changes: 3 additions & 0 deletions training/iluvatar/retinanet-pytorch/config/requirements.txt
@@ -0,0 +1,3 @@
opencv-python>=4.1.1
opencv-python-headless==4.1.2.30
pycocotools>=2.0
Empty file.
