add efficientnet for iluvatar #170

Merged (2 commits), Aug 1, 2023
10 changes: 6 additions & 4 deletions training/iluvatar/docker_image/pytorch/packages/README.md
100644 → 100755
@@ -1,9 +1,11 @@
# The following packages must be obtained by contacting Iluvatar (天数智芯)

- >Contact email: rui.yan@iluvatar.com
+ >Contact email: contact-us@iluvatar.com

- apex-0.1+corex.3.0.0-cp38-cp38-linux_x86_64.whl
+ apex-0.1+corex.3.1.0-cp38-cp38-linux_x86_64.whl

- torch-1.10.2+corex.3.0.0-cp38-cp38-linux_x86_64.whl
+ torch-1.13.1+corex.3.1.0-cp38-cp38-linux_x86_64.whl

- torchtext-0.11.2+corex.3.0.0-cp38-cp38-linux_x86_64.whl
+ torchtext-0.14.1+corex.3.1.0-cp38-cp38-linux_x86_64.whl

+ torchvision-0.14.1+corex.3.1.0-cp38-cp38-linux_x86_64.whl
4 changes: 2 additions & 2 deletions training/iluvatar/docker_image/pytorch/sdk_installers/README.md
100644 → 100755
@@ -1,7 +1,7 @@
# The following packages must be obtained by contacting Iluvatar (天数智芯)

- >Contact email: rui.yan@iluvatar.com
+ >Contact email: contact-us@iluvatar.com

- corex-installer-linux64-3.0.0_x86_64_10.2.run
+ corex-installer-linux64-3.1.0_x86_64_10.2.run

cuda_10.2.89_440.33.01_linux.run
26 changes: 26 additions & 0 deletions training/iluvatar/efficientnet-pytorch/README.md
@@ -0,0 +1,26 @@
### Test dataset download
[Test dataset download](../../benchmarks/efficientnet/README.md#数据集)

### Iluvatar BI-V100 GPU configuration and run information (for reference)
#### Environment configuration
- ##### Hardware environment
    - Machine and accelerator model: Iluvatar BI-V100 32GB

- ##### Software environment
    - OS version: Ubuntu 20.04
    - OS kernel version: 4.15.0-156-generic x86_64
    - Accelerator driver version: 3.1.0
    - Docker version: 20.10.21
    - Training framework version: torch-1.13.1+corex.3.1.0
    - Dependency versions: none


### Run results
| Training resources | Config file | Run time (s) | Target accuracy | Converged accuracy | Steps | Throughput (samples/s) |
| -------- | --------------- | ----------- | -------- | -------- | ------- | ---------------- |
| 1 node, 1 card | config_BI-V100x1x1 | | | | | |
| 1 node, 2 cards | config_BI-V100x1x2 | | | | | |
| 1 node, 4 cards | config_BI-V100x1x4 | | | | | |
| 1 node, 8 cards | config_BI-V100x1x8 | | 82.672 | 82.528 | 1000800 | |
| 2 nodes, 8 cards | config_BI-V100x2x8 | | | | | |
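
The step count in the 8-card row can be sanity-checked with a little arithmetic. The sketch below assumes several things the README does not state: that the dataset is the ImageNet-1k training set (1,281,167 images), that `train_batch_size = 96` from this PR is a per-device batch size, that all 8 cards are used, and that the incomplete final batch of each epoch is dropped. The helper name `steps_per_epoch` is made up for illustration and is not part of the benchmark code.

```python
# Assumptions (not stated in the README): ImageNet-1k training-set size,
# per-device batch size of 96, 8 devices, incomplete last batch dropped.
NUM_TRAIN_IMAGES = 1_281_167
PER_DEVICE_BATCH = 96
NUM_DEVICES = 8
EPOCHS = 600  # value of `epochs` in config_common.py


def steps_per_epoch(num_images: int, per_device_batch: int, num_devices: int) -> int:
    """Optimizer steps per epoch when every device processes one batch per step."""
    global_batch = per_device_batch * num_devices
    return num_images // global_batch  # floor: the partial final batch is dropped


per_epoch = steps_per_epoch(NUM_TRAIN_IMAGES, PER_DEVICE_BATCH, NUM_DEVICES)
total_steps = per_epoch * EPOCHS
print(per_epoch, total_steps)
```

Under these assumptions the total equals the 1000800 steps reported in the table. This is only a consistency check; the benchmark's own step accounting may differ.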

@@ -0,0 +1,4 @@
from config_common import *

train_batch_size = 96
eval_batch_size = 96
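
The per-topology config above relies on plain Python star-import semantics: every public name from `config_common` is pulled in, and later assignments add to or override those names. The sketch below shows only that language behaviour, using two throwaway modules written to a temporary directory. The module names `demo_config_common` and `demo_config_x1x8` are hypothetical, and how the benchmark harness actually loads config files is not shown in this PR.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for config_common.py and a per-topology config file.
COMMON_SRC = 'vendor = "iluvatar"\nepochs = 600\n'
SPECIFIC_SRC = (
    "from demo_config_common import *\n"
    "\n"
    "train_batch_size = 96\n"
    "eval_batch_size = 96\n"
)

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_config_common.py").write_text(COMMON_SRC)
    Path(tmp, "demo_config_x1x8.py").write_text(SPECIFIC_SRC)
    sys.path.insert(0, tmp)
    try:
        cfg = importlib.import_module("demo_config_x1x8")
    finally:
        sys.path.remove(tmp)

# Collect the public settings visible on the per-topology module.
settings = {k: v for k, v in vars(cfg).items() if not k.startswith("_")}
print(settings)
```

The resulting dictionary contains both the shared values and the batch sizes, which is why each topology file only needs to state what differs from the common settings.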
25 changes: 25 additions & 0 deletions training/iluvatar/efficientnet-pytorch/config/config_common.py
@@ -0,0 +1,25 @@
vendor = "iluvatar"
dist_backend = "nccl"

lr = 0.5
lr_scheduler = "cosineannealinglr"
lr_warmup_epochs = 5
lr_warmup_method = "linear"
auto_augment = "ta_wide"
random_erase = 0.1
label_smoothing = 0.1
mixup_alpha = 0.2
cutmix_alpha = 1.0
weight_decay = 0.00002
norm_weight_decay = 0.0
ra_sampler = True
ra_reps = 4
epochs = 600
num_workers = 8

# efficientnet_v2_s
TRAIN_SIZE = 300
train_crop_size = TRAIN_SIZE
EVAL_SIZE = 384
val_crop_size = EVAL_SIZE
val_resize_size = EVAL_SIZE
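
To make the learning-rate settings above concrete, here is a minimal, framework-free sketch of a linear warmup followed by cosine annealing using `lr = 0.5`, `lr_warmup_epochs = 5` and `epochs = 600`. It is written under assumptions that the config does not state: the schedule is stepped once per epoch, the warmup starts at 1% of the base rate (`warmup_start_factor`), and the cosine phase decays to zero (`min_lr`). The function name `lr_at_epoch` is invented for illustration; the training script's actual scheduler may behave differently.

```python
import math


def lr_at_epoch(
    epoch: int,
    base_lr: float = 0.5,               # `lr` in config_common.py
    epochs: int = 600,                  # `epochs` in config_common.py
    warmup_epochs: int = 5,             # `lr_warmup_epochs` in config_common.py
    warmup_start_factor: float = 0.01,  # assumption: not given in the config
    min_lr: float = 0.0,                # assumption: not given in the config
) -> float:
    """Learning rate at the start of `epoch` (0-based)."""
    if epoch < warmup_epochs:
        # Linear warmup: scale factor rises from warmup_start_factor to 1.0.
        factor = warmup_start_factor + (1.0 - warmup_start_factor) * epoch / warmup_epochs
        return base_lr * factor
    # Cosine annealing over the remaining epochs.
    t = epoch - warmup_epochs
    t_max = epochs - warmup_epochs
    return min_lr + (base_lr - min_lr) * 0.5 * (1.0 + math.cos(math.pi * t / t_max))


for e in (0, 5, 300, 600):
    print(e, lr_at_epoch(e))
```

With these assumed values the rate starts at 0.005, reaches the full 0.5 once warmup ends at epoch 5, and falls to zero by epoch 600.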
Empty file.