Dev support char and short dtype #10086

Merged: 13 commits merged into master from dev_support_char_and_short_dtype on Apr 9, 2023

Conversation

BBuf (Contributor) commented Apr 8, 2023

No description provided.

marigoold (Contributor) commented

Does this compile? I remember there is a clip kernel that fails to build because int16_t has no specialization.

BBuf (Contributor, Author) commented Apr 8, 2023, replying to the question above

Yes, it compiles.
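For readers outside the thread: the concern is that enabling a new dtype forces every kernel it reaches to have a matching template specialization. Below is a minimal stand-alone sketch of that failure mode; ClipFunctor and Apply are hypothetical names for illustration, not OneFlow's actual clip kernel.

#include <cstdint>

// Hypothetical per-dtype functor; the real clip kernel is organized
// differently, this only illustrates the "no specialization" compile error.
template<typename T>
struct ClipFunctor;  // primary template intentionally left undefined

template<>
struct ClipFunctor<int32_t> {
  static int32_t Apply(int32_t x, int32_t lo, int32_t hi) {
    return x < lo ? lo : (x > hi ? hi : x);
  }
};

int main() {
  int32_t ok = ClipFunctor<int32_t>::Apply(7, 0, 5);  // fine: specialization exists
  // int16_t bad = ClipFunctor<int16_t>::Apply(7, 0, 5);
  // ^ compile error ("incomplete type") until an int16_t specialization is added,
  //   which is the situation described above for the clip kernel.
  return ok == 5 ? 0 : 1;
}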

github-actions bot commented Apr 8, 2023

Code was formatted by CI. Please request CI again if you still want this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@@ -32,6 +32,7 @@ limitations under the License.
OF_PP_MAKE_TUPLE_SEQ(int32_t, DataType::kInt32) \
OF_PP_MAKE_TUPLE_SEQ(int64_t, DataType::kInt64)

#define INT16_DATA_TYPE_SEQ OF_PP_MAKE_TUPLE_SEQ(int16_t, DataType::kInt16)
A reviewer (Contributor) commented on the added line

Why not add this to SIGNED_INT_DATA_TYPE_SEQ?

BBuf (Contributor, Author) replied

This needs to be tidied up later; the macros are already a mess. If int16_t were added to SIGNED_INT_DATA_TYPE_SEQ, macro expansion would fail for some kernels, so it is handled separately for now.
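To illustrate the trade-off being discussed: a data-type sequence macro expands one registration or instantiation macro per (type, enum) pair, so every kernel expanded over a sequence must support every type in it. The sketch below uses a simplified X-macro pattern with made-up names (MY_SIGNED_INT_DATA_TYPE_SEQ, MY_INT16_DATA_TYPE_SEQ, AddOne); it is not OneFlow's OF_PP_* machinery, only the general idea of why int16_t is kept in its own sequence for now: kernels without int16_t support keep expanding only the original sequence.

#include <cstdint>

// Simplified stand-ins for the (cpp_type, DataType) tuple sequences in the diff.
#define MY_SIGNED_INT_DATA_TYPE_SEQ(X) \
  X(int32_t, kInt32)                   \
  X(int64_t, kInt64)

#define MY_INT16_DATA_TYPE_SEQ(X) \
  X(int16_t, kInt16)

enum MyDataType { kInt16, kInt32, kInt64 };  // placeholder for DataType

template<typename T>
T AddOne(T x) { return static_cast<T>(x + 1); }  // stand-in "kernel"

// One explicit instantiation per sequence entry. A kernel that cannot handle
// int16_t expands only MY_SIGNED_INT_DATA_TYPE_SEQ; kernels that do support it
// additionally expand MY_INT16_DATA_TYPE_SEQ.
#define INSTANTIATE_ADD_ONE(cpp_type, enum_name) \
  template cpp_type AddOne<cpp_type>(cpp_type);
MY_SIGNED_INT_DATA_TYPE_SEQ(INSTANTIATE_ADD_ONE)
MY_INT16_DATA_TYPE_SEQ(INSTANTIATE_ADD_ONE)
#undef INSTANTIATE_ADD_ONE

int main() { return AddOne<int16_t>(41) == 42 ? 0 : 1; }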

@BBuf BBuf requested review from oneflow-ci-bot and removed request for oneflow-ci-bot April 8, 2023 14:51
github-actions bot commented Apr 8, 2023

Code was formatted by CI. Please request CI again if you still want this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

@BBuf BBuf requested review from oneflow-ci-bot and removed request for oneflow-ci-bot April 8, 2023 14:55
github-actions bot commented Apr 8, 2023

CI failed when running job: Build llvm15. PR label automerge has been removed

@github-actions github-actions bot removed the automerge label Apr 8, 2023
@BBuf BBuf added the automerge label Apr 9, 2023
@BBuf BBuf requested a review from oneflow-ci-bot April 9, 2023 00:36
@github-actions github-actions bot removed the automerge label Apr 9, 2023
github-actions bot commented Apr 9, 2023

CI failed when running job: cpu-module. PR label automerge has been removed

github-actions bot commented Apr 9, 2023

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.2ms (= 14121.8ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 145.8ms (= 14581.4ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.03 (= 145.8ms / 141.2ms)

OneFlow resnet50 time: 81.4ms (= 8141.7ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 93.0ms (= 9302.7ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.14 (= 93.0ms / 81.4ms)

OneFlow resnet50 time: 51.1ms (= 10228.1ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 72.0ms (= 14397.9ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.41 (= 72.0ms / 51.1ms)

OneFlow resnet50 time: 33.9ms (= 6777.4ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 55.2ms (= 11030.5ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.63 (= 55.2ms / 33.9ms)

OneFlow resnet50 time: 26.3ms (= 5256.1ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 64.8ms (= 12952.2ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 2.46 (= 64.8ms / 26.3ms)

OneFlow swin dataloader time: 0.240s (= 47.980s / 200, num_workers=1)
PyTorch swin dataloader time: 0.149s (= 29.709s / 200, num_workers=1)
Relative speed: 0.619 (= 0.149s / 0.240s)

OneFlow swin dataloader time: 0.069s (= 13.767s / 200, num_workers=4)
PyTorch swin dataloader time: 0.040s (= 8.073s / 200, num_workers=4)
Relative speed: 0.586 (= 0.040s / 0.069s)

OneFlow swin dataloader time: 0.039s (= 7.789s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.346s / 200, num_workers=8)
Relative speed: 0.558 (= 0.022s / 0.039s)

❌ OneFlow resnet50 time: 152.6ms (= 15261.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 166.2ms (= 16616.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.09 (= 166.2ms / 152.6ms)

OneFlow resnet50 time: 92.7ms (= 9265.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.7ms (= 10368.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 103.7ms / 92.7ms)

OneFlow resnet50 time: 61.4ms (= 12277.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.9ms (= 15773.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.28 (= 78.9ms / 61.4ms)

OneFlow resnet50 time: 42.5ms (= 8496.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.5ms (= 14491.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.71 (= 72.5ms / 42.5ms)

OneFlow resnet50 time: 38.0ms (= 7602.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.5ms (= 13303.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.75 (= 66.5ms / 38.0ms)

github-actions bot commented Apr 9, 2023

CI failed when running job: cuda-module. PR label automerge has been removed

@BBuf BBuf added the automerge label Apr 9, 2023
github-actions bot commented Apr 9, 2023

Speed stats:

github-actions bot commented Apr 9, 2023

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.3ms (= 14132.2ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 153.8ms (= 15380.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.09 (= 153.8ms / 141.3ms)

OneFlow resnet50 time: 83.0ms (= 8303.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 94.0ms (= 9404.3ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.13 (= 94.0ms / 83.0ms)

OneFlow resnet50 time: 51.7ms (= 10339.8ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 71.8ms (= 14363.9ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.39 (= 71.8ms / 51.7ms)

OneFlow resnet50 time: 34.0ms (= 6805.4ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 67.3ms (= 13450.8ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.98 (= 67.3ms / 34.0ms)

OneFlow resnet50 time: 26.5ms (= 5300.0ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 64.3ms (= 12857.2ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 2.43 (= 64.3ms / 26.5ms)

OneFlow swin dataloader time: 0.237s (= 47.305s / 200, num_workers=1)
PyTorch swin dataloader time: 0.151s (= 30.226s / 200, num_workers=1)
Relative speed: 0.639 (= 0.151s / 0.237s)

OneFlow swin dataloader time: 0.070s (= 13.927s / 200, num_workers=4)
PyTorch swin dataloader time: 0.041s (= 8.132s / 200, num_workers=4)
Relative speed: 0.584 (= 0.041s / 0.070s)

OneFlow swin dataloader time: 0.040s (= 8.059s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.341s / 200, num_workers=8)
Relative speed: 0.539 (= 0.022s / 0.040s)

❌ OneFlow resnet50 time: 153.3ms (= 15333.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 165.2ms (= 16519.3ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.08 (= 165.2ms / 153.3ms)

OneFlow resnet50 time: 93.3ms (= 9333.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.5ms (= 10446.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 104.5ms / 93.3ms)

OneFlow resnet50 time: 62.0ms (= 12393.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 80.4ms (= 16086.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 80.4ms / 62.0ms)

OneFlow resnet50 time: 43.5ms (= 8696.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.1ms (= 14216.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.63 (= 71.1ms / 43.5ms)

OneFlow resnet50 time: 37.2ms (= 7430.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 62.1ms (= 12420.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.67 (= 62.1ms / 37.2ms)

github-actions bot commented Apr 9, 2023

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/10086/

github-actions bot commented Apr 9, 2023

CI failed when running job: cuda-misc. PR label automerge has been removed

@github-actions github-actions bot removed the automerge label Apr 9, 2023
@BBuf BBuf added the automerge label Apr 9, 2023
@BBuf BBuf requested review from oneflow-ci-bot and removed request for oneflow-ci-bot April 9, 2023 08:40
github-actions bot commented Apr 9, 2023

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.3ms (= 14134.8ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 146.7ms (= 14673.0ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.04 (= 146.7ms / 141.3ms)

OneFlow resnet50 time: 82.5ms (= 8248.8ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 94.3ms (= 9426.4ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.14 (= 94.3ms / 82.5ms)

OneFlow resnet50 time: 51.5ms (= 10296.3ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 70.3ms (= 14053.5ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.36 (= 70.3ms / 51.5ms)

OneFlow resnet50 time: 33.9ms (= 6771.5ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 57.8ms (= 11560.2ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.71 (= 57.8ms / 33.9ms)

OneFlow resnet50 time: 26.0ms (= 5202.4ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 65.3ms (= 13069.3ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 2.51 (= 65.3ms / 26.0ms)

OneFlow swin dataloader time: 0.237s (= 47.436s / 200, num_workers=1)
PyTorch swin dataloader time: 0.149s (= 29.835s / 200, num_workers=1)
Relative speed: 0.629 (= 0.149s / 0.237s)

OneFlow swin dataloader time: 0.072s (= 14.314s / 200, num_workers=4)
PyTorch swin dataloader time: 0.041s (= 8.171s / 200, num_workers=4)
Relative speed: 0.571 (= 0.041s / 0.072s)

OneFlow swin dataloader time: 0.040s (= 8.001s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.444s / 200, num_workers=8)
Relative speed: 0.555 (= 0.022s / 0.040s)

❌ OneFlow resnet50 time: 153.1ms (= 15312.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 163.9ms (= 16390.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.07 (= 163.9ms / 153.1ms)

OneFlow resnet50 time: 93.3ms (= 9333.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 114.5ms (= 11450.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.23 (= 114.5ms / 93.3ms)

OneFlow resnet50 time: 61.6ms (= 12317.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.2ms (= 15848.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.29 (= 79.2ms / 61.6ms)

OneFlow resnet50 time: 43.2ms (= 8632.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.3ms (= 14067.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.63 (= 70.3ms / 43.2ms)

OneFlow resnet50 time: 35.9ms (= 7170.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.5ms (= 13302.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.86 (= 66.5ms / 35.9ms)

github-actions bot commented Apr 9, 2023

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/10086/

github-actions bot commented Apr 9, 2023

CI failed when running job: cpu-module. PR label automerge has been removed

@github-actions github-actions bot removed the automerge label Apr 9, 2023
@BBuf BBuf requested review from oneflow-ci-bot and removed request for oneflow-ci-bot April 9, 2023 10:08
@BBuf BBuf added the automerge label Apr 9, 2023
github-actions bot commented Apr 9, 2023

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/10086/

github-actions bot commented Apr 9, 2023

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 141.3ms (= 14125.0ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 153.3ms (= 15331.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.09 (= 153.3ms / 141.3ms)

OneFlow resnet50 time: 82.2ms (= 8224.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 94.0ms (= 9403.0ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.14 (= 94.0ms / 82.2ms)

OneFlow resnet50 time: 52.3ms (= 10457.4ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 70.9ms (= 14170.2ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.36 (= 70.9ms / 52.3ms)

OneFlow resnet50 time: 34.8ms (= 6952.0ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 60.0ms (= 12007.5ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.73 (= 60.0ms / 34.8ms)

OneFlow resnet50 time: 26.2ms (= 5244.7ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 61.4ms (= 12272.8ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 2.34 (= 61.4ms / 26.2ms)

OneFlow swin dataloader time: 0.243s (= 48.643s / 200, num_workers=1)
PyTorch swin dataloader time: 0.150s (= 29.914s / 200, num_workers=1)
Relative speed: 0.615 (= 0.150s / 0.243s)

OneFlow swin dataloader time: 0.068s (= 13.619s / 200, num_workers=4)
PyTorch swin dataloader time: 0.040s (= 8.025s / 200, num_workers=4)
Relative speed: 0.589 (= 0.040s / 0.068s)

OneFlow swin dataloader time: 0.043s (= 8.629s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.462s / 200, num_workers=8)
Relative speed: 0.517 (= 0.022s / 0.043s)

❌ OneFlow resnet50 time: 152.9ms (= 15290.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 163.9ms (= 16389.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.07 (= 163.9ms / 152.9ms)

OneFlow resnet50 time: 93.4ms (= 9337.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 106.7ms (= 10673.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.14 (= 106.7ms / 93.4ms)

OneFlow resnet50 time: 61.7ms (= 12339.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.8ms (= 15968.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.29 (= 79.8ms / 61.7ms)

OneFlow resnet50 time: 43.5ms (= 8691.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.0ms (= 13994.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.61 (= 70.0ms / 43.5ms)

OneFlow resnet50 time: 36.4ms (= 7278.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 74.5ms (= 14898.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 2.05 (= 74.5ms / 36.4ms)

@mergify mergify bot merged commit c19d148 into master Apr 9, 2023
@mergify mergify bot deleted the dev_support_char_and_short_dtype branch April 9, 2023 11:08