[Tuning] rel-4.5 fdb update for develop #1196

Merged 2 commits on Oct 4, 2021

Conversation

Contributor

@cderb cderb commented Sep 29, 2021

No description provided.

@atamazov
Contributor

Is this from 4.5 tuning?

@junliume
Contributor

> Is this from 4.5 tuning?

I think so; the source branch name, cderb/fdb_tuning_dev-4.5, also confirms it.

@atamazov atamazov changed the title from "[Tuning] develop fdb update" to "[Tuning] rel-4.5 fdb update for develop" on Sep 29, 2021

@atamazov
Contributor

In performance testing right now...

Contributor

@atamazov atamazov left a comment

fdb duplicate IDs for gfx906/60

@atamazov
Contributor

atamazov commented Sep 30, 2021

Duplicate IDs for gfx906/60/HIP

It seems that the root cause of #1133 (comment) is not yet fixed.

Error messages

MIOpen(HIP): Error [ParseContents] Duplicate ID (ignored): miopenConvolutionBwdDataAlgoDirect; key: 256-28-28-1x1-64-56-56-8-0x0-2x2-1x1-0-NCHW-BF16-B
MIOpen(HIP): Error [ParseContents] Duplicate ID (ignored): miopenConvolutionBwdDataAlgoDirect; key: 64-56-56-3x3-64-56-56-8-1x1-1x1-1x1-0-NCHW-BF16-B
MIOpen(HIP): Error [ParseContents] Duplicate ID (ignored): miopenConvolutionBwdDataAlgoWinograd; key: 128-6-60-3x3-64-6-60-16-1x1-1x1-1x1-0-NCHW-FP16-B
MIOpen(HIP): Error [ParseContents] Duplicate ID (ignored): miopenConvolutionBwdDataAlgoWinograd; key: 256-7-7-1x1-832-7-7-16-0x0-1x1-1x1-0-NCHW-FP32-B
MIOpen(HIP): Error [ParseContents] Duplicate ID (ignored): miopenConvolutionFwdAlgoDirect; key: 256-14-14-1x1-1024-7-7-8-0x0-2x2-1x1-0-NCHW-BF16-F
MIOpen(HIP): Error [ParseContents] Duplicate ID (ignored): miopenConvolutionFwdAlgoDirect; key: 256-14-14-1x1-256-14-14-8-0x0-1x1-1x1-0-NCHW-FP16-F
MIOpen(HIP): Error [ParseContents] Duplicate ID (ignored): miopenConvolutionFwdAlgoImplicitGEMM; key: 128-28-28-1x1-512-28-28-8-0x0-1x1-1x1-0-NCHW-FP32-F
MIOpen(HIP): Error [ParseContents] Duplicate ID (ignored): miopenConvolutionFwdAlgoImplicitGEMM; key: 512-28-28-1x1-1024-14-14-16-0x0-2x2-1x1-0-NCHW-FP32-F
MIOpen(HIP): Error [ParseContents] Duplicate ID (ignored): miopenConvolutionFwdAlgoWinograd; key: 512-7-7-3x3-512-7-7-8-1x1-1x1-1x1-0-NCHW-FP16-F

Examples of duplicates

256-28-28-1x1-64-56-56-8-0x0-2x2-1x1-0-NCHW-BF16-B=

miopenConvolutionBwdDataAlgoDirect:
    ConvOclDirectFwd1x1,0.10016,0,miopenConvolutionBwdDataAlgoDirect,256x28x28x1x1x64x56x56x8xNCHWxBF16x0x0x2x2x1x1x1xB;
miopenConvolutionBwdDataAlgoDirect:
    ConvOclDirectFwd1x1,0.10016,0,miopenConvolutionBwdDataAlgoDirect,256x28x28x1x1x64x56x56x8xNCHWxBF16x0x0x2x2x1x1x1xB;
miopenConvolutionBwdDataAlgoGEMM:
    GemmBwd1x1_stride2,1.48128,4014080,miopenConvolutionBwdDataAlgoGEMM,256x28x28x1x1x64x56x56x8xNCHWxBF16x0x0x2x2x1x1x1xB

128-28-28-1x1-512-28-28-8-0x0-1x1-1x1-0-NCHW-FP32-F=

miopenConvolutionFwdAlgoImplicitGEMM
    ConvHipImplicitGemmV4R4Fwd,0.09984,0,miopenConvolutionFwdAlgoImplicitGEMM,128x28x28x1x1x512x28x28x8xNCHWxFP32x0x0x1x1x1x1x1xF
miopenConvolutionFwdAlgoDirect
    ConvAsm1x1U,0.11744,0,miopenConvolutionFwdAlgoDirect,128x28x28x1x1x512x28x28x8xNCHWxFP32x0x0x1x1x1x1x1xF
miopenConvolutionFwdAlgoImplicitGEMM
    ConvHipImplicitGemmV4R4Fwd,0.13424,0,miopenConvolutionFwdAlgoImplicitGEMM,128x28x28x1x1x512x28x28x8xNCHWxFP32x0x0x1x1x1x1x1xF
miopenConvolutionFwdAlgoGEMM
    GemmFwd1x1_0_1,0.13424,0,miopenConvolutionFwdAlgoGEMM,128x28x28x1x1x512x28x28x8xNCHWxFP32x0x0x1x1x1x1x1xF
miopenConvolutionFwdAlgoWinograd
    ConvBinWinogradRxSf3x2,0.231841,0,miopenConvolutionFwdAlgoWinograd,128x28x28x1x1x512x28x28x8xNCHWxFP32x0x0x1x1x1x1x1xF
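
A check like the following could catch such entries before an fdb file is committed. This is only a sketch: it assumes each fdb line has the form `key=id:values;id:values;...`, which is inferred from the examples above, and `find_duplicate_ids` is a hypothetical helper, not part of MIOpen or its tuning tools.

```python
from collections import Counter


def find_duplicate_ids(fdb_line):
    """Return (key, sorted list of algorithm IDs that occur more than once).

    Assumed layout (inferred from the examples, not from MIOpen sources):
    key=<id>:<values>;<id>:<values>;...
    """
    key, _, payload = fdb_line.strip().partition("=")
    # Records are ';'-separated; skip empty pieces such as a trailing ';'.
    records = [rec.strip() for rec in payload.split(";") if rec.strip()]
    # The algorithm ID is everything before the first ':' of a record.
    ids = [rec.split(":", 1)[0].strip() for rec in records]
    counts = Counter(ids)
    return key, sorted(algo for algo, n in counts.items() if n > 1)


# First example from the comment above, rebuilt as a single fdb line.
KEY = "256-28-28-1x1-64-56-56-8-0x0-2x2-1x1-0-NCHW-BF16-B"
PROBLEM = "256x28x28x1x1x64x56x56x8xNCHWxBF16x0x0x2x2x1x1x1xB"
DIRECT = (
    "miopenConvolutionBwdDataAlgoDirect:"
    "ConvOclDirectFwd1x1,0.10016,0,miopenConvolutionBwdDataAlgoDirect," + PROBLEM
)
GEMM = (
    "miopenConvolutionBwdDataAlgoGEMM:"
    "GemmBwd1x1_stride2,1.48128,4014080,miopenConvolutionBwdDataAlgoGEMM," + PROBLEM
)
line = KEY + "=" + ";".join([DIRECT, DIRECT, GEMM])

key, duplicates = find_duplicate_ids(line)
print(key, duplicates)
```

Running the helper over every line of an fdb file and failing when any list is non-empty would flag the keys reported in the error messages, provided the assumed layout holds.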

@atamazov
Contributor

atamazov commented Sep 30, 2021

Missing fdb records for gfx906/60/HIP

gfx906_60.HIP.missing.fp32.txt
gfx906_60.HIP.missing.fp16.txt
gfx906_60.HIP.missing.bf16.txt

I suspect that FP32 and FP16 configs are required (not sure about BF16, though)

@atamazov
Contributor

Performance testing results

🟢 These are good!

ROCm 4.0, gfx906/60, Performance gain, based on sum of all times/directions:

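| Precision/backend | Find mode | GPU time | Wall time | Aux Wall time |
|---|---|---|---|---|
| FP32/HIP | Default find mode | 1.13 | 1.11 | 1.45 |
| FP32/HIP | Immediate mode | 1.13 | 1.10 | 1.49 |
| FP16/HIP | Default find mode | 1.00 | 0.93 | 0.74 |
| FP16/HIP | Immediate mode | 1.00 | 0.98 | 1.17 |
| BF16/HIP | Default find mode | 0.99 | 0.75 | 1.16 |
| BF16/HIP | Immediate mode | 0.99 | 0.99 | 1.12 |
| FP32/OpenCL | Default find mode | 1.00 | 1.01 | 0.99 |
| FP32/OpenCL | Immediate mode | 1.00 | 1.00 | 1.00 |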
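
For readers unfamiliar with the metric, a gain "based on sum of all times/directions" can be read as one ratio of total times. The sketch below assumes the gain is the baseline total divided by the candidate total, so that values above 1.00 mean the new fdb is faster; the thread does not state the direction of the ratio. The function name and all timings are hypothetical and serve only to illustrate the arithmetic.

```python
def performance_gain(baseline_times, candidate_times):
    """Ratio of summed baseline time to summed candidate time.

    Both arguments map (config, direction) -> time. Assumption: a result
    above 1.0 means the candidate is faster overall.
    """
    if baseline_times.keys() != candidate_times.keys():
        raise ValueError("both runs must cover the same configs and directions")
    return sum(baseline_times.values()) / sum(candidate_times.values())


# Hypothetical timings for one config in three directions (F, B, W).
baseline = {("cfg1", "F"): 2.0, ("cfg1", "B"): 1.0, ("cfg1", "W"): 1.5}
candidate = {("cfg1", "F"): 1.0, ("cfg1", "B"): 1.0, ("cfg1", "W"): 1.0}

gain = performance_gain(baseline, candidate)
print(f"{gain:.2f}")
```

Summing before dividing weights long-running configs more heavily than averaging per-config ratios would, which fits a metric meant to reflect total time spent.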

@atamazov
Contributor

Verdict

I tested this only on gfx906/60. Assuming other GPUs perform similarly, I think this can be merged provided that "Duplicate ID" issues are resolved.

@cderb
Contributor Author

cderb commented Sep 30, 2021

Thanks Artem, I'll start looking into the source of the duplication.

@cderb
Contributor Author

cderb commented Sep 30, 2021

fdb entries no longer contain duplicate algorithms.

@junliume junliume dismissed atamazov’s stale review October 1, 2021 19:26

Please re-review. fdb entries no longer contain duplicate algorithms.

@atamazov
Contributor

atamazov commented Oct 1, 2021

Testing on MI100 is ongoing.

@junliume
Contributor

junliume commented Oct 3, 2021

@atamazov @JehandadKhan
Would need #1196 and #1176 for next MIOpen staging promotion
#1192 is also nice to have

@atamazov
Contributor

atamazov commented Oct 3, 2021

MI100 testing results (ROCm 4.3.1)

🟡 Looks acceptable. Develop vs cderb/fdb_tuning_dev-4.5, tested with MI100 & ROCm 4.3.1. Performance gain, based on sum of all times/directions:

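| Precision/backend | Find mode | GPU time | Wall time | Aux Wall time |
|---|---|---|---|---|
| FP32/HIP | Default find mode | 1.00 | 0.81 | 0.94 |
| FP32/HIP | Immediate mode | 1.00 | 0.96 | 1.14 |
| FP16/HIP | Default find mode | 1.04 | 0.93 | 0.78 |
| FP16/HIP | Immediate mode | 1.03 | 0.90 | 0.97 |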

Contributor

@atamazov atamazov left a comment

LGTM!

@junliume junliume merged commit 442db61 into develop Oct 4, 2021
@junliume
Contributor

junliume commented Oct 4, 2021

> @atamazov @JehandadKhan Would need #1196 and #1176 for next MIOpen staging promotion
> #1192 is also nice to have

Merged #1196 and initiated another round of develop staging. #1176 is a functional update and can wait until next week's round.
