Implement L1Loss #3401

Open. Wants to merge 24 commits into develop.

@cognaiger9 (Collaborator) commented Nov 22, 2024

  • Add the L1Loss operation with reduced forward kernels.
  • Add a driver and gtests for the kernels.
  • MIOpen performs better than ROCm when the reduction mode is either sum or mean.
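
For readers unfamiliar with the operation, here is a minimal CPU reference of what a reduced L1 loss computes: the absolute difference between input and target, reduced by sum or mean. This is only a sketch of the math, not the MIOpen kernel from this PR; the function name `l1_loss_forward` and its flat-sequence interface are hypothetical and chosen for illustration.

```python
from typing import Sequence


def l1_loss_forward(
    inputs: Sequence[float],
    targets: Sequence[float],
    reduction: str = "mean",
) -> float:
    """Hypothetical reference: reduce |input - target| over all elements."""
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same number of elements")
    if len(inputs) == 0:
        raise ValueError("inputs must not be empty")
    if reduction not in ("sum", "mean"):
        raise ValueError("only the reduced modes 'sum' and 'mean' are sketched here")

    # Accumulate the element-wise absolute error.
    total = sum(abs(x - y) for x, y in zip(inputs, targets))

    # 'sum' returns the accumulated error, 'mean' divides by the element count.
    return total if reduction == "sum" else total / len(inputs)


if __name__ == "__main__":
    print(l1_loss_forward([1.0, 2.0, 4.0], [1.5, 2.0, 2.0], "sum"))  # 2.5
```

A reference like this is the kind of function a unit test could compare GPU output against, within a tolerance suited to the data type.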

Average improvement over ROCm

| type | fwd |
|---|---:|
| float16 | 1.92 |
| float32 | 1.93 |
| bfloat16 | 1.90 |

Detailed benchmark

float16
| op_name | dtype | size | contiguous | reduction | direction | ROCm | MIOpen | Improvement (ROCm / MIOpen) |
|---|---|---|---|---|---|---:|---:|---:|
| L1Loss | float16 | [7 4] | contiguous | sum | fwd | 124259 | 44677 | 2.78 |
| L1Loss | float16 | [7 4] | noncontiguous | sum | fwd | 145427 | 45850 | 3.17 |
| L1Loss | float16 | [18 4] | contiguous | sum | fwd | 72674 | 42117 | 1.73 |
| L1Loss | float16 | [28 4] | contiguous | sum | fwd | 120067 | 45157 | 2.66 |
| L1Loss | float16 | [28 4] | noncontiguous | sum | fwd | 59538 | 44730 | 1.33 |
| L1Loss | float16 | [34 4] | noncontiguous | sum | fwd | 105651 | 47433 | 2.23 |
| L1Loss | float16 | [54 4] | contiguous | sum | fwd | 64209 | 40695 | 1.58 |
| L1Loss | float16 | [72 4] | contiguous | sum | fwd | 108066 | 43752 | 2.47 |
| L1Loss | float16 | [72 4] | noncontiguous | sum | fwd | 50754 | 43059 | 1.18 |
| L1Loss | float16 | [98 4] | noncontiguous | sum | fwd | 123586 | 42455 | 2.91 |
| L1Loss | float16 | [106 4] | contiguous | sum | fwd | 56545 | 43325 | 1.31 |
| L1Loss | float16 | [135 4] | contiguous | sum | fwd | 119331 | 45050 | 2.65 |
| L1Loss | float16 | [190 4] | noncontiguous | sum | fwd | 111459 | 52446 | 2.13 |
| L1Loss | float16 | [249 128] | contiguous | sum | fwd | 100514 | 54828 | 1.83 |
| L1Loss | float16 | [349 222] | contiguous | sum | fwd | 58818 | 44392 | 1.32 |
| L1Loss | float16 | [349 222] | noncontiguous | sum | fwd | 77970 | 45352 | 1.72 |
| L1Loss | float16 | [451 128] | contiguous | sum | fwd | 58737 | 50312 | 1.17 |
| L1Loss | float16 | [451 128] | noncontiguous | sum | fwd | 62626 | 45352 | 1.38 |
| L1Loss | float16 | [603 546] | contiguous | sum | fwd | 75186 | 46934 | 1.60 |
| L1Loss | float16 | [603 546] | noncontiguous | sum | fwd | 75698 | 57193 | 1.32 |

float32
| op_name | dtype | size | contiguous | reduction | direction | ROCm | MIOpen | Improvement (ROCm / MIOpen) |
|---|---|---|---|---|---|---:|---:|---:|
| L1Loss | float32 | [7 4] | contiguous | sum | fwd | 81298 | 51255 | 1.59 |
| L1Loss | float32 | [7 4] | noncontiguous | sum | fwd | 57249 | 44713 | 1.28 |
| L1Loss | float32 | [18 4] | contiguous | sum | fwd | 104194 | 45122 | 2.31 |
| L1Loss | float32 | [28 4] | contiguous | sum | fwd | 55697 | 46224 | 1.20 |
| L1Loss | float32 | [28 4] | noncontiguous | sum | fwd | 118723 | 44161 | 2.69 |
| L1Loss | float32 | [34 4] | noncontiguous | sum | fwd | 58033 | 46650 | 1.24 |
| L1Loss | float32 | [54 4] | contiguous | sum | fwd | 123811 | 44001 | 2.81 |
| L1Loss | float32 | [72 4] | contiguous | sum | fwd | 60945 | 43308 | 1.41 |
| L1Loss | float32 | [72 4] | noncontiguous | sum | fwd | 113218 | 43735 | 2.59 |
| L1Loss | float32 | [98 4] | noncontiguous | sum | fwd | 73282 | 39147 | 1.87 |
| L1Loss | float32 | [106 4] | contiguous | sum | fwd | 110131 | 47041 | 2.34 |
| L1Loss | float32 | [135 4] | noncontiguous | sum | fwd | 114659 | 43130 | 2.66 |
| L1Loss | float32 | [190 4] | noncontiguous | sum | fwd | 78946 | 46810 | 1.69 |
| L1Loss | float32 | [207 4] | contiguous | sum | fwd | 109475 | 41245 | 2.65 |
| L1Loss | float32 | [207 4] | noncontiguous | sum | fwd | 45905 | 43219 | 1.06 |
| L1Loss | float32 | [249 128] | noncontiguous | sum | fwd | 133555 | 42952 | 3.11 |
| L1Loss | float32 | [349 222] | contiguous | sum | fwd | 53745 | 44836 | 1.20 |
| L1Loss | float32 | [451 128] | contiguous | sum | fwd | 119347 | 44676 | 2.67 |
| L1Loss | float32 | [451 128] | noncontiguous | sum | fwd | 58114 | 44375 | 1.31 |
| L1Loss | float32 | [603 546] | contiguous | sum | fwd | 64529 | 45992 | 1.40 |
| L1Loss | float32 | [603 546] | noncontiguous | sum | fwd | 75073 | 55557 | 1.35 |

bfloat16
| op_name | dtype | size | contiguous | reduction | direction | ROCm | MIOpen | Improvement (ROCm / MIOpen) |
|---|---|---|---|---|---|---:|---:|---:|
| L1Loss | bfloat16 | [7 4] | contiguous | sum | fwd | 52609 | 45584 | 1.15 |
| L1Loss | bfloat16 | [18 4] | contiguous | sum | fwd | 112019 | 40624 | 2.76 |
| L1Loss | bfloat16 | [18 4] | noncontiguous | sum | fwd | 113763 | 48659 | 2.34 |
| L1Loss | bfloat16 | [28 4] | noncontiguous | sum | fwd | 97154 | 46846 | 2.07 |
| L1Loss | bfloat16 | [34 4] | contiguous | sum | fwd | 85330 | 43824 | 1.95 |
| L1Loss | bfloat16 | [54 4] | contiguous | sum | fwd | 89058 | 44944 | 1.98 |
| L1Loss | bfloat16 | [54 4] | noncontiguous | sum | fwd | 99987 | 44801 | 2.23 |
| L1Loss | bfloat16 | [72 4] | noncontiguous | sum | fwd | 103539 | 44108 | 2.35 |
| L1Loss | bfloat16 | [98 4] | contiguous | sum | fwd | 79794 | 43877 | 1.82 |
| L1Loss | bfloat16 | [98 4] | noncontiguous | sum | fwd | 47489 | 42686 | 1.11 |
| L1Loss | bfloat16 | [106 4] | contiguous | sum | fwd | 128467 | 44979 | 2.86 |
| L1Loss | bfloat16 | [106 4] | noncontiguous | sum | fwd | 97250 | 45086 | 2.16 |
| L1Loss | bfloat16 | [135 4] | noncontiguous | sum | fwd | 109411 | 42953 | 2.55 |
| L1Loss | bfloat16 | [190 4] | contiguous | sum | fwd | 74913 | 47112 | 1.59 |
| L1Loss | bfloat16 | [207 4] | contiguous | sum | fwd | 116563 | 45939 | 2.54 |
| L1Loss | bfloat16 | [207 4] | noncontiguous | sum | fwd | 84306 | 45317 | 1.86 |
| L1Loss | bfloat16 | [349 222] | contiguous | sum | fwd | 79554 | 52535 | 1.51 |
| L1Loss | bfloat16 | [349 222] | noncontiguous | sum | fwd | 60081 | 44926 | 1.34 |
| L1Loss | bfloat16 | [451 128] | contiguous | sum | fwd | 58866 | 44392 | 1.33 |
| L1Loss | bfloat16 | [451 128] | noncontiguous | sum | fwd | 96659 | 44943 | 2.15 |
| L1Loss | bfloat16 | [603 546] | contiguous | sum | fwd | 78369 | 54686 | 1.43 |
| L1Loss | bfloat16 | [603 546] | noncontiguous | sum | fwd | 74386 | 54117 | 1.37 |
| L1Loss | bfloat16 | [1024 1024] | contiguous | sum | fwd | 83682 | 70349 | 1.19 |
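
The Improvement column is consistent with the ROCm value divided by the MIOpen value, and the per-type averages are consistent with the arithmetic mean of that column. The short sketch below reproduces the calculation for the first three float16 rows only; the helper name `improvement` is an assumption for illustration, and the units of the raw measurements are not stated in this PR.

```python
from statistics import mean

# (ROCm, MIOpen) measurements copied from the first three float16 rows above.
rows = [
    (124259, 44677),
    (145427, 45850),
    (72674, 42117),
]


def improvement(rocm: float, miopen: float) -> float:
    """Hypothetical helper: ratio of the ROCm measurement to the MIOpen one."""
    return rocm / miopen


# Per-row ratios, rounded to two decimals as in the table.
ratios = [round(improvement(rocm, miopen), 2) for rocm, miopen in rows]
print(ratios)  # [2.78, 3.17, 1.73]

# Mean over this three-row sample only, not the full-table average.
average = round(mean(ratios), 2)
print(average)  # 2.56
```

Running the same calculation over all rows of one data type gives the figure reported for that type in the summary.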

@cognaiger9 self-assigned this Nov 22, 2024
@cognaiger9 marked this pull request as ready for review November 22, 2024 07:19