Task04 Valery Matskevich HSE #144

Open: 2 commits to be merged into base branch task04

blonded04 commented Oct 5, 2024

Local output

$ ./build/matrix_transpose 
OpenCL devices:
  Device #0: CPU. pthread-AMD Ryzen 5 5500U with Radeon Graphics. AuthenticAMD. Total memory: 13334 Mb
Using device #0: CPU. pthread-AMD Ryzen 5 5500U with Radeon Graphics. AuthenticAMD. Total memory: 13334 Mb
Data generated for M=4096, K=4096
[matrix_transpose_naive]
    GPU: 0.0169246+-0.000363228 s
    GPU: 991.289 millions/s
[matrix_transpose_local_bad_banks]
    GPU: 0.0162895+-3.76592e-05 s
    GPU: 1029.94 millions/s
[matrix_transpose_local_good_banks]
    GPU: 0.0162737+-4.11965e-05 s
    GPU: 1030.94 millions/s
$ ./build/matrix_multiplication 
OpenCL devices:
  Device #0: CPU. pthread-AMD Ryzen 5 5500U with Radeon Graphics. AuthenticAMD. Total memory: 13334 Mb
Using device #0: CPU. pthread-AMD Ryzen 5 5500U with Radeon Graphics. AuthenticAMD. Total memory: 13334 Mb
Data generated for M=1024, K=1024, N=1024
CPU: 10.3802+-0 s
CPU: 0.192675 GFlops
[naive, ts=4]
    GPU: 0.929132+-0.00651321 s
    GPU: 2.15255 GFlops
    Average difference: 0.000149043%
[naive, ts=8]
    GPU: 0.939817+-0.00427403 s
    GPU: 2.12807 GFlops
    Average difference: 0.000149043%
[naive, ts=16]
    GPU: 1.20814+-0.00336305 s
    GPU: 1.65544 GFlops
    Average difference: 0.000149043%
[local, ts=4]
    GPU: 0.65125+-0.306172 s
    GPU: 3.07102 GFlops
    Average difference: 0.000149043%
[local, ts=8]
    GPU: 0.564746+-0.284267 s
    GPU: 3.54142 GFlops
    Average difference: 0.000149043%
[local, ts=16]
    GPU: 0.310659+-0.00316622 s
    GPU: 6.43794 GFlops
    Average difference: 0.000149043%
[local wpt, ts=4, wpt=2]
    GPU: 1.23883+-0.633773 s
    GPU: 1.61442 GFlops
    Average difference: 0.000149043%
[local wpt, ts=4, wpt=4]
    GPU: 0.543547+-0.0472318 s
    GPU: 3.67953 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=2]
    GPU: 0.59599+-0.21084 s
    GPU: 3.35576 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=4]
    GPU: 0.671404+-0.117922 s
    GPU: 2.97883 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=8]
    GPU: 0.524906+-0.18261 s
    GPU: 3.81021 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=2]
    GPU: 0.391485+-0.0641349 s
    GPU: 5.10876 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=4]
    GPU: 0.445875+-0.146809 s
    GPU: 4.48556 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=8]
    GPU: 0.498357+-0.0494717 s
    GPU: 4.01319 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=16]
    GPU: 0.361147+-0.04432 s
    GPU: 5.53792 GFlops
    Average difference: 0.000149043%
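
For readers checking the numbers: the throughput lines in the logs are consistent with being derived from the mean time on the line above them. The sketch below reproduces that arithmetic. The function names are hypothetical, and truncating the operation count to whole billions is an assumption inferred from the printed figures (2 / 10.3802 s gives the logged 0.192675 GFlops), not something confirmed by the benchmark source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Elements transposed per second, in millions: every one of the M*K
// elements is moved exactly once.
double transpose_millions_per_sec(std::size_t M, std::size_t K, double seconds) {
    return static_cast<double>(M) * static_cast<double>(K) / 1e6 / seconds;
}

// GFlops for an (M x K) by (K x N) product: one multiply and one add per term.
// Assumption: the operation count is truncated to whole billions by integer
// division, so 1024^3 * 2 counts as 2 rather than 2.147.
double matmul_gflops(std::size_t M, std::size_t K, std::size_t N, double seconds) {
    const std::uint64_t ops = static_cast<std::uint64_t>(M) * K * N * 2;
    const std::uint64_t giga_ops = ops / 1000000000ULL;
    return static_cast<double>(giga_ops) / seconds;
}
```

With the local timings, `transpose_millions_per_sec(4096, 4096, 0.0169246)` lands near the logged 991.289, and `matmul_gflops(1024, 1024, 1024, 0.929132)` near 2.15255. Small differences come from the times being rounded in the log.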

GitHub CI output

$ ./matrix_transpose
OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Data generated for M=4096, K=4096
[matrix_transpose_naive]
    GPU: 0.0152054+-0.000145498 s
    GPU: 1103.37 millions/s
[matrix_transpose_local_bad_banks]
    GPU: 0.0225613+-0.000898613 s
    GPU: 743.627 millions/s
[matrix_transpose_local_good_banks]
    GPU: 0.0224438+-0.000941109 s
    GPU: 747.522 millions/s
$ ./matrix_multiplication
OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Data generated for M=1024, K=1024, N=1024
CPU: 6.33776+-0 s
CPU: 0.315569 GFlops
[naive, ts=4]
    GPU: 0.368016+-0.00387518 s
    GPU: 5.43454 GFlops
    Average difference: 0.000149043%
[naive, ts=8]
    GPU: 0.373403+-0.00552401 s
    GPU: 5.35614 GFlops
    Average difference: 0.000149043%
[naive, ts=16]
    GPU: 0.372796+-0.00299166 s
    GPU: 5.36486 GFlops
    Average difference: 0.000149043%
[local, ts=4]
    GPU: 0.536499+-0.000822247 s
    GPU: 3.72787 GFlops
    Average difference: 0.000149043%
[local, ts=8]
    GPU: 0.310242+-0.00216126 s
    GPU: 6.44658 GFlops
    Average difference: 0.000149043%
[local, ts=16]
    GPU: 0.2867+-0.00103005 s
    GPU: 6.97592 GFlops
    Average difference: 0.000149043%
[local wpt, ts=4, wpt=2]
    GPU: 0.514041+-0.00417185 s
    GPU: 3.89074 GFlops
    Average difference: 0.000149043%
[local wpt, ts=4, wpt=4]
    GPU: 0.441325+-0.00117908 s
    GPU: 4.53181 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=2]
    GPU: 0.318689+-0.00104722 s
    GPU: 6.2757 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=4]
    GPU: 0.284307+-0.00114197 s
    GPU: 7.03465 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=8]
    GPU: 0.275566+-0.00210648 s
    GPU: 7.25779 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=2]
    GPU: 0.239857+-0.00118266 s
    GPU: 8.33831 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=4]
    GPU: 0.227042+-0.00149647 s
    GPU: 8.80894 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=8]
    GPU: 0.199619+-0.00168463 s
    GPU: 10.0191 GFlops
    Average difference: 0.000149043%
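
The kernel sources are not part of this conversation, so as context for the `local_bad_banks` / `local_good_banks` labels, here is a host-side C++ sketch of what a tiled transpose with a padded staging buffer typically looks like. It is an illustration under assumptions, not the code from this PR: `TILE`, `transpose_tiled`, and the one-column padding are hypothetical choices. In an OpenCL kernel the staging buffer would live in local memory and each work-group would handle one tile.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile size; the kernels in this PR may use a different one.
constexpr std::size_t TILE = 16;

// Transposes a rows x cols row-major matrix into a cols x rows row-major
// matrix, one TILE x TILE block at a time, staging each block in a small
// buffer (the role local memory plays for an OpenCL work-group).
std::vector<float> transpose_tiled(const std::vector<float>& src,
                                   std::size_t rows, std::size_t cols) {
    std::vector<float> dst(rows * cols);
    // The extra column (TILE + 1) is the usual padding trick: on a GPU it
    // shifts consecutive tile rows to different banks, so column-wise
    // accesses to the tile do not all hit the same bank.
    float tile[TILE][TILE + 1];
    for (std::size_t i0 = 0; i0 < rows; i0 += TILE) {
        for (std::size_t j0 = 0; j0 < cols; j0 += TILE) {
            // Stage the block, reading src row by row.
            for (std::size_t i = 0; i < TILE && i0 + i < rows; ++i)
                for (std::size_t j = 0; j < TILE && j0 + j < cols; ++j)
                    tile[i][j] = src[(i0 + i) * cols + (j0 + j)];
            // Emit the block transposed, writing dst row by row.
            for (std::size_t j = 0; j < TILE && j0 + j < cols; ++j)
                for (std::size_t i = 0; i < TILE && i0 + i < rows; ++i)
                    dst[(j0 + j) * rows + (i0 + i)] = tile[i][j];
        }
    }
    return dst;
}
```

On a CPU OpenCL device, local memory is most likely ordinary cached RAM with no banks to conflict on, which would be consistent with the two `local` transpose variants timing almost identically in both logs.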

Signed-off-by: Valery Matskevich <generalretcher@gmail.com>
Signed-off-by: Valery Matskevich <generalretcher@gmail.com>